12 changes: 11 additions & 1 deletion .agents/skills/nemoclaw-configure-inference/SKILL.md
@@ -25,13 +25,14 @@ The sandbox does not receive your API key.
## Provider Options

The onboard wizard presents the following provider options by default.
-The first six are always available.
+The first seven are always available.
Ollama appears when it is installed or running on the host.

| Option | Description | Curated models |
|--------|-------------|----------------|
| NVIDIA Endpoints | Routes to models hosted on [build.nvidia.com](https://build.nvidia.com). You can also enter any model ID from the catalog. Set `NVIDIA_API_KEY`. | Nemotron 3 Super 120B, Kimi K2.5, GLM-5, MiniMax M2.5, GPT-OSS 120B |
| OpenAI | Routes to the OpenAI API. Set `OPENAI_API_KEY`. | `gpt-5.4`, `gpt-5.4-mini`, `gpt-5.4-nano`, `gpt-5.4-pro-2026-03-05` |
| Azure OpenAI | Routes to an Azure OpenAI deployment. The wizard prompts for your resource endpoint URL (`https://<resource>.openai.azure.com/v1`) and a deployment model name. Set `AZURE_OPENAI_API_KEY`. | You provide the deployment model name. |
| Other OpenAI-compatible endpoint | Routes to any server that implements `/v1/chat/completions`. If the endpoint also supports `/responses` with OpenClaw-style tool calling, NemoClaw can use that path; otherwise it falls back to `/chat/completions`. The wizard prompts for a base URL and model name. Works with OpenRouter, LocalAI, llama.cpp, or any compatible proxy. Set `COMPATIBLE_API_KEY`. | You provide the model name. |
| Anthropic | Routes to the Anthropic Messages API. Set `ANTHROPIC_API_KEY`. | `claude-sonnet-4-6`, `claude-haiku-4-5`, `claude-opus-4-6` |
| Other Anthropic-compatible endpoint | Routes to any server that implements the Anthropic Messages API (`/v1/messages`). The wizard prompts for a base URL and model name. Set `COMPATIBLE_ANTHROPIC_API_KEY`. | You provide the model name. |
@@ -57,13 +58,16 @@ If validation fails, the wizard returns to provider selection.
| Provider type | Validation method |
|---|---|
| OpenAI | Tries `/responses` first, then `/chat/completions`. |
| Azure OpenAI | Tries `/responses` first with a tool-calling probe. Falls back to `/chat/completions` if the endpoint does not return a compatible tool call. |
| NVIDIA Endpoints | Tries `/responses` first with a tool-calling probe that matches OpenClaw behavior. Falls back to `/chat/completions` if the endpoint does not return a compatible tool call. |
| Google Gemini | Tries `/responses` first with a tool-calling probe that matches OpenClaw behavior. Falls back to `/chat/completions` if the endpoint does not return a compatible tool call. |
| Other OpenAI-compatible endpoint | Tries `/responses` first with a tool-calling probe that matches OpenClaw behavior. Falls back to `/chat/completions` if the endpoint does not return a compatible tool call. |
| Anthropic-compatible | Tries `/v1/messages`. |
| NVIDIA Endpoints (manual model entry) | Validates the model name against the catalog API. |
| Compatible endpoints | Sends a real inference request because many proxies do not expose a `/models` endpoint. For OpenAI-compatible endpoints, the probe includes a tool call so that NemoClaw favors `/responses` only when the endpoint handles tool calling there. |

*Full details in `references/inference-options.md`.*
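The `/responses`-first validation flow in the table can be pictured with a short sketch. This is a hedged illustration, not NemoClaw's actual probe: the reply shape follows the public OpenAI Responses API (an `output` array whose items may include a `function_call`), and `pickInferenceApi` is a hypothetical helper name.

```javascript
// Sketch of the validation decision described above: probe /responses
// with a tool definition, and fall back to /chat/completions when the
// reply does not contain a compatible tool call. Hypothetical helpers;
// NemoClaw's real probe lives inside the CLI.
function responsesReplyHasToolCall(reply) {
  // A usable /responses endpoint answers the tool-calling probe with a
  // "function_call" item in its output array.
  return (
    Array.isArray(reply.output) &&
    reply.output.some((item) => item.type === "function_call")
  );
}

function pickInferenceApi(responsesReply) {
  return responsesReplyHasToolCall(responsesReply)
    ? "responses"
    : "chat_completions";
}

console.log(pickInferenceApi({ output: [{ type: "function_call", name: "get_time" }] })); // "responses"
console.log(pickInferenceApi({ output: [{ type: "message" }] })); // "chat_completions"
```

A malformed or empty reply also falls back to `/chat/completions`, which matches the conservative behavior the table describes.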

## Prerequisites

- A running NemoClaw sandbox.
@@ -91,6 +95,12 @@ $ openshell inference set --provider nvidia-prod --model nvidia/nemotron-3-super
$ openshell inference set --provider openai-api --model gpt-5.4
```

### Azure OpenAI

```console
$ openshell inference set --provider azure-openai --model <deployment-name>
```
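For context on how a deployment name becomes a request path: Azure OpenAI conventionally addresses deployments under `/openai/deployments/<name>/...`, which is also what the sandbox network policy allows. The helper below is a hypothetical sketch — the `api-version` default is an assumption, and NemoClaw's internal routing may differ.

```javascript
// Hypothetical helper: assemble the chat-completions URL for an Azure
// OpenAI deployment from the resource endpoint the wizard collects.
// The api-version default is an assumption; use the version your
// resource supports.
function azureChatCompletionsUrl(resourceEndpoint, deployment, apiVersion = "2024-06-01") {
  // Strip a trailing slash and the wizard's "/v1" suffix if present.
  const base = resourceEndpoint.replace(/\/+$/, "").replace(/\/v1$/, "");
  return (
    `${base}/openai/deployments/${encodeURIComponent(deployment)}` +
    `/chat/completions?api-version=${apiVersion}`
  );
}

console.log(azureChatCompletionsUrl("https://my-resource.openai.azure.com/v1", "my-gpt4o"));
// → https://my-resource.openai.azure.com/openai/deployments/my-gpt4o/chat/completions?api-version=2024-06-01
```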

### Anthropic

```console
@@ -16,13 +16,14 @@ The sandbox does not receive your API key.
## Provider Options

The onboard wizard presents the following provider options by default.
-The first six are always available.
+The first seven are always available.
Ollama appears when it is installed or running on the host.

| Option | Description | Curated models |
|--------|-------------|----------------|
| NVIDIA Endpoints | Routes to models hosted on [build.nvidia.com](https://build.nvidia.com). You can also enter any model ID from the catalog. Set `NVIDIA_API_KEY`. | Nemotron 3 Super 120B, Kimi K2.5, GLM-5, MiniMax M2.5, GPT-OSS 120B |
| OpenAI | Routes to the OpenAI API. Set `OPENAI_API_KEY`. | `gpt-5.4`, `gpt-5.4-mini`, `gpt-5.4-nano`, `gpt-5.4-pro-2026-03-05` |
| Azure OpenAI | Routes to an Azure OpenAI deployment. The wizard prompts for your resource endpoint URL (`https://<resource>.openai.azure.com/v1`) and a deployment model name. Set `AZURE_OPENAI_API_KEY`. | You provide the deployment model name. |
| Other OpenAI-compatible endpoint | Routes to any server that implements `/v1/chat/completions`. If the endpoint also supports `/responses` with OpenClaw-style tool calling, NemoClaw can use that path; otherwise it falls back to `/chat/completions`. The wizard prompts for a base URL and model name. Works with OpenRouter, LocalAI, llama.cpp, or any compatible proxy. Set `COMPATIBLE_API_KEY`. | You provide the model name. |
| Anthropic | Routes to the Anthropic Messages API. Set `ANTHROPIC_API_KEY`. | `claude-sonnet-4-6`, `claude-haiku-4-5`, `claude-opus-4-6` |
| Other Anthropic-compatible endpoint | Routes to any server that implements the Anthropic Messages API (`/v1/messages`). The wizard prompts for a base URL and model name. Set `COMPATIBLE_ANTHROPIC_API_KEY`. | You provide the model name. |
@@ -48,6 +49,7 @@ If validation fails, the wizard returns to provider selection.
| Provider type | Validation method |
|---|---|
| OpenAI | Tries `/responses` first, then `/chat/completions`. |
| Azure OpenAI | Tries `/responses` first with a tool-calling probe. Falls back to `/chat/completions` if the endpoint does not return a compatible tool call. |
| NVIDIA Endpoints | Tries `/responses` first with a tool-calling probe that matches OpenClaw behavior. Falls back to `/chat/completions` if the endpoint does not return a compatible tool call. |
| Google Gemini | Tries `/responses` first with a tool-calling probe that matches OpenClaw behavior. Falls back to `/chat/completions` if the endpoint does not return a compatible tool call. |
| Other OpenAI-compatible endpoint | Tries `/responses` first with a tool-calling probe that matches OpenClaw behavior. Falls back to `/chat/completions` if the endpoint does not return a compatible tool call. |
@@ -41,6 +41,11 @@ The following endpoint groups are allowed by default:
- `/usr/local/bin/claude`, `/usr/local/bin/openclaw`
- All methods

* - `azure_openai`
- `*.openai.azure.com:443`
- `/usr/local/bin/claude`, `/usr/local/bin/openclaw`
- POST on `/openai/deployments/*/chat/completions`, `/openai/deployments/*/completions`, `/openai/deployments/*/embeddings`; GET on `/openai/deployments`, `/openai/deployments/**`, `/openai/models`, `/openai/models/**`

* - `github`
- `github.com:443`
- `/usr/bin/gh`, `/usr/bin/git`
43 changes: 41 additions & 2 deletions bin/lib/onboard.js
@@ -161,6 +161,17 @@ const REMOTE_PROVIDER_CONFIG = {
defaultModel: "gemini-2.5-flash",
skipVerify: true,
},
azureOpenAi: {
label: "Azure OpenAI",
providerName: "azure-openai",
providerType: "openai",
credentialEnv: "AZURE_OPENAI_API_KEY",
endpointUrl: "",
helpUrl: "https://portal.azure.com/",
modelMode: "input",
defaultModel: "gpt-4o",
skipVerify: true,
},
custom: {
label: "Other OpenAI-compatible endpoint",
providerName: "compatible-endpoint",
@@ -865,6 +876,7 @@ function getSandboxInferenceConfig(model, provider = null, preferredInferenceApi

switch (provider) {
case "openai-api":
case "azure-openai":
providerKey = "openai";
primaryModelRef = `openai/${model}`;
break;
@@ -2442,6 +2454,7 @@ async function setupNim(gpu) {
const options = [];
options.push({ key: "build", label: "NVIDIA Endpoints" });
options.push({ key: "openai", label: "OpenAI" });
options.push({ key: "azureOpenAi", label: "Azure OpenAI" });
options.push({ key: "custom", label: "Other OpenAI-compatible endpoint" });
options.push({ key: "anthropic", label: "Anthropic" });
options.push({ key: "anthropicCompatible", label: "Other Anthropic-compatible endpoint" });
@@ -2513,7 +2526,31 @@
endpointUrl = remoteConfig.endpointUrl;
preferredInferenceApi = null;

-if (selected.key === "custom") {
+if (selected.key === "azureOpenAi") {
const endpointInput = isNonInteractive()
? (process.env.NEMOCLAW_ENDPOINT_URL || "").trim()
: await prompt(
" Azure OpenAI endpoint URL (e.g., https://my-resource.openai.azure.com/v1): ",
);
const navigation = getNavigationChoice(endpointInput);
if (navigation === "back") {
console.log(" Returning to provider selection.");
console.log("");
continue selectionLoop;
}
if (navigation === "exit") {
exitOnboardFromPrompt();
}
endpointUrl = normalizeProviderBaseUrl(endpointInput, "openai");
if (!endpointUrl) {
console.error(" Endpoint URL is required for Azure OpenAI.");
if (isNonInteractive()) {
process.exit(1);
}
console.log("");
continue selectionLoop;
}
} else if (selected.key === "custom") {
const endpointInput = isNonInteractive()
? (process.env.NEMOCLAW_ENDPOINT_URL || "").trim()
: await prompt(" OpenAI-compatible base URL (e.g., https://openrouter.ai/api/v1): ");
@@ -2637,7 +2674,7 @@
continue selectionLoop;
}

-if (selected.key === "custom") {
+if (selected.key === "azureOpenAi" || selected.key === "custom") {
const validation = await validateCustomOpenAiLikeSelection(
remoteConfig.label,
endpointUrl,
@@ -3029,6 +3066,7 @@ async function setupInference(
provider === "nvidia-prod" ||
provider === "nvidia-nim" ||
provider === "openai-api" ||
provider === "azure-openai" ||
provider === "anthropic-prod" ||
provider === "compatible-anthropic-endpoint" ||
provider === "gemini-api" ||
@@ -3843,6 +3881,7 @@ function printDashboard(sandboxName, model, provider, nimContainer = null) {
let providerLabel = provider;
if (provider === "nvidia-prod" || provider === "nvidia-nim") providerLabel = "NVIDIA Endpoints";
else if (provider === "openai-api") providerLabel = "OpenAI";
else if (provider === "azure-openai") providerLabel = "Azure OpenAI";
else if (provider === "anthropic-prod") providerLabel = "Anthropic";
else if (provider === "compatible-anthropic-endpoint")
providerLabel = "Other Anthropic-compatible endpoint";
6 changes: 4 additions & 2 deletions docs/inference/inference-options.md
@@ -16,7 +16,7 @@ status: published
---

<!--
-SPDX-FileCopyrightText: Copyright (c) 2025-2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
+SPDX-FileCopyrightText: Copyright (c) 2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
SPDX-License-Identifier: Apache-2.0
-->

@@ -38,13 +38,14 @@ The sandbox does not receive your API key.
## Provider Options

The onboard wizard presents the following provider options by default.
-The first six are always available.
+The first seven are always available.
Ollama appears when it is installed or running on the host.

| Option | Description | Curated models |
|--------|-------------|----------------|
| NVIDIA Endpoints | Routes to models hosted on [build.nvidia.com](https://build.nvidia.com). You can also enter any model ID from the catalog. Set `NVIDIA_API_KEY`. | Nemotron 3 Super 120B, Kimi K2.5, GLM-5, MiniMax M2.5, GPT-OSS 120B |
| OpenAI | Routes to the OpenAI API. Set `OPENAI_API_KEY`. | `gpt-5.4`, `gpt-5.4-mini`, `gpt-5.4-nano`, `gpt-5.4-pro-2026-03-05` |
| Azure OpenAI | Routes to an Azure OpenAI deployment. The wizard prompts for your resource endpoint URL (`https://<resource>.openai.azure.com/v1`) and a deployment model name. Set `AZURE_OPENAI_API_KEY`. | You provide the deployment model name. |
| Other OpenAI-compatible endpoint | Routes to any server that implements `/v1/chat/completions`. If the endpoint also supports `/responses` with OpenClaw-style tool calling, NemoClaw can use that path; otherwise it falls back to `/chat/completions`. The wizard prompts for a base URL and model name. Works with OpenRouter, LocalAI, llama.cpp, or any compatible proxy. Set `COMPATIBLE_API_KEY`. | You provide the model name. |
| Anthropic | Routes to the Anthropic Messages API. Set `ANTHROPIC_API_KEY`. | `claude-sonnet-4-6`, `claude-haiku-4-5`, `claude-opus-4-6` |
| Other Anthropic-compatible endpoint | Routes to any server that implements the Anthropic Messages API (`/v1/messages`). The wizard prompts for a base URL and model name. Set `COMPATIBLE_ANTHROPIC_API_KEY`. | You provide the model name. |
@@ -70,6 +71,7 @@ If validation fails, the wizard returns to provider selection.
| Provider type | Validation method |
|---|---|
| OpenAI | Tries `/responses` first, then `/chat/completions`. |
| Azure OpenAI | Tries `/responses` first with a tool-calling probe. Falls back to `/chat/completions` if the endpoint does not return a compatible tool call. |
| NVIDIA Endpoints | Tries `/responses` first with a tool-calling probe that matches OpenClaw behavior. Falls back to `/chat/completions` if the endpoint does not return a compatible tool call. |
| Google Gemini | Tries `/responses` first with a tool-calling probe that matches OpenClaw behavior. Falls back to `/chat/completions` if the endpoint does not return a compatible tool call. |
| Other OpenAI-compatible endpoint | Tries `/responses` first with a tool-calling probe that matches OpenClaw behavior. Falls back to `/chat/completions` if the endpoint does not return a compatible tool call. |
6 changes: 6 additions & 0 deletions docs/inference/switch-inference-providers.md
@@ -47,6 +47,12 @@ $ openshell inference set --provider nvidia-prod --model nvidia/nemotron-3-super
$ openshell inference set --provider openai-api --model gpt-5.4
```

### Azure OpenAI

```console
$ openshell inference set --provider azure-openai --model <deployment-name>
```

### Anthropic

```console
5 changes: 5 additions & 0 deletions docs/reference/network-policies.md
@@ -63,6 +63,11 @@ The following endpoint groups are allowed by default:
- `/usr/local/bin/claude`, `/usr/local/bin/openclaw`
- All methods

* - `azure_openai`
- `*.openai.azure.com:443`
- `/usr/local/bin/claude`, `/usr/local/bin/openclaw`
- POST on `/openai/deployments/*/chat/completions`, `/openai/deployments/*/completions`, `/openai/deployments/*/embeddings`; GET on `/openai/deployments`, `/openai/deployments/**`, `/openai/models`, `/openai/models/**`

* - `github`
- `github.com:443`
- `/usr/bin/gh`, `/usr/bin/git`
20 changes: 20 additions & 0 deletions nemoclaw-blueprint/policies/openclaw-sandbox.yaml
@@ -104,6 +104,26 @@ network_policies:
- { path: /usr/local/bin/claude }
- { path: /usr/local/bin/openclaw }

azure_openai:
name: azure_openai
endpoints:
- host: "*.openai.azure.com"
port: 443
protocol: rest
enforcement: enforce
tls: terminate
rules:
- allow: { method: POST, path: "/openai/deployments/*/chat/completions" }
- allow: { method: POST, path: "/openai/deployments/*/completions" }
- allow: { method: POST, path: "/openai/deployments/*/embeddings" }
- allow: { method: GET, path: "/openai/deployments" }
- allow: { method: GET, path: "/openai/deployments/**" }
- allow: { method: GET, path: "/openai/models" }
- allow: { method: GET, path: "/openai/models/**" }
binaries:
- { path: /usr/local/bin/claude }
- { path: /usr/local/bin/openclaw }

github:
name: github
endpoints:
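The policy rules above rely on path globs where `*` matches a single path segment (one deployment name) and `**` matches zero or more segments. As a reading aid, here is a minimal sketch of that conventional matching logic; the actual enforcement engine may implement it differently.

```javascript
// Minimal path-glob matcher illustrating the policy semantics:
// "*" matches exactly one path segment, "**" matches zero or more.
// A sketch of conventional glob behavior, not NemoClaw's enforcement code.
function globMatch(pattern, path) {
  const pat = pattern.split("/").filter(Boolean);
  const seg = path.split("/").filter(Boolean);

  function match(pi, si) {
    if (pi === pat.length) return si === seg.length;
    if (pat[pi] === "**") {
      // "**" may consume zero or more of the remaining segments.
      for (let k = si; k <= seg.length; k++) {
        if (match(pi + 1, k)) return true;
      }
      return false;
    }
    if (si === seg.length) return false;
    if (pat[pi] === "*" || pat[pi] === seg[si]) return match(pi + 1, si + 1);
    return false;
  }
  return match(0, 0);
}

console.log(globMatch("/openai/deployments/*/chat/completions",
                      "/openai/deployments/my-gpt4o/chat/completions")); // true
console.log(globMatch("/openai/deployments/*/chat/completions",
                      "/openai/deployments/a/b/chat/completions"));      // false
console.log(globMatch("/openai/models/**",
                      "/openai/models/gpt-4o/capabilities"));            // true
```

This is why the GET rules need both `/openai/deployments` and `/openai/deployments/**`: the bare path has zero trailing segments, which only the second pattern's `**` (or an exact rule) covers.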
16 changes: 15 additions & 1 deletion src/lib/inference-config.test.ts
@@ -68,6 +68,19 @@ describe("inference selection config", () => {
});
});

it("maps azure-openai to the sandbox inference route", () => {
expect(getProviderSelectionConfig("azure-openai", "gpt-4o")).toEqual({
endpointType: "custom",
endpointUrl: INFERENCE_ROUTE_URL,
ncpPartner: null,
model: "gpt-4o",
profile: DEFAULT_ROUTE_PROFILE,
credentialEnv: "AZURE_OPENAI_API_KEY",
provider: "azure-openai",
providerLabel: "Azure OpenAI",
});
});

it("maps the remaining hosted providers to the sandbox inference route", () => {
// Full-object assertion for one hosted provider to catch structural regressions
expect(getProviderSelectionConfig("openai-api", "gpt-5.4-mini")).toEqual({
@@ -114,6 +127,7 @@ describe("inference selection config", () => {
"nvidia-prod",
"nvidia-nim",
"openai-api",
"azure-openai",
"anthropic-prod",
"compatible-anthropic-endpoint",
"gemini-api",
@@ -128,7 +142,6 @@
"bedrock",
"vertex",
"azure",
"azure-openai",
"deepseek",
"mistral",
"cohere",
@@ -147,6 +160,7 @@

it("falls back to provider defaults when model is omitted", () => {
expect(getProviderSelectionConfig("openai-api")?.model).toBe("gpt-5.4");
expect(getProviderSelectionConfig("azure-openai")?.model).toBe("gpt-4o");
expect(getProviderSelectionConfig("anthropic-prod")?.model).toBe("claude-sonnet-4-6");
expect(getProviderSelectionConfig("gemini-api")?.model).toBe("gemini-2.5-flash");
expect(getProviderSelectionConfig("compatible-endpoint")?.model).toBe("custom-model");
13 changes: 13 additions & 0 deletions src/lib/inference-config.ts
@@ -38,6 +38,10 @@ export interface GatewayInference {
model: string | null;
}

/**
* Return the inference routing config for a known provider, or null
* if the provider ID is not in the approved set.
*/
export function getProviderSelectionConfig(
provider: string,
model?: string,
@@ -87,6 +91,13 @@ export function getProviderSelectionConfig(
credentialEnv: "GEMINI_API_KEY",
providerLabel: "Google Gemini",
};
case "azure-openai":
return {
...base,
model: model || "gpt-4o",
credentialEnv: "AZURE_OPENAI_API_KEY",
providerLabel: "Azure OpenAI",
};
case "compatible-endpoint":
return {
...base,
@@ -113,12 +124,14 @@
}
}

/** Build the qualified `<managed-provider>/<model>` ref used by OpenClaw. */
export function getOpenClawPrimaryModel(provider: string, model?: string): string {
const resolvedModel =
model || (provider === "ollama-local" ? DEFAULT_OLLAMA_MODEL : DEFAULT_CLOUD_MODEL);
return `${MANAGED_PROVIDER_ID}/${resolvedModel}`;
}

/** Parse provider and model from `openshell inference get` CLI output. */
export function parseGatewayInference(output: string | null | undefined): GatewayInference | null {
if (!output) return null;
// eslint-disable-next-line no-control-regex