NVIDIA NIM Provider

Access AI models through NVIDIA NIM (Inference Microservices), supporting both NVIDIA's hosted API at build.nvidia.com and self-hosted NIM containers. Uses the OpenAI-compatible chat completions API.

Provider Name  nim
Module         shared.plugins.model_provider.nim
SDK            openai (OpenAI-compatible API)
Auth           API key (hosted) or none (self-hosted)

Highlights

  • NIM Catalog — Access Llama, DeepSeek-R1, Nemotron, Mistral, and more
  • Self-hosted support — Run NIM containers locally with no API key
  • Reasoning models — DeepSeek-R1 with chain-of-thought via reasoning_content
  • Function calling — Full tool use support via OpenAI-compatible API
  • Streaming — Real-time streaming with cancellation support
OpenAI Compatibility
NIM exposes an OpenAI-compatible chat completions API. The provider uses the openai Python SDK as its transport layer, making it compatible with any model in the NIM catalog.
Quick start
from jaato import JaatoClient

client = JaatoClient(provider_name="nim")
client.connect(
    project=None,
    location=None,
    model="meta/llama-3.3-70b-instruct"
)
client.configure_tools(registry)

response = client.send_message(
    "Hello from NIM!",
    on_output=on_output
)
Self-hosted NIM container
from shared import load_provider, ProviderConfig

provider = load_provider("nim")
provider.initialize(ProviderConfig(
    extra={
        'base_url': 'http://localhost:8000/v1',
    }
))
# No API key needed for self-hosted
provider.connect("meta/llama-3.1-8b-instruct")

response = provider.send_message("Hello!")
print(response.text)

Available Models

NIM hosts a large catalog of models. Below are some popular options. See build.nvidia.com for the full catalog.

Popular Models

Model                                   Reasoning  Notes
meta/llama-3.3-70b-instruct             No         Strong general-purpose model
meta/llama-3.1-405b-instruct            No         Largest open model
meta/llama-3.1-70b-instruct             No         Good balance of speed and quality
meta/llama-3.1-8b-instruct              No         Fast, lightweight
deepseek/deepseek-r1                    Yes        Chain-of-thought reasoning
nvidia/llama-3.1-nemotron-70b-instruct  No         NVIDIA-tuned Llama
Model Naming
NIM models use the format org/model-name. When using with jaato, prefix with nim/ for provider routing: nim/meta/llama-3.3-70b-instruct.
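The routing prefix can be sketched as a simple prefix strip. resolve_model_name is an illustrative helper, not the actual jaato routing API:

```python
def resolve_model_name(routed_name: str) -> str:
    """Strip the 'nim/' routing prefix, leaving the org/model-name id.

    Illustrative sketch; the real jaato routing logic may differ.
    """
    prefix = "nim/"
    if routed_name.startswith(prefix):
        return routed_name[len(prefix):]
    return routed_name
```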
Connect to a model
# Llama 3.3 70B
provider.connect("meta/llama-3.3-70b-instruct")

# DeepSeek-R1 (reasoning)
provider.connect("deepseek/deepseek-r1")

# Nemotron (NVIDIA-tuned)
provider.connect("nvidia/llama-3.1-nemotron-70b-instruct")
Reasoning model output
provider.connect("deepseek/deepseek-r1")
provider.create_session(
    system_instruction="You are helpful.",
    tools=[]
)

response = provider.send_message(
    "What is 25 * 37?"
)

if response.thinking:
    print("=== Reasoning ===")
    print(response.thinking)
    print()

print("=== Answer ===")
print(response.text)

Authentication

NVIDIA's hosted NIM API requires an API key (nvapi-...). Self-hosted NIM containers run without authentication.
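A quick format check can catch typos before any network round trip. looks_like_nim_key is a hypothetical helper based only on the documented nvapi- prefix; real validation requires a request to the API:

```python
def looks_like_nim_key(key: str) -> bool:
    """Return True if key matches the documented hosted-key shape (nvapi-...)."""
    # Format check only -- a well-formed key can still be revoked or invalid
    return key.startswith("nvapi-") and len(key) > len("nvapi-")
```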

Get your API key from build.nvidia.com.

Credential Priority

Priority     Source
1 (highest)  JAATO_NIM_API_KEY environment variable
2            Stored credentials (.jaato/nim_auth.json)
3            api_key in ProviderConfig
Self-hosted endpoints need no key

Credential Storage

Credentials are stored in JSON files with Unix permissions 0600:

  • Project: .jaato/nim_auth.json (if .jaato/ exists)
  • User: ~/.jaato/nim_auth.json (fallback)
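Owner-only permissions on Unix can be set atomically at file creation rather than with a chmod after the fact. store_credentials is an illustrative sketch of that behavior, not the provider's actual code:

```python
import json
import os

def store_credentials(path: str, api_key: str) -> None:
    """Write a credentials JSON file readable only by the owner (0600)."""
    # os.open sets the mode at creation, avoiding a window where
    # the file briefly exists with looser permissions
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w") as f:
        json.dump({"api_key": api_key}, f)
```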
Environment variable
# .env file
JAATO_NIM_API_KEY=nvapi-your-key-here
JAATO_PROVIDER=nim
JAATO_NIM_MODEL=meta/llama-3.3-70b-instruct
Interactive login (TUI)
# Login - shows instructions
nim-auth login

# Store API key directly
nim-auth key nvapi-your-key-here

# Check status
nim-auth status

# Logout - clears stored credentials
nim-auth logout
Programmatic login
from shared.plugins.model_provider.nim.auth import (
    login_with_key,
    validate_api_key,
)

# Validate and store
creds = login_with_key(
    api_key="nvapi-your-key",
    on_message=lambda msg: print(msg)
)

# Or just validate
valid, detail = validate_api_key("nvapi-your-key")
print(f"Valid: {valid}")

Self-Hosted NIM

NVIDIA NIM containers can run locally or on your own infrastructure. Self-hosted instances on localhost or private networks don't require an API key.

Detected as Self-Hosted

The provider automatically detects self-hosted endpoints by hostname:

  • localhost / 127.0.0.1
  • 192.168.* / 10.* (private networks)
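The detection rules above amount to checking the hostname of the base URL. is_self_hosted is an illustrative heuristic built from the documented rules, assuming IP literals for the private-network cases:

```python
import ipaddress
from urllib.parse import urlparse

def is_self_hosted(base_url: str) -> bool:
    """Heuristic for the documented self-hosted detection."""
    host = urlparse(base_url).hostname or ""
    if host == "localhost":
        return True
    try:
        # is_private covers 127.0.0.1, 10.*, and 192.168.* among others
        return ipaddress.ip_address(host).is_private
    except ValueError:
        # Not an IP literal: assume a public hostname that needs a key
        return False
```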
Self-hosted configuration
# .env file for self-hosted
JAATO_PROVIDER=nim
JAATO_NIM_BASE_URL=http://localhost:8000/v1
JAATO_NIM_MODEL=meta/llama-3.1-8b-instruct
Docker example
# Pull and run a NIM container
docker run --gpus all \
  -p 8000:8000 \
  nvcr.io/nim/meta/llama-3.1-8b-instruct:latest

# Connect jaato
export JAATO_NIM_BASE_URL=http://localhost:8000/v1
export JAATO_PROVIDER=nim

Configuration

Environment Variables

Variable                  Default                              Description
JAATO_NIM_API_KEY         (none)                               API key (nvapi-...) for hosted NIM
JAATO_NIM_BASE_URL        https://integrate.api.nvidia.com/v1  API endpoint URL
JAATO_NIM_MODEL           (none)                               Default model name
JAATO_NIM_CONTEXT_LENGTH  32768                                Context window size override

ProviderConfig.extra Options

Key             Type  Default  Description
base_url        str   See env  Override API base URL
context_length  int   32768    Override context window
Full .env example
# .env
JAATO_PROVIDER=nim
JAATO_NIM_API_KEY=nvapi-your-key-here
JAATO_NIM_MODEL=meta/llama-3.3-70b-instruct
JAATO_NIM_CONTEXT_LENGTH=131072
Full programmatic config
from shared import load_provider, ProviderConfig

provider = load_provider("nim")
provider.initialize(ProviderConfig(
    api_key="nvapi-your-key",
    extra={
        'base_url': 'https://integrate.api.nvidia.com/v1',
        'context_length': 131072,
    }
))

provider.connect("meta/llama-3.3-70b-instruct")

provider.create_session(
    system_instruction="You are a helpful assistant.",
    tools=[...]
)

response = provider.send_message("Hello!")
print(response.text)

Error Handling

The provider maps OpenAI SDK exceptions to NIM-specific error types for consistent error handling.

Exception            Cause
APIKeyNotFoundError  No API key found and endpoint is not self-hosted
AuthenticationError  API key rejected (401/403)
RateLimitError       Rate limit exceeded (429)
ModelNotFoundError   Invalid model name (404)
ContextLimitError    Input too large for model context
InfrastructureError  Server error (5xx) or connection failure
Handle errors
from shared.plugins.model_provider.nim.errors import (
    APIKeyNotFoundError,
    AuthenticationError,
    RateLimitError,
)

try:
    provider.initialize(config)
    provider.send_message("Hello")
except APIKeyNotFoundError:
    print("Set JAATO_NIM_API_KEY or run: nim-auth login")
except AuthenticationError:
    print("Invalid API key")
except RateLimitError as e:
    if e.retry_after:
        print(f"Rate limited, retry in {e.retry_after}s")

Corporate Proxy Support

The provider supports corporate proxy environments including:

  • HTTP/HTTPS proxies via standard environment variables
  • Custom CA certificate bundles
  • Kerberos/SPNEGO proxy authentication

Configuration uses the same proxy settings as other jaato providers. See the Proxy & Kerberos guide for details.

Proxy configuration
# Standard proxy
export HTTPS_PROXY=http://proxy.corp.com:8080

# Corporate CA bundle
export REQUESTS_CA_BUNDLE=/path/to/ca-bundle.crt

# Kerberos proxy auth
export JAATO_KERBEROS_PROXY=true