# NVIDIA NIM Provider
Access AI models through NVIDIA NIM (Inference Microservices), supporting both NVIDIA's hosted API at build.nvidia.com and self-hosted NIM containers. Uses the OpenAI-compatible chat completions API.
| Field | Value |
|---|---|
| Provider Name | `nim` |
| Module | `shared.plugins.model_provider.nim` |
| SDK | `openai` (OpenAI-compatible API) |
| Auth | API key (hosted) or none (self-hosted) |
## Highlights
- NIM Catalog — Access Llama, DeepSeek-R1, Nemotron, Mistral, and more
- Self-hosted support — Run NIM containers locally with no API key
- Reasoning models — DeepSeek-R1 with chain-of-thought via `reasoning_content`
- Function calling — Full tool use support via OpenAI-compatible API
- Streaming — Real-time streaming with cancellation support
The provider uses the `openai` Python SDK as its transport layer, making it compatible with any model in the NIM catalog.
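Because the endpoint is OpenAI-compatible, a chat completions request can be sketched with nothing but the standard library. The payload shape below follows the OpenAI chat format; the helper name is illustrative, not part of jaato:

```python
import json

NIM_HOSTED_URL = "https://integrate.api.nvidia.com/v1"  # hosted endpoint

def build_chat_request(model, user_message, api_key=None, base_url=NIM_HOSTED_URL):
    """Build URL, headers, and JSON body for an OpenAI-style /chat/completions call."""
    headers = {"Content-Type": "application/json"}
    if api_key:  # self-hosted NIM endpoints can omit the key
        headers["Authorization"] = f"Bearer {api_key}"
    body = {
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
    }
    return f"{base_url}/chat/completions", headers, json.dumps(body)
```

POSTing that body to the returned URL is what the `openai` SDK does under the hood for both hosted and self-hosted endpoints.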
```python
from jaato import JaatoClient

client = JaatoClient(provider_name="nim")
client.connect(
    project=None,
    location=None,
    model="meta/llama-3.3-70b-instruct"
)
client.configure_tools(registry)

response = client.send_message(
    "Hello from NIM!",
    on_output=on_output
)
```
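Streamed text arrives through the `on_output` callback, and streaming supports cancellation. The exact cancellation API is not shown in this document, but the cooperative pattern can be sketched as follows; every name here is illustrative, not jaato's actual interface:

```python
import threading

class CancelFlag:
    """Cooperative cancellation shared between the caller and the stream loop."""
    def __init__(self):
        self._event = threading.Event()

    def cancel(self):
        self._event.set()

    def is_set(self):
        return self._event.is_set()

def drain_stream(chunks, on_output, flag):
    """Forward streamed chunks to on_output, stopping early if cancelled."""
    delivered = []
    for chunk in chunks:
        if flag.is_set():
            break
        on_output(chunk)
        delivered.append(chunk)
    return delivered
```

The key property is that cancellation is checked between chunks, so a cancel issued from another thread (for example, a UI stop button) takes effect at the next chunk boundary.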
```python
from shared import load_provider, ProviderConfig

provider = load_provider("nim")
provider.initialize(ProviderConfig(
    extra={
        'base_url': 'http://localhost:8000/v1',
    }
))

# No API key needed for self-hosted
provider.connect("meta/llama-3.1-8b-instruct")

response = provider.send_message("Hello!")
print(response.text)
```
## Available Models
NIM hosts a large catalog of models. Below are some popular options. See build.nvidia.com for the full catalog.
### Popular Models
| Model | Reasoning | Notes |
|---|---|---|
| `meta/llama-3.3-70b-instruct` | No | Strong general-purpose model |
| `meta/llama-3.1-405b-instruct` | No | Largest open model |
| `meta/llama-3.1-70b-instruct` | No | Good balance of speed and quality |
| `meta/llama-3.1-8b-instruct` | No | Fast, lightweight |
| `deepseek/deepseek-r1` | Yes | Chain-of-thought reasoning |
| `nvidia/llama-3.1-nemotron-70b-instruct` | No | NVIDIA-tuned Llama |
Model names follow the format `org/model-name`. When using with jaato, prefix with `nim/` for provider routing: `nim/meta/llama-3.3-70b-instruct`.
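Since the NIM model ID itself contains a slash, routing code has to split only on the first separator. A minimal sketch (the helper name is illustrative, not jaato's actual function):

```python
def split_routing_name(name):
    """Split 'nim/meta/llama-3.3-70b-instruct' into (provider, model).

    Only the first '/' separates provider from model; the rest belongs
    to the NIM model ID (org/model-name).
    """
    provider, model = name.split("/", 1)
    return provider, model
```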
```python
# Llama 3.3 70B
provider.connect("meta/llama-3.3-70b-instruct")

# DeepSeek-R1 (reasoning)
provider.connect("deepseek/deepseek-r1")

# Nemotron (NVIDIA-tuned)
provider.connect("nvidia/llama-3.1-nemotron-70b-instruct")
```
DeepSeek-R1 exposes its chain-of-thought through `response.thinking`:

```python
provider.connect("deepseek/deepseek-r1")

provider.create_session(
    system_instruction="You are helpful.",
    tools=[]
)

response = provider.send_message(
    "What is 25 * 37?"
)

if response.thinking:
    print("=== Reasoning ===")
    print(response.thinking)
    print()

print("=== Answer ===")
print(response.text)
```
## Authentication

NVIDIA's hosted NIM API requires an API key (`nvapi-...`).
Self-hosted NIM containers run without authentication.
### Get Your API Key
- Visit build.nvidia.com
- Sign in → Settings → API Keys → Generate
### Credential Priority
| Priority | Source |
|---|---|
| 1 (highest) | `JAATO_NIM_API_KEY` environment variable |
| 2 | Stored credentials (`.jaato/nim_auth.json`) |
| 3 | `api_key` in ProviderConfig |
| — | Self-hosted endpoints need no key |
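The lookup order above can be sketched as a simple chain. The function and file layout here are illustrative, not the provider's actual implementation:

```python
import json
import os
from pathlib import Path

def resolve_api_key(config_key=None, auth_path="~/.jaato/nim_auth.json"):
    """Return the first key found: env var, then stored file, then config."""
    # 1. Environment variable wins
    env_key = os.environ.get("JAATO_NIM_API_KEY")
    if env_key:
        return env_key
    # 2. Stored credentials file
    path = Path(auth_path).expanduser()
    if path.exists():
        stored = json.loads(path.read_text()).get("api_key")
        if stored:
            return stored
    # 3. Explicit ProviderConfig value; may be None for self-hosted endpoints
    return config_key
```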
### Credential Storage

Credentials are stored in JSON files with Unix permissions `0600`:

- Project: `.jaato/nim_auth.json` (if `.jaato/` exists)
- User: `~/.jaato/nim_auth.json` (fallback)
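Creating the file with mode `0600` from the start (rather than chmod-ing afterwards) can be sketched with `os.open`; the helper below is illustrative, not the provider's code:

```python
import json
import os

def store_credentials(path, api_key):
    """Write credentials as JSON to a file created with mode 0600."""
    # O_CREAT with an explicit mode avoids a window where the file
    # briefly exists with wider permissions.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w") as f:
        json.dump({"api_key": api_key}, f)
```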
```bash
# .env file
JAATO_NIM_API_KEY=nvapi-your-key-here
JAATO_PROVIDER=nim
JAATO_NIM_MODEL=meta/llama-3.3-70b-instruct
```
```bash
# Login - shows instructions
nim-auth login

# Store API key directly
nim-auth key nvapi-your-key-here

# Check status
nim-auth status

# Logout - clears stored credentials
nim-auth logout
```
```python
from shared.plugins.model_provider.nim.auth import (
    login_with_key,
    validate_api_key,
)

# Validate and store
creds = login_with_key(
    api_key="nvapi-your-key",
    on_message=lambda msg: print(msg)
)

# Or just validate
valid, detail = validate_api_key("nvapi-your-key")
print(f"Valid: {valid}")
```
## Self-Hosted NIM
NVIDIA NIM containers can run locally or on your own infrastructure. Self-hosted instances on localhost or private networks don't require an API key.
### Detected as Self-Hosted
The provider automatically detects self-hosted endpoints by hostname:
- `localhost` / `127.0.0.1`
- `192.168.*` / `10.*` (private networks)
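That hostname rule can be sketched as follows; `is_self_hosted` is an illustration of the check above, not the provider's actual function:

```python
import ipaddress
from urllib.parse import urlparse

def is_self_hosted(base_url):
    """True for localhost and private-network hosts, which need no API key."""
    host = urlparse(base_url).hostname or ""
    if host == "localhost":
        return True
    try:
        # Covers 127.0.0.1 (loopback) plus 10.* and 192.168.* ranges
        return ipaddress.ip_address(host).is_private
    except ValueError:  # not an IP literal, e.g. a public hostname
        return False
```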
```bash
# .env file for self-hosted
JAATO_PROVIDER=nim
JAATO_NIM_BASE_URL=http://localhost:8000/v1
JAATO_NIM_MODEL=meta/llama-3.1-8b-instruct
```
```bash
# Pull and run a NIM container
docker run --gpus all \
  -p 8000:8000 \
  nvcr.io/nim/meta/llama-3.1-8b-instruct:latest

# Connect jaato
export JAATO_NIM_BASE_URL=http://localhost:8000/v1
export JAATO_PROVIDER=nim
```
## Configuration

### Environment Variables

| Variable | Default | Description |
|---|---|---|
| `JAATO_NIM_API_KEY` | — | API key (`nvapi-...`) for hosted NIM |
| `JAATO_NIM_BASE_URL` | `https://integrate.api.nvidia.com/v1` | API endpoint URL |
| `JAATO_NIM_MODEL` | — | Default model name |
| `JAATO_NIM_CONTEXT_LENGTH` | `32768` | Context window size override |
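Reading the context-length override with its documented default can be sketched as (helper name illustrative):

```python
import os

DEFAULT_CONTEXT_LENGTH = 32768  # documented default

def context_length(env=None):
    """Return JAATO_NIM_CONTEXT_LENGTH as an int, or the default."""
    env = os.environ if env is None else env
    raw = env.get("JAATO_NIM_CONTEXT_LENGTH")
    return int(raw) if raw else DEFAULT_CONTEXT_LENGTH
```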
### ProviderConfig.extra Options

| Key | Type | Default | Description |
|---|---|---|---|
| `base_url` | str | See env | Override API base URL |
| `context_length` | int | `32768` | Override context window |
```bash
# .env
JAATO_PROVIDER=nim
JAATO_NIM_API_KEY=nvapi-your-key-here
JAATO_NIM_MODEL=meta/llama-3.3-70b-instruct
JAATO_NIM_CONTEXT_LENGTH=131072
```
```python
from shared import load_provider, ProviderConfig

provider = load_provider("nim")
provider.initialize(ProviderConfig(
    api_key="nvapi-your-key",
    extra={
        'base_url': 'https://integrate.api.nvidia.com/v1',
        'context_length': 131072,
    }
))
provider.connect("meta/llama-3.3-70b-instruct")
provider.create_session(
    system_instruction="You are a helpful assistant.",
    tools=[...]
)

response = provider.send_message("Hello!")
print(response.text)
```
## Error Handling

The provider maps OpenAI SDK exceptions to NIM-specific error types for consistent error handling.
| Exception | Cause |
|---|---|
| `APIKeyNotFoundError` | No API key found and endpoint is not self-hosted |
| `AuthenticationError` | API key rejected (401/403) |
| `RateLimitError` | Rate limit exceeded (429) |
| `ModelNotFoundError` | Invalid model name (404) |
| `ContextLimitError` | Input too large for model context |
| `InfrastructureError` | Server error (5xx) or connection failure |
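The table's mapping can be sketched as a status-code dispatch. The stand-in exception classes below mirror the names above, but the function itself is illustrative, not the provider's implementation:

```python
class NimError(Exception):
    """Base stand-in for the provider's error hierarchy."""

class AuthenticationError(NimError): pass
class RateLimitError(NimError): pass
class ModelNotFoundError(NimError): pass
class InfrastructureError(NimError): pass

def error_for_status(status):
    """Map an HTTP status code to the matching error class."""
    if status in (401, 403):
        return AuthenticationError
    if status == 429:
        return RateLimitError
    if status == 404:
        return ModelNotFoundError
    if status >= 500:
        return InfrastructureError
    return NimError
```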
```python
from shared.plugins.model_provider.nim.errors import (
    APIKeyNotFoundError,
    AuthenticationError,
    RateLimitError,
)

try:
    provider.initialize(config)
    provider.send_message("Hello")
except APIKeyNotFoundError:
    print("Set JAATO_NIM_API_KEY or run: nim-auth login")
except AuthenticationError:
    print("Invalid API key")
except RateLimitError as e:
    if e.retry_after:
        print(f"Rate limited, retry in {e.retry_after}s")
```
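A retry loop that honors `retry_after` can be sketched like this. The `RateLimitError` stand-in mimics the real class so the example is self-contained; the retry helper itself is illustrative:

```python
import time

class RateLimitError(Exception):
    """Stand-in for the provider's RateLimitError with optional retry_after."""
    retry_after = None

def send_with_retry(send, message, max_attempts=3, default_backoff=2.0):
    """Call send(message), sleeping between attempts on rate limits."""
    for attempt in range(max_attempts):
        try:
            return send(message)
        except RateLimitError as e:
            if attempt == max_attempts - 1:
                raise  # out of attempts
            # Prefer the server-provided hint, fall back to a fixed backoff
            time.sleep(e.retry_after or default_backoff)
```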
## Corporate Proxy Support

The provider supports corporate proxy environments, including:
- HTTP/HTTPS proxies via standard environment variables
- Custom CA certificate bundles
- Kerberos/SPNEGO proxy authentication
Configuration uses the same proxy settings as other jaato providers. See the Proxy & Kerberos guide for details.
```bash
# Standard proxy
export HTTPS_PROXY=http://proxy.corp.com:8080

# Corporate CA bundle
export REQUESTS_CA_BUNDLE=/path/to/ca-bundle.crt

# Kerberos proxy auth
export JAATO_KERBEROS_PROXY=true
```