Anthropic Provider

Access Claude models (Claude 3.5, Claude 4, Claude Opus 4.5) through Anthropic's API with support for extended thinking, prompt caching, and function calling.

Name anthropic
SDK anthropic
Models Claude 3.5, Claude 4, Claude Opus 4.5
Max Context 200K tokens (all models)

Available Models

Model                       Context  Extended Thinking
claude-opus-4-5-20251101    200K     Yes
claude-sonnet-4-20250514    200K     Yes
claude-haiku-4-20250414     200K     No
claude-3-5-sonnet-20241022  200K     Yes
claude-3-5-haiku-20241022   200K     No
Quick start
from shared import load_provider, ProviderConfig

provider = load_provider("anthropic")

# Initialize with API key
provider.initialize(ProviderConfig(
    api_key="sk-ant-...",  # or from ANTHROPIC_API_KEY env
))

# Connect to a model
provider.connect("claude-sonnet-4-20250514")

# Start chatting
provider.create_session(
    system_instruction="You are helpful."
)
response = provider.send_message("Hello!")
print(response.text)
With extended thinking
provider.initialize(ProviderConfig(
    api_key="sk-ant-...",
    extra={
        'enable_thinking': True,
        'thinking_budget': 10000
    }
))

# Response includes reasoning
response = provider.send_message("Complex question")
print(f"Thinking: {response.thinking}")
print(f"Answer: {response.text}")

Extended Thinking

Extended thinking allows Claude to show its reasoning process before generating a response. This is useful for complex problems, debugging model behavior, or when you want transparency into the reasoning.

When to Use Extended Thinking

  • Complex reasoning - Math, logic, multi-step problems
  • Code analysis - Understanding large codebases
  • Decision making - Weighing pros and cons
  • Debugging - Understanding why the model chose an answer

Configuration Options

Option           Type  Default  Description
enable_thinking  bool  False    Enable extended thinking
thinking_budget  int   10000    Max tokens for thinking
Thinking Budget
The thinking budget limits how many tokens Claude can use for internal reasoning. Higher values allow more thorough analysis but increase latency and cost. Start with 10,000 and adjust based on task complexity.
Enable extended thinking
provider.initialize(ProviderConfig(
    api_key="sk-ant-...",
    extra={
        'enable_thinking': True,
        'thinking_budget': 15000  # More tokens for complex tasks
    }
))
Access thinking in response
response = provider.send_message(
    "Explain the trade-offs between microservices and monolith"
)

# Check if thinking is present
if response.has_thinking:
    print("=== Claude's reasoning ===")
    print(response.thinking)
    print()

print("=== Final answer ===")
print(response.text)
Example thinking output
=== Claude's reasoning ===
Let me think through the key trade-offs:

For microservices:
- Pros: Independent deployment, technology flexibility...
- Cons: Distributed systems complexity, network latency...

For monolith:
- Pros: Simpler to develop initially, no network overhead...
- Cons: Deployment coupling, scaling limitations...

I should structure this by category: development, deployment,
scaling, and operations...

=== Final answer ===
Here are the key trade-offs between microservices and monolith architectures:
...

Prompt Caching

Prompt caching reduces cost by up to 90% and improves latency by up to 85% for repeated prompts. It's automatically applied to system instructions and tool definitions.

How It Works

  • Cached content is stored for 5 minutes (refreshed on each use)
  • Cached reads cost 0.1x the normal input price
  • Cache writes cost 1.25x but only happen once
  • System instructions and tools are automatically cached

Best Use Cases

Scenario                  Savings
Long system instructions  High - reused every message
Many tool definitions     High - reused every message
Document analysis         High - document cached across questions
Simple conversations      Low - little to cache
Automatic Caching
When caching is enabled, the provider automatically adds cache_control to system instructions and tool definitions. No manual configuration needed.
Enable prompt caching
provider.initialize(ProviderConfig(
    api_key="sk-ant-...",
    extra={
        'enable_caching': True
    }
))
Cost comparison (100K token prompt)
Without caching:
  100K tokens × $3.00/MTok = $0.30 per request

With caching:
  First request: $0.375 (1.25x write cost)
  Subsequent:    $0.03  (0.1x read cost)

After 2 requests:  $0.405 vs $0.60  (~33% savings)
After 10 requests: $0.645 vs $3.00  (~78% savings)
Per cached request, the read discount approaches the full 90% savings.
Caching with tools
from jaato import ToolSchema

tools = [
    ToolSchema(name='search', description='...', parameters={...}),
    ToolSchema(name='execute', description='...', parameters={...}),
    ToolSchema(name='read_file', description='...', parameters={...}),
    # ... many more tools
]

provider.create_session(
    system_instruction="You are an AI assistant...",
    tools=tools  # Tools and system cached automatically
)

Function Calling

Claude supports function calling (tool use) for interacting with external systems. Tools are defined using JSON Schema and automatically converted to Anthropic's format.

Tool Schema Differences

Anthropic uses input_schema instead of parameters. The provider handles this conversion automatically.

jaato Format       Anthropic Format
parameters         input_schema
function_call      tool_use block
function_response  tool_result block
Automatic Conversion
You don't need to handle Anthropic-specific formats. Use the standard ToolSchema type and the provider converts automatically.
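The mapping in the table amounts to a one-field rename. An illustrative sketch of the conversion the provider performs internally (the helper name is ours):

```python
def to_anthropic_tool(name: str, description: str, parameters: dict) -> dict:
    """Convert a jaato-style tool definition to Anthropic's wire format.

    The structural difference is that Anthropic calls the JSON Schema
    field `input_schema` rather than `parameters`.
    """
    return {
        "name": name,
        "description": description,
        "input_schema": parameters,
    }

tool = to_anthropic_tool(
    "get_weather",
    "Get current weather for a location",
    {"type": "object",
     "properties": {"location": {"type": "string"}},
     "required": ["location"]},
)
```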
Define and use tools
from jaato import ToolSchema, ToolResult

# Define tool with standard ToolSchema
tools = [ToolSchema(
    name='get_weather',
    description='Get current weather for a location',
    parameters={
        "type": "object",
        "properties": {
            "location": {"type": "string", "description": "City name"}
        },
        "required": ["location"]
    }
)]

# Create session with tools
provider.create_session(
    system_instruction="You are a weather assistant.",
    tools=tools
)

# Model may request tool use
response = provider.send_message("What's the weather in Tokyo?")

if response.function_calls:
    fc = response.function_calls[0]
    print(f"Tool: {fc.name}, Args: {fc.args}")
    # Tool: get_weather, Args: {'location': 'Tokyo'}

    # Send result back
    result = ToolResult(
        call_id=fc.id,
        name=fc.name,
        result={"temp": 22, "condition": "sunny"}
    )
    response = provider.send_tool_results([result])
    print(response.text)
    # "The weather in Tokyo is 22°C and sunny!"
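The single round above generalizes to a loop: keep executing requested tools and returning results until the model answers in plain text. A sketch against the same provider interface (the dispatch dict and loop are ours, not part of the provider; the local dataclass stands in for jaato's ToolResult):

```python
from dataclasses import dataclass

@dataclass
class ToolResult:  # stand-in for jaato.ToolResult
    call_id: str
    name: str
    result: dict

def run_tool_loop(provider, user_message: str, handlers: dict,
                  max_rounds: int = 5) -> str:
    """Drive a conversation until the model stops requesting tools."""
    response = provider.send_message(user_message)
    for _ in range(max_rounds):
        if not response.function_calls:
            return response.text
        results = []
        for fc in response.function_calls:
            handler = handlers[fc.name]  # KeyError signals an unknown tool
            results.append(ToolResult(
                call_id=fc.id,
                name=fc.name,
                result=handler(**fc.args),
            ))
        response = provider.send_tool_results(results)
    raise RuntimeError("tool loop exceeded max_rounds")
```

Capping the rounds guards against a model that keeps requesting tools indefinitely.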

Authentication

The Anthropic provider supports three authentication methods, listed in priority order:

1. PKCE OAuth Login (Recommended)

For Claude Pro/Max subscribers. Uses a browser-based OAuth flow; no API key is needed, and requests draw on your subscription quota.

2. OAuth Token

A token starting with sk-ant-oat01-..., obtained via claude setup-token. Uses subscription quota.

3. API Key

A token starting with sk-ant-api03-..., created in the Anthropic Console. Uses API credits (pay-per-use).
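The two token prefixes make the credential kind easy to distinguish. A minimal sketch (the helper name is ours):

```python
def classify_token(token: str) -> str:
    """Classify an Anthropic credential by its documented prefix."""
    if token.startswith("sk-ant-oat01-"):
        return "oauth"    # subscription quota
    if token.startswith("sk-ant-api03-"):
        return "api_key"  # pay-per-use API credits
    return "unknown"
```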

Environment Variables

Variable              Description
ANTHROPIC_API_KEY     API key (sk-ant-api03-...) - uses API credits
ANTHROPIC_AUTH_TOKEN  OAuth token (sk-ant-oat01-...) - uses subscription
Cost Savings
OAuth tokens use your Claude Pro/Max subscription instead of API credits. For heavy usage, this can save significant costs.
Environment variables
# Option 1: API Key (uses API credits)
ANTHROPIC_API_KEY=sk-ant-api03-...

# Option 2: OAuth Token (uses subscription)
ANTHROPIC_AUTH_TOKEN=sk-ant-oat01-...

# Option 3: PKCE OAuth - no env var needed
OAuth login (browser-based)
from shared.plugins.model_provider.anthropic import oauth_login

# Trigger browser-based OAuth
token = oauth_login(
    on_message=lambda msg: print(msg)
)
# Opens browser, user logs in
# Token is automatically saved

# Or via JaatoClient
from jaato import JaatoClient  # assuming JaatoClient is exported by jaato

client = JaatoClient(provider_name="anthropic")
client.connect(None, None, "claude-sonnet-4-20250514")
client.verify_auth(allow_interactive=True)
Get OAuth token via CLI
# Install Claude CLI
npm install -g @anthropic-ai/claude-code

# Get OAuth token
claude setup-token

# Token saved to ~/.claude/settings.json

Error Handling

The provider raises specific exceptions with actionable error messages for common failure scenarios.

Exception            Cause
APIKeyNotFoundError  No API key in env or config
APIKeyInvalidError   API key rejected (401)
RateLimitError       Too many requests (429)
OverloadedError      Anthropic API overloaded (529)
ContextLimitError    Context window exceeded
ModelNotFoundError   Invalid model ID
Handle errors
from shared.plugins.model_provider.anthropic.errors import (
    APIKeyNotFoundError,
    APIKeyInvalidError,
    RateLimitError,
    OverloadedError,
)

try:
    provider.initialize(config)
    provider.send_message("Hello")
except APIKeyNotFoundError:
    print("Set ANTHROPIC_API_KEY environment variable")
except APIKeyInvalidError:
    print("Check your API key at console.anthropic.com")
except RateLimitError:
    print("Too many requests, retry later")
except OverloadedError:
    print("Anthropic API is busy, retry in a moment")
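RateLimitError and OverloadedError are both transient, so they are natural candidates for retry with exponential backoff. A generic sketch (the helper is ours; pass the provider's transient exception types as retryable):

```python
import time

def with_retry(fn, retryable, attempts: int = 4, base_delay: float = 1.0):
    """Call fn(), retrying transient failures with exponential backoff."""
    for attempt in range(attempts):
        try:
            return fn()
        except retryable:
            if attempt == attempts - 1:
                raise  # out of retries, surface the error
            time.sleep(base_delay * 2 ** attempt)  # 1s, 2s, 4s, ...

# Usage with the provider's transient exceptions:
# reply = with_retry(lambda: provider.send_message("Hello"),
#                    retryable=(RateLimitError, OverloadedError))
```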

Configuration Summary

Environment Variables

Variable              Required  Description
ANTHROPIC_API_KEY     No*       API key (uses API credits)
ANTHROPIC_AUTH_TOKEN  No*       OAuth token (uses subscription quota)

* At least one authentication method is required: an API key, an OAuth token, or an interactive PKCE OAuth login.

ProviderConfig.extra Options

Key              Type  Default  Description
enable_caching   bool  False    Enable prompt caching
enable_thinking  bool  False    Enable extended thinking
thinking_budget  int   10000    Max thinking tokens
Full configuration example
from shared import load_provider, ProviderConfig

provider = load_provider("anthropic")
provider.initialize(ProviderConfig(
    api_key="sk-ant-...",  # Or use ANTHROPIC_API_KEY env
    extra={
        'enable_caching': True,     # 90% cost reduction
        'enable_thinking': True,    # Reasoning traces
        'thinking_budget': 15000,   # Max thinking tokens
    }
))

provider.connect("claude-sonnet-4-20250514")

# Create session
provider.create_session(
    system_instruction="You are a helpful assistant.",
    tools=[...]  # Optional tools
)

# Chat
response = provider.send_message("Explain quantum computing")
print(f"Thinking: {response.thinking}")
print(f"Answer: {response.text}")
.env file example
# .env
ANTHROPIC_API_KEY=sk-ant-api03-...
JAATO_PROVIDER=anthropic
MODEL_NAME=claude-sonnet-4-20250514