GitHub Models Provider

Access multiple AI models (GPT, Claude, Gemini, Llama, Mistral) through GitHub's unified Models API. Supports individual, organization, and enterprise billing with fine-grained access control.

Name github_models
SDK azure-ai-inference
Models GPT-4o, Claude 3.5, Gemini, Llama, Mistral
Max Context Up to 200K tokens (Claude), 128K (GPT-4o)

Available Models

Provider Models Context
OpenAI gpt-4o, gpt-4o-mini, o1-preview, o1-mini 128K
Anthropic claude-3.5-sonnet, claude-3-opus, claude-3-haiku 200K
Google gemini-1.5-pro, gemini-1.5-flash 1M
Meta llama-3.1-405b, llama-3.1-70b, llama-3.1-8b 128K
Mistral mistral-large, mistral-small 32K
Quick start
from shared import load_provider, ProviderConfig

provider = load_provider("github_models")

# Initialize with token
provider.initialize(ProviderConfig(
    api_key="github_pat_...",  # or from GITHUB_TOKEN env
))

# Connect to a model
provider.connect("openai/gpt-4o")

# Start chatting
provider.create_session(
    system_instruction="You are helpful."
)
response = provider.send_message("Hello!")
print(response.text)
With organization billing
provider.initialize(ProviderConfig(
    api_key="github_pat_...",
    extra={
        'organization': 'my-org',
        'enterprise': 'my-enterprise'
    }
))

Free Tier (Personal Use)

All GitHub accounts have free access to GitHub Models - no Copilot subscription or credit card required. Perfect for prototyping, learning, and personal projects.

Free Tier Rate Limits

Model Tier RPM RPD Tokens/Request
High (GPT-4o, Claude) 10 50 8K in / 4K out
Low (GPT-4o-mini, Llama) 15 150 8K in / 4K out
Embedding 15 150 64K in

RPM = Requests per minute, RPD = Requests per day

What You Get Free

  • No credit card - just a GitHub account
  • All models - GPT-4o, Claude, Gemini, Llama, Mistral
  • Playground UI - test at github.com/marketplace/models
  • REST API - programmatic access with your PAT
  • Per-model limits - each model has separate quota
Tip: Maximize Free Usage
Each model has its own rate limit. Use GPT-4o-mini (150/day) for simple tasks and save GPT-4o (50/day) for complex ones. Because quotas are per-model, you can make roughly 2,000 requests per day in total across all models.
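The per-model quotas above suggest a simple fallback pattern: when one model's daily quota is exhausted, retry the same prompt on the next model in a preference list. This is a hedged sketch, not provider code; `send_with_fallback` and the `send` callable are hypothetical, and the local `RateLimitError` class stands in for the provider's exception of the same name.

```python
# Preferred order: cheap/high-quota models first, GPT-4o saved for last.
FALLBACK_ORDER = [
    "openai/gpt-4o-mini",   # 150 requests/day
    "meta/llama-3.1-8b",    # low tier, separate quota
    "openai/gpt-4o",        # 50 requests/day
]

class RateLimitError(Exception):
    """Stand-in for the provider's RateLimitError."""

def send_with_fallback(send, prompt, models=FALLBACK_ORDER):
    """Try each model in turn; each model has its own free-tier quota."""
    last_err = None
    for model in models:
        try:
            return send(model, prompt)
        except RateLimitError as e:
            last_err = e  # quota exhausted for this model, try the next
    raise last_err
```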
Quick start (personal account)
1. Go to https://github.com/settings/tokens?type=beta
2. Click "Generate new token"
3. Name: "jaato-models"
4. Expiration: 90 days
5. Account permissions → Models → Read
6. Click "Generate token"
7. Copy the token
Set token and use
# Set environment variable
export GITHUB_TOKEN=github_pat_...your-token
from shared import load_provider

provider = load_provider("github_models")
provider.initialize()  # Uses GITHUB_TOKEN
provider.connect("openai/gpt-4o-mini")  # Higher free limit

provider.create_session()
response = provider.send_message("Hello!")
print(response.text)
Test with curl
curl -X POST \
  -H "Authorization: Bearer $GITHUB_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "openai/gpt-4o-mini",
    "messages": [{"role":"user","content":"Hi"}]
  }' \
  https://models.inference.ai.azure.com/chat/completions

Authentication

GitHub Models uses GitHub tokens for authentication. The recommended approach is a fine-grained Personal Access Token (PAT) with the models: read permission.

Token Type Prefix Enterprise SSO Best For
Fine-grained PAT github_pat_ Auto-authorized Recommended for all use cases
Classic PAT ghp_ Manual per-org Simpler setup, less secure
GitHub App ghs_ Supported Automation, CI/CD
OAuth token gho_ Supported User-facing apps
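The prefix column above can be checked programmatically, which is handy for diagnostics before making an API call. A minimal sketch; `token_type` is an illustrative helper, not part of the provider:

```python
# Prefixes from the token table above.
TOKEN_PREFIXES = {
    "github_pat_": "fine-grained PAT",
    "ghp_": "classic PAT",
    "ghs_": "GitHub App token",
    "gho_": "OAuth token",
}

def token_type(token: str) -> str:
    """Classify a GitHub token by its prefix."""
    for prefix, kind in TOKEN_PREFIXES.items():
        if token.startswith(prefix):
            return kind
    return "unknown"
```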

Fine-Grained PAT (Recommended)

Fine-grained PATs provide granular permissions and are automatically authorized for SSO-enabled organizations during creation.

  1. Go to GitHub Token Settings
  2. Click "Generate new token"
  3. Set name and expiration
  4. Under Account permissions → Models → select Read
  5. Click "Generate token"

Classic PAT

Classic PATs are simpler but less secure. No specific scope is needed for GitHub Models, but you must manually authorize for each SSO-enabled organization.

Enterprise SSO
For organizations with SAML SSO enabled, fine-grained PATs are automatically authorized during creation. Classic PATs require manual authorization at github.com/settings/tokens.
Create fine-grained PAT
1. Go to https://github.com/settings/tokens?type=beta
2. Click "Generate new token"
3. Set token name: "jaato-models"
4. Set expiration (e.g., 90 days)
5. Expand "Account permissions"
6. Find "Models" → Select "Read"
7. Click "Generate token"
8. Copy the token (starts with github_pat_)
Set environment variable
# In your .env file (add to .gitignore)
GITHUB_TOKEN=github_pat_...your-token

# Or export directly (session only)
export GITHUB_TOKEN=github_pat_...your-token
Verify token with gh CLI
# Check authentication works
gh api /user -q '.login'

# List available organizations
gh api /user/orgs -q '.[].login'
Test API access
# Test GitHub Models endpoint
curl -H "Authorization: Bearer $GITHUB_TOKEN" \
  https://models.inference.ai.azure.com/models

Enterprise Configuration

GitHub Models supports enterprise billing attribution and policy compliance. Usage can be tracked at the organization or enterprise level.

Organization Billing

Set the organization parameter to attribute API usage to a specific organization for billing purposes.

Enterprise Policy

Enterprise owners control GitHub Models access through settings at:

https://github.com/enterprises/{enterprise}/settings/models

Policy Effect
Enabled GitHub Models available for all orgs
No policy Org admins decide individually
Disabled GitHub Models blocked enterprise-wide
Models Disabled Error
If you receive 401 "GitHub Models is disabled", contact your enterprise administrator to enable GitHub Models in the enterprise settings.
Organization-attributed billing
from shared import load_provider, ProviderConfig

provider = load_provider("github_models")
provider.initialize(ProviderConfig(
    api_key="github_pat_...",
    extra={
        'organization': 'my-org',  # Bill to this org
    }
))

# API requests now attributed to my-org
Full enterprise config
provider.initialize(ProviderConfig(
    api_key="github_pat_...",
    extra={
        'organization': 'my-org',
        'enterprise': 'my-enterprise',
    }
))
Environment-based config
# .env file
GITHUB_TOKEN=github_pat_...
JAATO_GITHUB_ORGANIZATION=my-org
JAATO_GITHUB_ENTERPRISE=my-enterprise
# Auto-detects from environment
provider = load_provider("github_models")
provider.initialize()  # Uses env vars
Find your organization
# From GitHub URL
# https://github.com/my-org → organization = "my-org"

# Or via gh CLI
gh api /user/orgs -q '.[].login'

Environment Variables

Required

Variable Description
GITHUB_TOKEN GitHub PAT with models: read permission

Optional

Variable Description
JAATO_GITHUB_ORGANIZATION Organization for billing attribution
JAATO_GITHUB_ENTERPRISE Enterprise name (for context/debugging)
JAATO_GITHUB_ENDPOINT Override API endpoint URL

Default Endpoint

The default endpoint is https://models.inference.ai.azure.com. Override with JAATO_GITHUB_ENDPOINT for GitHub Enterprise Server or proxy configurations.

.env for individual use
# Minimal setup
GITHUB_TOKEN=github_pat_...your-token
.env for organization
# With organization billing
GITHUB_TOKEN=github_pat_...your-token
JAATO_GITHUB_ORGANIZATION=my-org
.env for enterprise
# Full enterprise setup
GITHUB_TOKEN=github_pat_...your-token
JAATO_GITHUB_ORGANIZATION=my-org
JAATO_GITHUB_ENTERPRISE=my-enterprise
Custom endpoint (GHES)
# For GitHub Enterprise Server
GITHUB_TOKEN=github_pat_...
JAATO_GITHUB_ENDPOINT=https://models.ghes.example.com

Error Handling

The provider uses fail-fast validation and throws specific exceptions with actionable error messages.

Exception Cause
TokenNotFoundError No GitHub token found
TokenInvalidError Token rejected or expired
TokenPermissionError Missing models: read permission
ModelsDisabledError GitHub Models disabled for org/enterprise
ModelNotFoundError Invalid model ID
RateLimitError Too many requests

All exceptions include:

  • What was checked
  • Why it failed
  • How to fix it
Handling auth errors
from shared.plugins.model_provider.github_models.errors import (
    TokenNotFoundError,
    TokenInvalidError,
    TokenPermissionError,
    ModelsDisabledError,
)

try:
    provider.initialize(config)
except TokenNotFoundError as e:
    print(f"No token: {e}")
    # Includes instructions to create PAT
except TokenInvalidError as e:
    print(f"Bad token: {e}")
    # Identifies token type from prefix
except TokenPermissionError as e:
    print(f"Missing permission: {e}")
    # Includes SSO authorization steps
except ModelsDisabledError as e:
    print(f"Models disabled: {e}")
    # Shows admin settings URL
Example error message
TokenNotFoundError: No GitHub token found for authentication method: auto

Checked locations:
  - GITHUB_TOKEN: not set

To fix:
  1. Create a Personal Access Token (PAT) at https://github.com/settings/tokens
  2. For fine-grained PAT: select 'models: read' permission
  3. Set GITHUB_TOKEN=your-token

For GitHub Enterprise with SSO:
  - Fine-grained PATs are auto-authorized during creation
  - Classic PATs require manual SSO authorization per organization

Pricing

GitHub Models uses a token-unit pricing model. Usage is billed at $0.00001 per token unit.

Token Units

Token units are calculated by multiplying tokens by model-specific multipliers:

units = (input_tokens × input_mult) + (output_tokens × output_mult)
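Worked example of the formula above. The $0.00001-per-unit price comes from the pricing note; the 1x/4x multipliers are hypothetical placeholders, since the actual per-model multipliers are not listed here.

```python
PRICE_PER_UNIT = 0.00001  # USD per token unit, from the pricing note above

def token_units(input_tokens, output_tokens, input_mult, output_mult):
    """Apply the token-unit formula: units = in*in_mult + out*out_mult."""
    return input_tokens * input_mult + output_tokens * output_mult

# 1,000 input / 500 output tokens with hypothetical 1x/4x multipliers:
units = token_units(1000, 500, input_mult=1, output_mult=4)
cost = units * PRICE_PER_UNIT
print(f"{units} units -> ${cost:.4f}")  # 3000 units -> $0.0300
```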

Free Tier

All GitHub accounts have rate-limited free access to GitHub Models for prototyping and experimentation. Limits vary by:

  • Model (some models have lower free limits)
  • GitHub Copilot plan (higher limits with paid plans)

Paid Usage

Organizations must explicitly enable paid usage at:

https://github.com/organizations/{org}/settings/billing
Token usage tracking
# Get usage from last response
response = provider.send_message("Hello!")

usage = provider.get_token_usage()
print(f"Input tokens: {usage.prompt_tokens}")
print(f"Output tokens: {usage.output_tokens}")
print(f"Total tokens: {usage.total_tokens}")
Check context limits
# Context window varies by model
provider.connect("openai/gpt-4o")
limit = provider.get_context_limit()
print(f"Context: {limit:,} tokens")
# Context: 128,000 tokens

provider.connect("anthropic/claude-3.5-sonnet")
limit = provider.get_context_limit()
print(f"Context: {limit:,} tokens")
# Context: 200,000 tokens

Troubleshooting

No token found

  • Set GITHUB_TOKEN environment variable
  • Or pass api_key in ProviderConfig

401 Unauthorized

  • Verify token hasn't expired
  • Check token has models: read permission
  • For classic PATs with SSO: authorize at github.com/settings/tokens

GitHub Models is disabled

  • Enterprise admin must enable at enterprise settings
  • Or org admin at organization settings
  • Check that the enterprise policy is set to "Enabled" or "No policy"

Model not found

  • Use full model ID: provider/model-name
  • Examples: openai/gpt-4o, anthropic/claude-3.5-sonnet
  • Check model availability in your subscription tier

Rate limit exceeded

  • Wait for the retry period
  • Use a different model with higher limits
  • Upgrade GitHub Copilot plan
  • Enable paid usage for organization
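The "wait for the retry period" advice above can be automated with exponential backoff. A minimal sketch; `send_with_backoff` and the injectable `sleep` parameter are illustrative, and the local `RateLimitError` class stands in for the provider's exception listed in the Error Handling section.

```python
import time

class RateLimitError(Exception):
    """Stand-in for the provider's RateLimitError."""

def send_with_backoff(send, prompt, retries=4, base_delay=2.0, sleep=time.sleep):
    """Retry with exponential backoff when the rate limit trips."""
    for attempt in range(retries):
        try:
            return send(prompt)
        except RateLimitError:
            if attempt == retries - 1:
                raise  # out of retries, surface the error
            sleep(base_delay * (2 ** attempt))  # 2s, 4s, 8s, ...
```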
Debug checklist
# 1. Check token is set
echo $GITHUB_TOKEN | head -c 20

# 2. Test GitHub auth
gh api /user -q '.login'

# 3. Check org membership
gh api /user/orgs -q '.[].login'

# 4. Test models endpoint
curl -s -o /dev/null -w "%{http_code}" \
  -H "Authorization: Bearer $GITHUB_TOKEN" \
  https://models.inference.ai.azure.com/models
Test connectivity
# Quick test script
from shared import load_provider

provider = load_provider("github_models")

try:
    provider.initialize()
    provider.connect("openai/gpt-4o")

    provider.create_session()
    response = provider.send_message("Say hello")
    print(f"Success: {response.text}")
except Exception as e:
    print(f"Error: {e}")
List available models
# See what models are available
provider = load_provider("github_models")

# List all known models
models = provider.list_models()
for model in models:
    print(model)

# Filter by provider
openai = provider.list_models(prefix="openai/")
anthropic = provider.list_models(prefix="anthropic/")