GitHub Models Provider

Access multiple AI models (GPT, Claude, Gemini, Llama, Mistral) through GitHub's unified Models API. Supports individual, organization, and enterprise billing with fine-grained access control.

Name github_models
SDK azure-ai-inference
Models GPT-4o, Claude 3.5, Gemini, Llama, Mistral
Max Context Up to 200K tokens (Claude), 128K (GPT-4o)

Available Models

Provider Models Context
OpenAI gpt-4o, gpt-4o-mini, o1-preview, o1-mini 128K
Anthropic claude-3.5-sonnet, claude-3-opus, claude-3-haiku 200K
Google gemini-1.5-pro, gemini-1.5-flash 1M
Meta llama-3.1-405b, llama-3.1-70b, llama-3.1-8b 128K
Mistral mistral-large, mistral-small 32K
Quick start
from shared import load_provider, ProviderConfig

provider = load_provider("github_models")

# Initialize with token
provider.initialize(ProviderConfig(
    api_key="github_pat_...",  # or from GITHUB_TOKEN env
))

# Connect to a model
provider.connect("openai/gpt-4o")

# Start chatting
provider.create_session(
    system_instruction="You are helpful."
)
response = provider.send_message("Hello!")
print(response.text)
With organization billing
provider.initialize(ProviderConfig(
    api_key="github_pat_...",
    extra={
        'organization': 'my-org',
        'enterprise': 'my-enterprise'
    }
))

Free Tier (Personal Use)

All GitHub accounts have free access to GitHub Models - no Copilot subscription or credit card required. Perfect for prototyping, learning, and personal projects.

Free Tier Rate Limits

Model Tier RPM RPD Tokens/Request
High (GPT-4o, Claude) 10 50 8K in / 4K out
Low (GPT-4o-mini, Llama) 15 150 8K in / 4K out
Embedding 15 150 64K in

RPM = Requests per minute, RPD = Requests per day

What You Get Free

  • No credit card - just a GitHub account
  • All models - GPT-4o, Claude, Gemini, Llama, Mistral
  • Playground UI - test at github.com/marketplace/models
  • REST API - programmatic access with your PAT
  • Per-model limits - each model has separate quota
Tip: Maximize Free Usage
Each model has its own rate limit. Use GPT-4o-mini (150/day) for simple tasks and save GPT-4o (50/day) for complex ones. Because quotas are per-model, you can make roughly 2,000 requests per day in total across all models.
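The per-model quotas above suggest a simple fallback pattern: when one model's daily quota is exhausted, retry the same prompt on the next model in a preference list. This is a hedged sketch, not provider code; `send_with_fallback` and the `send` callable are hypothetical, and the local `RateLimitError` class stands in for the provider's exception of the same name.

```python
# Preferred order: cheap/high-quota models first, GPT-4o saved for last.
FALLBACK_ORDER = [
    "openai/gpt-4o-mini",   # 150 requests/day
    "meta/llama-3.1-8b",    # low tier, separate quota
    "openai/gpt-4o",        # 50 requests/day
]

class RateLimitError(Exception):
    """Stand-in for the provider's RateLimitError."""

def send_with_fallback(send, prompt, models=FALLBACK_ORDER):
    """Try each model in turn; each model has its own free-tier quota."""
    last_err = None
    for model in models:
        try:
            return send(model, prompt)
        except RateLimitError as e:
            last_err = e  # quota exhausted for this model, try the next
    raise last_err
```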
Quick start (personal account)
1. Go to https://github.com/settings/tokens?type=beta
2. Click "Generate new token"
3. Name: "jaato-models"
4. Expiration: 90 days
5. Account permissions → Models → Read
6. Click "Generate token"
7. Copy the token
Set token and use
# Set environment variable
export GITHUB_TOKEN=github_pat_...your-token
from shared import load_provider

provider = load_provider("github_models")
provider.initialize()  # Uses GITHUB_TOKEN
provider.connect("openai/gpt-4o-mini")  # Higher free limit

provider.create_session()
response = provider.send_message("Hello!")
print(response.text)
Test with curl
curl -X POST \
  -H "Authorization: Bearer $GITHUB_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "openai/gpt-4o-mini",
    "messages": [{"role":"user","content":"Hi"}]
  }' \
  https://models.inference.ai.azure.com/chat/completions

Authentication

GitHub Models uses GitHub tokens for authentication. The recommended approach is a fine-grained Personal Access Token (PAT) with the models: read permission.

Token Type Prefix Enterprise SSO Best For
Fine-grained PAT github_pat_ Auto-authorized Recommended for all use cases
Classic PAT ghp_ Manual per-org Simpler setup, less secure
GitHub App ghs_ Supported Automation, CI/CD
OAuth token gho_ Supported User-facing apps
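The prefix column above can be checked programmatically, which is handy for diagnostics before making an API call. A minimal sketch; `token_type` is an illustrative helper, not part of the provider:

```python
# Prefixes from the token table above.
TOKEN_PREFIXES = {
    "github_pat_": "fine-grained PAT",
    "ghp_": "classic PAT",
    "ghs_": "GitHub App token",
    "gho_": "OAuth token",
}

def token_type(token: str) -> str:
    """Classify a GitHub token by its prefix."""
    for prefix, kind in TOKEN_PREFIXES.items():
        if token.startswith(prefix):
            return kind
    return "unknown"
```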

Fine-Grained PAT (Recommended)

Fine-grained PATs provide granular permissions and are automatically authorized for SSO-enabled organizations during creation.

  1. Go to GitHub Token Settings
  2. Click "Generate new token"
  3. Set name and expiration
  4. Under Account permissions → Models → select Read
  5. Click "Generate token"

Classic PAT

Classic PATs are simpler but less secure. No specific scope is needed for GitHub Models, but you must manually authorize for each SSO-enabled organization.

Enterprise SSO
For organizations with SAML SSO enabled, fine-grained PATs are automatically authorized during creation. Classic PATs require manual authorization at github.com/settings/tokens.
Create fine-grained PAT
1. Go to https://github.com/settings/tokens?type=beta
2. Click "Generate new token"
3. Set token name: "jaato-models"
4. Set expiration (e.g., 90 days)
5. Expand "Account permissions"
6. Find "Models" → Select "Read"
7. Click "Generate token"
8. Copy the token (starts with github_pat_)
Set environment variable
# In your .env file (add to .gitignore)
GITHUB_TOKEN=github_pat_...your-token

# Or export directly (session only)
export GITHUB_TOKEN=github_pat_...your-token
Verify token with gh CLI
# Check authentication works
gh api /user -q '.login'

# List available organizations
gh api /user/orgs -q '.[].login'
Test API access
# Test GitHub Models endpoint
curl -H "Authorization: Bearer $GITHUB_TOKEN" \
  https://models.inference.ai.azure.com/models

Enterprise Configuration

GitHub Models supports enterprise billing attribution and policy compliance. Usage can be tracked at the organization or enterprise level.

Organization Billing

Set the organization parameter to attribute API usage to a specific organization for billing purposes.

Enterprise Policy

Enterprise owners control GitHub Models access through settings at:

https://github.com/enterprises/{enterprise}/settings/models

Policy Effect
Enabled GitHub Models available for all orgs
No policy Org admins decide individually
Disabled GitHub Models blocked enterprise-wide
Models Disabled Error
If you receive 401 "GitHub Models is disabled", contact your enterprise administrator to enable GitHub Models in the enterprise settings.
Organization-attributed billing
from shared import load_provider, ProviderConfig

provider = load_provider("github_models")
provider.initialize(ProviderConfig(
    api_key="github_pat_...",
    extra={
        'organization': 'my-org',  # Bill to this org
    }
))

# API requests now attributed to my-org
Full enterprise config
provider.initialize(ProviderConfig(
    api_key="github_pat_...",
    extra={
        'organization': 'my-org',
        'enterprise': 'my-enterprise',
    }
))
Environment-based config
# .env file
GITHUB_TOKEN=github_pat_...
JAATO_GITHUB_ORGANIZATION=my-org
JAATO_GITHUB_ENTERPRISE=my-enterprise
# Auto-detects from environment
provider = load_provider("github_models")
provider.initialize()  # Uses env vars
Find your organization
# From GitHub URL
# https://github.com/my-org → organization = "my-org"

# Or via gh CLI
gh api /user/orgs -q '.[].login'

Environment Variables

Required

Variable Description
GITHUB_TOKEN GitHub PAT with models: read permission

Optional

Variable Description
JAATO_GITHUB_ORGANIZATION Organization for billing attribution
JAATO_GITHUB_ENTERPRISE Enterprise name (for context/debugging)
JAATO_GITHUB_ENDPOINT Override API endpoint URL

Default Endpoint

The default endpoint is https://models.inference.ai.azure.com. Override with JAATO_GITHUB_ENDPOINT for GitHub Enterprise Server or proxy configurations.

.env for individual use
# Minimal setup
GITHUB_TOKEN=github_pat_...your-token
.env for organization
# With organization billing
GITHUB_TOKEN=github_pat_...your-token
JAATO_GITHUB_ORGANIZATION=my-org
.env for enterprise
# Full enterprise setup
GITHUB_TOKEN=github_pat_...your-token
JAATO_GITHUB_ORGANIZATION=my-org
JAATO_GITHUB_ENTERPRISE=my-enterprise
Custom endpoint (GHES)
# For GitHub Enterprise Server
GITHUB_TOKEN=github_pat_...
JAATO_GITHUB_ENDPOINT=https://models.ghes.example.com

Error Handling

The provider uses fail-fast validation and throws specific exceptions with actionable error messages.

Exception Cause
TokenNotFoundError No GitHub token found
TokenInvalidError Token rejected or expired
TokenPermissionError Missing models: read permission
ModelsDisabledError GitHub Models disabled for org/enterprise
ModelNotFoundError Invalid model ID
RateLimitError Too many requests

All exceptions include:

  • What was checked
  • Why it failed
  • How to fix it
Handling auth errors
from shared.plugins.model_provider.github_models.errors import (
    TokenNotFoundError,
    TokenInvalidError,
    TokenPermissionError,
    ModelsDisabledError,
)

try:
    provider.initialize(config)
except TokenNotFoundError as e:
    print(f"No token: {e}")
    # Includes instructions to create PAT
except TokenInvalidError as e:
    print(f"Bad token: {e}")
    # Identifies token type from prefix
except TokenPermissionError as e:
    print(f"Missing permission: {e}")
    # Includes SSO authorization steps
except ModelsDisabledError as e:
    print(f"Models disabled: {e}")
    # Shows admin settings URL
Example error message
TokenNotFoundError: No GitHub token found for authentication method: auto

Checked locations:
  - GITHUB_TOKEN: not set

To fix:
  1. Create a Personal Access Token (PAT) at https://github.com/settings/tokens
  2. For fine-grained PAT: select 'models: read' permission
  3. Set GITHUB_TOKEN=your-token

For GitHub Enterprise with SSO:
  - Fine-grained PATs are auto-authorized during creation
  - Classic PATs require manual SSO authorization per organization

Pricing

GitHub Models uses a token-unit pricing model. Usage is billed at $0.00001 per token unit.

Token Units

Token units are calculated by multiplying tokens by model-specific multipliers:

units = (input_tokens × input_mult) + (output_tokens × output_mult)
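Worked example of the formula above. The $0.00001-per-unit price comes from the pricing note; the 1x/4x multipliers are hypothetical placeholders, since the actual per-model multipliers are not listed here.

```python
PRICE_PER_UNIT = 0.00001  # USD per token unit, from the pricing note above

def token_units(input_tokens, output_tokens, input_mult, output_mult):
    """Apply the token-unit formula: units = in*in_mult + out*out_mult."""
    return input_tokens * input_mult + output_tokens * output_mult

# 1,000 input / 500 output tokens with hypothetical 1x/4x multipliers:
units = token_units(1000, 500, input_mult=1, output_mult=4)
cost = units * PRICE_PER_UNIT
print(f"{units} units -> ${cost:.4f}")  # 3000 units -> $0.0300
```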

Free Tier

All GitHub accounts have rate-limited free access to GitHub Models for prototyping and experimentation. Limits vary by:

  • Model (some models have lower free limits)
  • GitHub Copilot plan (higher limits with paid plans)

Paid Usage

Organizations must explicitly enable paid usage at:

https://github.com/organizations/{org}/settings/billing
Token usage tracking
# Get usage from last response
response = provider.send_message("Hello!")

usage = provider.get_token_usage()
print(f"Input tokens: {usage.prompt_tokens}")
print(f"Output tokens: {usage.output_tokens}")
print(f"Total tokens: {usage.total_tokens}")
Check context limits
# Context window varies by model
provider.connect("openai/gpt-4o")
limit = provider.get_context_limit()
print(f"Context: {limit:,} tokens")
# Context: 128,000 tokens

provider.connect("anthropic/claude-3.5-sonnet")
limit = provider.get_context_limit()
print(f"Context: {limit:,} tokens")
# Context: 200,000 tokens

Troubleshooting

No token found

  • Set GITHUB_TOKEN environment variable
  • Or pass api_key in ProviderConfig

401 Unauthorized

  • Verify token hasn't expired
  • Check token has models: read permission
  • For classic PATs with SSO: authorize at github.com/settings/tokens

GitHub Models is disabled

  • Enterprise admin must enable at enterprise settings
  • Or org admin at organization settings
  • Check that the enterprise policy is set to "Enabled" or "No policy"

Model not found

  • Use full model ID: provider/model-name
  • Examples: openai/gpt-4o, anthropic/claude-3.5-sonnet
  • Check model availability in your subscription tier

Rate limit exceeded

  • Wait for the retry period
  • Use a different model with higher limits
  • Upgrade GitHub Copilot plan
  • Enable paid usage for organization
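The "wait for the retry period" advice above can be automated with exponential backoff. A minimal sketch; `send_with_backoff` and the injectable `sleep` parameter are illustrative, and the local `RateLimitError` class stands in for the provider's exception listed in the Error Handling section.

```python
import time

class RateLimitError(Exception):
    """Stand-in for the provider's RateLimitError."""

def send_with_backoff(send, prompt, retries=4, base_delay=2.0, sleep=time.sleep):
    """Retry with exponential backoff when the rate limit trips."""
    for attempt in range(retries):
        try:
            return send(prompt)
        except RateLimitError:
            if attempt == retries - 1:
                raise  # out of retries, surface the error
            sleep(base_delay * (2 ** attempt))  # 2s, 4s, 8s, ...
```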
Debug checklist
# 1. Check token is set
echo $GITHUB_TOKEN | head -c 20

# 2. Test GitHub auth
gh api /user -q '.login'

# 3. Check org membership
gh api /user/orgs -q '.[].login'

# 4. Test models endpoint
curl -s -o /dev/null -w "%{http_code}" \
  -H "Authorization: Bearer $GITHUB_TOKEN" \
  https://models.inference.ai.azure.com/models
Test connectivity
# Quick test script
from shared import load_provider

provider = load_provider("github_models")

try:
    provider.initialize()
    provider.connect("openai/gpt-4o")

    provider.create_session()
    response = provider.send_message("Say hello")
    print(f"Success: {response.text}")
except Exception as e:
    print(f"Error: {e}")
List available models
# See what models are available
provider = load_provider("github_models")

# List all known models
models = provider.list_models()
for model in models:
    print(model)

# Filter by provider
openai = provider.list_models(prefix="openai/")
anthropic = provider.list_models(prefix="anthropic/")