Model Harness Architecture

A model harness is the complete runtime environment that wraps an AI model, transforming raw language model capabilities into a controlled, capable, and safe agentic system. JAATO's harness comprises three interconnected layers: Instructions, Tools, and Permissions.

[Figure: Model Harness Architecture infographic]

What is a Model Harness?

A model harness is the infrastructure that configures the model with context, instructions, and capabilities; mediates all interactions between the model and the external world; enforces safety boundaries and operational constraints; and tracks resource usage, actions, and outcomes.

Without a harness, a model can only generate text. With it, the model gains structured tool use, permission-gated actions, project-aware context, auditable execution, and configurable boundaries.

Challenge | Without Harness | With Harness
Capability | Text generation only | Execute code, edit files, search web
Context | No project knowledge | Understands codebase, conventions, goals
Safety | No control over actions | Permissions gate sensitive operations
Consistency | Unpredictable behavior | Instructions enforce consistent behavior
Accountability | No audit trail | All actions logged with metadata

The Three Harness Layers

JAATO's harness consists of three complementary layers, each serving a distinct purpose:

Layer 1: Instructions (The Mind)

Instructions shape the model's understanding, behavior, and decision-making. They define what the model knows.

  • Base system instructions — Behavioral rules from .jaato/system_instructions.md
  • Session-specific instructions — Programmatic customization per task
  • Plugin instructions — Tool usage guides contributed by each plugin
  • Framework constants — Task completion guidance, parallel tool hints
  • Prompt enrichment — Dynamic injection of references, templates, and memory
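
The way these sources combine can be sketched in a few lines. This is a minimal illustration only: `InstructionSources`, `assemble_system_prompt`, and `enrich_prompt` are hypothetical names, and the ordering of the parts is an assumption, not JAATO's documented behavior.

```python
from dataclasses import dataclass, field


@dataclass
class InstructionSources:
    """Hypothetical container for the instruction sources listed above."""
    base: str = ""       # e.g. the contents of .jaato/system_instructions.md
    session: str = ""    # per-task customization supplied programmatically
    plugins: dict[str, str] = field(default_factory=dict)  # plugin name -> usage guide
    framework: str = ""  # task completion guidance, parallel tool hints


def assemble_system_prompt(src: InstructionSources) -> str:
    """Join the non-empty sources in a fixed order (the order is an assumption)."""
    parts = [src.base, src.session]
    parts += [f"[{name}]\n{guide}" for name, guide in sorted(src.plugins.items())]
    parts.append(src.framework)
    return "\n\n".join(p.strip() for p in parts if p.strip())


def enrich_prompt(user_prompt: str, hints: list[str]) -> str:
    """Append dynamic hints (references, templates, memory) to a user prompt."""
    if not hints:
        return user_prompt
    return user_prompt + "\n\n" + "\n".join(f"(hint) {h}" for h in hints)
```

Sorting the plugin guides by name keeps the assembled prompt stable between runs, which makes prompt changes easier to review.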

Layer 2: Tools (The Hands)

Tools give the model the ability to affect the world beyond text generation. They define what the model can do.

  • Core tools — Always available (~14 tools, ~1,200 tokens): introspection, file reading, shell access, TODO system
  • Discoverable tools — Loaded on-demand via introspection (~85+ tools): file editing, web search, subagents, and more
  • MCP server tools — External integrations via the Model Context Protocol (dynamic count)
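
The split between core and discoverable tools amounts to deferred schema loading. The sketch below assumes a hypothetical `ToolRegistry` with illustrative token costs; it is not JAATO's actual registry API.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ToolSpec:
    name: str
    description: str
    schema_tokens: int  # illustrative estimate of the schema's context cost


@dataclass
class ToolRegistry:
    """Hypothetical registry: core tools are always active, others load on demand."""
    core: dict[str, ToolSpec] = field(default_factory=dict)
    discoverable: dict[str, ToolSpec] = field(default_factory=dict)
    loaded: set[str] = field(default_factory=set)

    def list_discoverable(self) -> list[str]:
        """Introspection returns names only, so full schemas stay out of context."""
        return sorted(self.discoverable)

    def load(self, name: str) -> ToolSpec:
        """Activate one discoverable tool; raises KeyError for unknown names."""
        spec = self.discoverable[name]
        self.loaded.add(name)
        return spec

    def active_tools(self) -> list[ToolSpec]:
        on_demand = [self.discoverable[n] for n in sorted(self.loaded)]
        return list(self.core.values()) + on_demand

    def schema_cost(self) -> int:
        """Total context tokens spent on the schemas that are currently active."""
        return sum(tool.schema_tokens for tool in self.active_tools())
```

Only the tools the model has explicitly loaded add to `schema_cost()`, which is the property that keeps the always-on footprint small.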

Layer 3: Permissions (The Guardrails)

Permissions ensure the model's capabilities are exercised safely and with appropriate oversight. They define what the model is allowed to do.

  • Auto-approved tools — Safe, read-only operations pass without prompting
  • Policy evaluation — Deterministic pipeline: Sanitization → Blacklist → Whitelist → Default
  • User approval channels — Interactive console, webhook, or file-based approval
  • Suspension scopes — Turn, idle, and session-wide approval scopes
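
The policy evaluation order can be illustrated with a short sketch. The `Decision` enum, the `sanitize` rule, and the glob-style patterns are assumptions made for illustration; only the stage order (Sanitization, Blacklist, Whitelist, Default) comes from the list above.

```python
from enum import Enum
from fnmatch import fnmatchcase


class Decision(Enum):
    ALLOW = "allow"
    DENY = "deny"
    ASK = "ask"  # hand the request to a user approval channel


def sanitize(command: str) -> bool:
    """Assumed sanitization rule: reject empty input, NUL bytes, and newlines."""
    return bool(command.strip()) and "\x00" not in command and "\n" not in command


def evaluate(command: str, blacklist: list[str], whitelist: list[str],
             default: Decision = Decision.ASK) -> Decision:
    """Run the stages in order; the first stage that matches decides."""
    if not sanitize(command):
        return Decision.DENY
    if any(fnmatchcase(command, pattern) for pattern in blacklist):
        return Decision.DENY
    if any(fnmatchcase(command, pattern) for pattern in whitelist):
        return Decision.ALLOW
    return default
```

Because the blacklist is checked before the whitelist, a command such as `git push --force` is denied by a `git push*` blacklist entry even when a broader `git *` whitelist entry also matches. The same input always yields the same decision, which is what makes the pipeline deterministic.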

Layer Interactions

The layers are not independent — they form a coordinated system:

Interaction | Example
Instructions → Tools | Plugin instructions teach the model how to use each tool
Instructions → Permissions | Base instructions can mandate permission-seeking behavior
Tools → Permissions | Each tool call is gated by the permission system
Permissions → Instructions | Permission decisions inject metadata into tool results
Tools → Instructions | Tool schemas consume part of the model's context budget

Request Lifecycle

A typical request flows through all three layers. For example, when a user asks "Add logging to the authentication module", the harness handles the request as follows:

  1. Prompt enrichment (Instructions layer) — Injects references and memory hints
  2. Model generation — Guided by system instructions, decides to read auth files first
  3. Tool call: readFile (Tools layer) — File reading capability
  4. Permission check (Permissions layer) — readFile is auto-approved, no user prompt
  5. Model continues — Analyzes code, decides to update the file
  6. Tool call: updateFile (Tools layer) — File modification capability
  7. Permission check (Permissions layer) — updateFile requires explicit user approval; diff is displayed
  8. User approves — File is updated, permission metadata recorded
  9. Model responds — Confirms the change to the user
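
The tool-call and permission steps above can be sketched as a small loop. This is a simplification under stated assumptions: the model's decisions are replaced by a pre-planned list of calls, and `ToolCall`, `run_turn`, and `AUTO_APPROVED` are hypothetical names rather than JAATO's API.

```python
from dataclasses import dataclass
from typing import Callable

AUTO_APPROVED = {"readFile"}  # assumption: read-only tools skip the user prompt


@dataclass
class ToolCall:
    name: str
    args: dict


def run_turn(planned_calls: list[ToolCall],
             tools: dict[str, Callable],
             ask_user: Callable[[ToolCall], bool]) -> list[dict]:
    """Execute planned tool calls, gating each one through a permission check."""
    log = []
    for call in planned_calls:
        if call.name in AUTO_APPROVED:
            decision = "auto-approved"
        elif ask_user(call):
            decision = "user-approved"
        else:
            # Denied calls are recorded but never executed.
            log.append({"tool": call.name, "permission": "denied", "result": None})
            continue
        result = tools[call.name](**call.args)
        # The permission decision travels with the tool result as metadata.
        log.append({"tool": call.name, "permission": decision, "result": result})
    return log
```

Each log entry pairs the tool result with the permission decision that allowed or blocked it, mirroring the audit trail described in step 8.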

Safety vs Capability Trade-off

The harness operates on a spectrum from maximum safety to maximum autonomy. JAATO's default is a balanced position:

Dimension | Conservative | Balanced (Default) | Permissive
Tools | Core only | Core + discoverable on-demand | All loaded upfront
Permissions | Ask for everything | Auto-read, ask-write | Auto-all
Instructions | Extensive guardrails | Standard guidance | Minimal
Scope | Single approval | Turn/idle scopes | Session-wide

Harness Profiles

Three common profiles illustrate how the layers can be tuned:

Profile | Instructions | Tools | Permissions | Use Case
Supervised | Detailed behavioral constraints | Core only, discoverable disabled | Ask for every write | Sensitive production systems, learning scenarios
Collaborative | Standard guidance | Core + discoverable on-demand | Auto-read, ask-write with turn/idle scopes | General development, code review, refactoring
Autonomous | Minimal, goal-focused | All tools loaded upfront | Suspended for session | Trusted automation, batch processing, CI/CD
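
One way to picture the three profiles is as plain configuration records. The field names and the `needs_prompt` helper below are hypothetical and only encode the tool and permission columns of the table; they are not JAATO's configuration schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HarnessProfile:
    """Hypothetical encoding of the Tools and Permissions columns."""
    discoverable_tools: bool               # may tools be loaded on demand?
    preload_all_tools: bool                # are all schemas loaded upfront?
    ask_on_write: bool                     # do write operations need approval?
    approval_scopes: tuple[str, ...] = ()  # scopes that suppress repeat prompts


PROFILES = {
    "supervised": HarnessProfile(False, False, True),
    "collaborative": HarnessProfile(True, False, True, ("turn", "idle")),
    "autonomous": HarnessProfile(True, True, False),
}


def needs_prompt(profile: HarnessProfile, is_write: bool,
                 scope_granted: bool = False) -> bool:
    """Return True when the user should be asked before the operation runs."""
    if not is_write or not profile.ask_on_write:
        return False
    # A previously granted scope only helps if the profile offers scopes at all.
    return not (scope_granted and bool(profile.approval_scopes))
```

Under these assumptions, a collaborative session stops prompting for writes once a turn or idle scope has been granted, while a supervised session keeps asking for every write.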

Token Budget

The harness consumes context tokens. In a typical collaborative session, the overhead is approximately 3,700 tokens — about 3% of a 128K context window or 12% of a 32K window. Key optimization levers:

  • JAATO_DEFERRED_TOOLS=true (default) — saves ~7,000 tokens by loading tool schemas on demand
  • Minimal base instructions — saves ~300 tokens
  • Fewer plugins — saves ~200-800 per plugin
  • GC (garbage collection) — reclaims conversation tokens when context is full
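
The percentages above follow from simple division. The sketch below reproduces them, assuming that 128K and 32K mean 128,000 and 32,000 tokens; `overhead_share` is a hypothetical helper name.

```python
def overhead_share(overhead_tokens: int, context_window: int) -> float:
    """Return the harness overhead as a percentage of the context window."""
    return 100.0 * overhead_tokens / context_window


HARNESS_OVERHEAD = 3_700    # typical collaborative session, per the text above
DEFERRED_SAVINGS = 7_000    # tokens saved by JAATO_DEFERRED_TOOLS=true

for label, window in [("128K", 128_000), ("32K", 32_000)]:
    deferred = overhead_share(HARNESS_OVERHEAD, window)
    upfront = overhead_share(HARNESS_OVERHEAD + DEFERRED_SAVINGS, window)
    print(f"{label}: {deferred:.1f}% deferred, {upfront:.1f}% with all schemas upfront")
```

With deferred loading turned off, the same session would spend roughly a third of a 32K window on harness overhead, which is why that setting is the largest lever in the list.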