Server-Client Event Protocol

JAATO uses a server-first architecture where all communication between server and client flows through a typed event protocol — a set of 40+ JSON-serializable dataclasses representing every meaningful state change. This design decouples the agentic runtime from UI concerns, enabling multiple clients to observe the same session simultaneously.


Why an Event Protocol?

Without events, the UI is tightly coupled to the runtime: only one client can attach, the UI blocks the runtime, and neither remote clients nor reconnection are possible. With events, the server runs as a daemon and emits semantic events, while clients decide how to render them. The table below lists the properties this enables and how each is achieved:

Principle	How Achieved
Decoupling	Server emits semantic events; clients decide how to render them
Multi-client	Session manager broadcasts events to all attached clients
Reconnection	`emit_current_state()` replays full state on reconnect
Forward compatibility	Unknown fields are filtered during deserialization
Thread safety	Events queued via `call_soon_threadsafe()` from model threads
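
The forward-compatibility row can be illustrated with a minimal sketch of tolerant deserialization. ExampleEvent and from_dict_tolerant are hypothetical names chosen for illustration, not part of the JAATO SDK; the actual implementation may differ.

```python
from dataclasses import dataclass, fields
from typing import Any


@dataclass
class ExampleEvent:
    # Hypothetical event shape, not one of the real JAATO dataclasses.
    type: str
    text: str = ""


def from_dict_tolerant(cls, payload: dict[str, Any]):
    """Build a dataclass instance, ignoring keys the class does not declare."""
    known = {f.name for f in fields(cls)}
    return cls(**{k: v for k, v in payload.items() if k in known})


# A newer server adds a "priority" field; an older client simply drops it.
event = from_dict_tolerant(
    ExampleEvent, {"type": "example", "text": "hi", "priority": 3}
)
```

Filtering on the declared field names means a client built against an older protocol version keeps working when the server starts sending extra fields.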

Event Categories

The 40+ event types are organized into functional categories:

Category	Count	Direction	Purpose
Connection	2	S→C	Client connect/disconnect lifecycle
Agent Lifecycle	4	S→C	Agent creation, output streaming, status, completion
Tool Execution	3	S→C	Tool start, live output, end with duration
Permission Flow	4	S↔C	Request, input mode, response, resolution
Clarification Flow	5	S↔C	Multi-question clarification sessions
Plan Management	2	S→C	Plan creation, step updates, completion
Context & Tokens	4	S→C	Token usage, budget breakdown, turn progress
System Messages	5	S→C	Info, errors, help, init progress, retries
Client Requests	8	C→S	Messages, commands, config, history
Agent Profiles	1	S→C	Available profile listing for session creation
Workspace Config	8	S↔C	Workspace list, create, select, configure

Connection Lifecycle

When a client connects, it receives a ConnectedEvent carrying the protocol version and server_info. The latter includes server_version, the server's package version from pyproject.toml, which clients can compare against their own minimum to refuse connections to obsolete servers. A SessionInfoEvent follows, containing the full state snapshot: session metadata, available tools, model list, and command history. If the session was created with an agent profile, the SessionInfoEvent also includes a profile_name field identifying the active profile. Because the snapshot is complete, the client can fully initialize without additional requests.
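
A client-side version check might look like the sketch below. It assumes server_info is available as a dict with a server_version key holding a plain dotted numeric version such as "0.3.1"; MIN_SERVER_VERSION and both function names are hypothetical.

```python
MIN_SERVER_VERSION = "0.3.0"  # hypothetical client-side minimum


def parse_version(version: str) -> tuple[int, ...]:
    """Turn '1.4.2' into (1, 4, 2). Pre-release suffixes are not handled."""
    return tuple(int(part) for part in version.split("."))


def server_is_supported(server_info: dict) -> bool:
    """Return True if the reported server version meets the client minimum."""
    reported = server_info.get("server_version")
    if reported is None:
        # Treat a missing version as unsupported (an assumption of this sketch).
        return False
    return parse_version(reported) >= parse_version(MIN_SERVER_VERSION)
```

Comparing tuples of integers rather than raw strings avoids ordering "0.10.0" before "0.3.0".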

On reconnection (e.g., after a network drop), the server calls emit_current_state(), which replays the relevant state (session info, tracked agents and their status, and instruction budgets) and clears any stale pending permission or clarification requests.

Agent Lifecycle Events

Each agent (main or subagent) follows a lifecycle represented by four event types:

AgentCreatedEvent — Agent registered with ID, type, optional profile name and icon
AgentOutputEvent — Streaming text chunks with source (model, tool, system), text, and mode (write, append, or flush). See Streaming Modes below.
AgentStatusChangedEvent — Transitions between active, idle, done, error. For the main agent, status="done" or "idle" is the completion signal (see Common Pitfalls).
AgentCompletedEvent — Final summary with token usage and turns used. Only emitted for subagents, not the main agent.

Tool Execution Events

Tool execution emits three event types, correlated via call_id for parallel execution:

ToolCallStartEvent — Tool name, arguments, call ID
ToolOutputEvent — Live output chunks (tail-style streaming for long-running tools)
ToolCallEndEvent — Success/failure, duration in seconds, error message if failed

During parallel execution, multiple ToolCallStartEvents are emitted concurrently, and their ToolOutputEvent/ToolCallEndEvent events may interleave. The call_id allows clients to correctly associate events with their originating tool call.
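
One way a client could keep interleaved tool events apart is to key all per-call state by call_id, as in this sketch. ToolCallTracker and ToolCallState are hypothetical helpers, and the method arguments stand in for whatever fields the real event objects expose.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ToolCallState:
    tool_name: str
    output: list[str] = field(default_factory=list)
    success: Optional[bool] = None  # None while the call is still running


class ToolCallTracker:
    """Groups interleaved tool events by call_id."""

    def __init__(self) -> None:
        self.calls: dict[str, ToolCallState] = {}

    def on_start(self, call_id: str, tool_name: str) -> None:
        self.calls[call_id] = ToolCallState(tool_name=tool_name)

    def on_output(self, call_id: str, chunk: str) -> None:
        # Ignore output for unknown calls (e.g. events missed while disconnected).
        state = self.calls.get(call_id)
        if state is not None:
            state.output.append(chunk)

    def on_end(self, call_id: str, success: bool) -> None:
        state = self.calls.get(call_id)
        if state is not None:
            state.success = success

    def running(self) -> list[str]:
        """Return the call IDs that have started but not yet ended."""
        return [cid for cid, s in self.calls.items() if s.success is None]
```

A client would call on_start, on_output, and on_end from its handlers for ToolCallStartEvent, ToolOutputEvent, and ToolCallEndEvent respectively.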

Permission & Clarification Flows

Permission is a four-event request-response cycle:

PermissionRequestedEvent (S→C) — Contains tool name, args, formatted prompt lines, format hint (e.g., "diff"), and available response options
PermissionInputModeEvent (S→C) — Signals client to switch input to permission mode
PermissionResponseRequest (C→S) — User's decision (y/n/a/t/i/all)
PermissionResolvedEvent (S→C) — Final grant/deny with method used

Clarification follows a similar pattern but supports multiple sequential questions, each with options for single or multi-choice answers.

Agent Profile Events

Agent profiles allow sessions to be created with predefined configurations (model, provider, tools, system instructions, GC settings). Profile-related events enable clients to discover and display available profiles.

SessionProfilesEvent (S→C) — Lists available agent profiles from the workspace's .jaato/profiles/ directory. Each profile entry includes name, description, model, provider, and icon_name. Emitted in response to a session.profiles command.

SessionInfoEvent also carries an optional profile_name field (str | None). When a session is created with a profile (via create_session(profile="...")), this field identifies the active profile, which clients can use to display the profile name or icon in the session header.

IPC Commands

Command	Args	Response Event
`session.new`	`[name] [--profile <name>]`	`SessionInfoEvent` (with `profile_name` if profile used)
`session.profiles`	none	`SessionProfilesEvent`

Transport Layers

Events travel over two transport options:

Aspect	IPC (Unix Socket)	WebSocket
Protocol	Length-prefixed JSON (4-byte big-endian u32 + UTF-8)	Native WebSocket text frames
Max message	10 MB	Standard WS limits
Scope	Local machine only	Local or remote
Thread safety	`call_soon_threadsafe()`	`run_coroutine_threadsafe()`
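
The IPC framing in the table can be sketched with the standard struct module. This assumes the 4-byte length counts only the JSON payload and that "10 MB" means 10 × 1024 × 1024 bytes; encode_frame and decode_frames are hypothetical names, not SDK functions.

```python
import json
import struct

MAX_MESSAGE_BYTES = 10 * 1024 * 1024  # assumed interpretation of the 10 MB limit


def encode_frame(event: dict) -> bytes:
    """Serialize one event as a 4-byte big-endian length plus UTF-8 JSON."""
    body = json.dumps(event).encode("utf-8")
    if len(body) > MAX_MESSAGE_BYTES:
        raise ValueError("message exceeds the maximum frame size")
    return struct.pack(">I", len(body)) + body


def decode_frames(buffer: bytes) -> tuple[list[dict], bytes]:
    """Return every complete frame in buffer plus the unconsumed remainder."""
    events = []
    offset = 0
    while len(buffer) - offset >= 4:
        (length,) = struct.unpack_from(">I", buffer, offset)
        if length > MAX_MESSAGE_BYTES:
            raise ValueError("frame header exceeds the maximum frame size")
        end = offset + 4 + length
        if end > len(buffer):
            break  # partial frame: wait for more bytes
        events.append(json.loads(buffer[offset + 4:end].decode("utf-8")))
        offset = end
    return events, buffer[offset:]
```

Returning the leftover bytes lets a reader append the next socket read to them and call decode_frames again, which handles frames split across reads.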

Ordering Guarantees

Per-client FIFO — Events to a specific client maintain order
Broadcast consistency — All clients receive events in the same order
No batching — Each event is serialized and transmitted individually
At-most-once delivery — Disconnected clients miss events (recovered via emit_current_state)
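
As a rough illustration of per-client FIFO combined with the thread-safe handoff mentioned earlier, the sketch below feeds an asyncio.Queue from a worker thread via call_soon_threadsafe(). The queue-per-client layout is an assumption made for illustration, not a description of the server's internals.

```python
import asyncio
import threading


async def demo() -> list[int]:
    loop = asyncio.get_running_loop()
    queue = asyncio.Queue()  # stands in for one client's outgoing event queue

    def producer() -> None:
        # Runs on a worker thread (standing in for a model thread).
        # asyncio.Queue is not thread-safe, so each put is scheduled on the loop.
        for i in range(5):
            loop.call_soon_threadsafe(queue.put_nowait, i)

    thread = threading.Thread(target=producer)
    thread.start()
    # Consume on the event loop; items arrive in the order they were scheduled.
    received = [await queue.get() for _ in range(5)]
    thread.join()
    return received


result = asyncio.run(demo())
```

Because callbacks scheduled from a single thread run in the order they were submitted, the consumer sees the events in emission order.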

Streaming Modes (write / append / flush)

The mode field on AgentOutputEvent controls how the client should handle each chunk:

Mode	`text`	Meaning
`"write"`	non-empty	Start a new output block. Previous block (if any) is finalized.
`"append"`	non-empty	Continue appending to the current output block.
`"flush"`	empty	Streaming text is done. Finalize buffered output now — tool calls are about to start.

The "flush" signal is the only indication, before the turn itself ends, that model text streaming has finished. There is no separate "StreamEndEvent". The session emits flush immediately before executing tool calls, giving clients a synchronization point to:

Finalize and render buffered text as one piece
Transition the UI from "streaming text" to "executing tools" state
Separate text output from tool output in non-streaming UIs (e.g., Telegram, Slack)

Important: If the model responds with text only (no tool calls), no flush is emitted — the next event is TurnCompletedEvent directly. Clients must also flush their buffers on TurnCompletedEvent.
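
The three modes, together with the text-only case, can be modeled as a small block assembler. This is a sketch under the semantics described in the table; OutputBlocks is a hypothetical helper that takes plain (mode, text) pairs rather than real AgentOutputEvent objects.

```python
class OutputBlocks:
    """Collects finalized text blocks from (mode, text) chunks."""

    def __init__(self) -> None:
        self.finalized: list[str] = []
        self._current: list[str] = []

    def _finalize(self) -> None:
        # Skip empty blocks so a flush with no preceding text emits nothing.
        if self._current:
            self.finalized.append("".join(self._current))
            self._current = []

    def feed(self, mode: str, text: str) -> None:
        if mode == "write":
            self._finalize()  # a new block closes the previous one
            self._current.append(text)
        elif mode == "append":
            self._current.append(text)
        elif mode == "flush":
            self._finalize()  # control signal only; its text is empty

    def turn_completed(self) -> None:
        # Text-only responses never send "flush", so finalize here as well.
        self._finalize()
```

A client would call feed() for each output chunk and turn_completed() when it sees TurnCompletedEvent, then render whatever has accumulated in finalized.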

Canonical Event Sequence Within a Single Model Response

AgentStatusChangedEvent(status="active")            ← agent starts processing
AgentOutputEvent(source="model", mode="write")      ← new text block
AgentOutputEvent(source="model", mode="append")     ← more chunks...
AgentOutputEvent(source="model", mode="append")     ← ...
AgentOutputEvent(source="system", text="", mode="flush")  ← text done, tools next
ToolCallStartEvent(tool_name="...")                  ← tool execution begins
ToolCallEndEvent(...)
...                                                  ← more tools if parallel
TurnProgressEvent(...)                               ← token accounting
— model may loop back (text → flush → tools) if tool results trigger more output —
TurnCompletedEvent(...)                              ← turn fully done (NOT terminal)
ContextUpdatedEvent(...)                             ← cumulative token usage
AgentStatusChangedEvent(status="done"|"idle")        ← ✅ TERMINAL for main agent

Client Implementation Guide (Output Buffering)

Custom clients (Telegram bots, Slack integrations, web UIs, etc.) that cannot render incremental streaming must buffer output and emit it in discrete blocks. This section describes the canonical buffering pattern.

The Problem

The server emits AgentOutputEvent chunks as they stream from the model — potentially dozens per second. Clients like Telegram cannot update a message per chunk. They need to know when text is done so they can send one complete message, followed by tool call information.

Buffering Pattern

from jaato_sdk.events import (
    AgentOutputEvent, AgentCompletedEvent,
    AgentStatusChangedEvent,
    ToolCallStartEvent, ToolCallEndEvent,
    TurnCompletedEvent, PermissionInputModeEvent,
)

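# `client`, `send_message`, and `send_tool_summary` are supplied by your
# application. This loop must run inside an async function.
text_buffer: list[str] = []
tool_calls: list[dict] = []

async for event in client.events():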

    # --- Flush signal (check BEFORE source filtering) ---
    # Flush is emitted as source="system", mode="flush", text=""
    if isinstance(event, AgentOutputEvent) and event.mode == "flush":
        # Model text is done — emit buffered text now
        if text_buffer:
            send_message("".join(text_buffer))
            text_buffer.clear()

    # --- Model text streaming ---
    elif isinstance(event, AgentOutputEvent) and event.source == "model":
        if event.mode in ("write", "append"):
            text_buffer.append(event.text)

    # --- Tool execution ---
    elif isinstance(event, ToolCallStartEvent):
        tool_calls.append({"name": event.tool_name, "args": event.tool_args})

    elif isinstance(event, ToolCallEndEvent):
        # Update tool status, show summary, etc.
        pass

    # --- Permission requests ---
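    elif isinstance(event, PermissionInputModeEvent):
        # Show permission UI and collect the user's decision.
        # prompt_user_for_permission is a placeholder for your own UI code.
        user_choice = await prompt_user_for_permission(event)
        await client.respond_to_permission(
            request_id=event.request_id,
            response=user_choice,  # "y", "n", "a", "never", etc.
        )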

    # --- Turn completed (NOT terminal — do NOT break here) ---
    elif isinstance(event, TurnCompletedEvent):
        # Flush any remaining text (text-only responses skip "flush")
        if text_buffer:
            send_message("".join(text_buffer))
            text_buffer.clear()
        # Show tool call summary if desired
        if tool_calls:
            send_tool_summary(tool_calls)
            tool_calls.clear()
        # Continue looping — multi-turn flows emit multiple TurnCompletedEvents

    # --- Agent status changed (TERMINAL for main agent) ---
    elif isinstance(event, AgentStatusChangedEvent):
        if event.status in ("done", "idle"):
            # Main agent finished — "done" = all work complete,
            # "idle" = waiting for next user input.
            # Both mean the current response is finished.
            if text_buffer:
                send_message("".join(text_buffer))
                text_buffer.clear()
            break

    # --- Agent completed (TERMINAL for subagents) ---
    elif isinstance(event, AgentCompletedEvent):
        # Only emitted for subagents, not the main agent.
        # Kept as a safety net.
        if text_buffer:
            send_message("".join(text_buffer))
            text_buffer.clear()
        break

Key Rules

Always flush on TurnCompletedEvent — text-only responses (no tool calls) skip the "flush" signal and go straight to turn completion.
mode="flush" has empty text — don't append it to the buffer. It's a control signal, not content.
Multiple flush cycles per turn — a turn with tool calls may loop: text → flush → tools → text → flush → tools → turn completed. Reset your text buffer on each flush.
source filtering matters — buffer source="model" text. Other sources ("system", "tool", plugin names) carry different content (tool output, system messages) that may need separate handling.

Common Pitfalls

These are real bugs encountered in production client implementations:

Flush source is "system", not "model" — The SDK emits flush as on_output("system", "", "flush"). If your client checks source == "model" before checking mode == "flush", the flush signal is silently dropped and text is never finalized before tool execution. Always check mode == "flush" before filtering on source.
TurnCompletedEvent is NOT terminal — In multi-turn agentic flows (model responds → calls tool → model responds again), multiple TurnCompletedEvents are emitted before the response is complete. Do not break your event loop on TurnCompletedEvent.
The main agent's completion signal is AgentStatusChangedEvent(status="done"|"idle"), NOT AgentCompletedEvent — The server only emits AgentCompletedEvent for subagents. For the main agent, AgentStatusChangedEvent with status="done" (all work complete) or status="idle" (waiting for next user input) is the terminal event. If your event loop only breaks on AgentCompletedEvent, it will hang forever on main agent interactions.
Guard against empty text before sending — Flush can fire before any model text arrives (when the model's first action is a tool call with no preamble). If your client sends/edits a message on every flush, an empty accumulated buffer will cause errors (e.g., Telegram rejects empty messages). Check that accumulated text is non-empty before sending.
Permission events can arrive without preceding model text — The model may invoke a tool immediately without saying anything first. Your client should handle PermissionInputModeEvent arriving before any AgentOutputEvent(source="model") — don't assume there is always text to flush before a permission placeholder.
