IPC Connection Recovery

When a client connects to the JAATO server via IPC, the connection may be interrupted by server restarts, crashes, or network issues. The IPC recovery mechanism provides automatic reconnection with exponential backoff, preserving the user's session and conversation state.

Practical integration guide: For a step-by-step guide on using IPCRecoveryClient in your client code, see the Connection Recovery Guide.
[Figure: IPC recovery infographic]

Connection State Machine

The recovery system is modeled as a state machine with six states:

| State | Description |
| --- | --- |
| DISCONNECTED | Initial state, or recovery gave up after max attempts |
| CONNECTING | Attempting initial connection to the server |
| CONNECTED | Active connection, events flowing normally |
| RECONNECTING | Connection lost, automatic recovery in progress |
| DISCONNECTING | Graceful disconnect initiated by client |
| CLOSED | Terminal state, no more connection attempts |

Key Transitions

| From | To | Trigger |
| --- | --- | --- |
| DISCONNECTED | CONNECTING | connect() called |
| CONNECTING | CONNECTED | Successful handshake |
| CONNECTED | RECONNECTING | Connection lost (reset, timeout, etc.) |
| RECONNECTING | CONNECTING | Backoff timer fires |
| RECONNECTING | CLOSED | Max attempts exceeded |
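
The key transitions above can be sketched as a small lookup table. This is an illustrative model only, not the IPCRecoveryClient implementation: the ConnectionState enum, the trigger strings, and next_state() are hypothetical names chosen for this example, and only the transitions listed in the table are encoded.

```python
from enum import Enum, auto


class ConnectionState(Enum):
    """The six states from the state table (enum name is hypothetical)."""
    DISCONNECTED = auto()
    CONNECTING = auto()
    CONNECTED = auto()
    RECONNECTING = auto()
    DISCONNECTING = auto()
    CLOSED = auto()


# Only the key transitions documented above. Trigger names are made up
# for this sketch; the real client presumably allows further transitions
# (for example, into DISCONNECTING).
KEY_TRANSITIONS = {
    (ConnectionState.DISCONNECTED, "connect_called"): ConnectionState.CONNECTING,
    (ConnectionState.CONNECTING, "handshake_ok"): ConnectionState.CONNECTED,
    (ConnectionState.CONNECTED, "connection_lost"): ConnectionState.RECONNECTING,
    (ConnectionState.RECONNECTING, "backoff_elapsed"): ConnectionState.CONNECTING,
    (ConnectionState.RECONNECTING, "max_attempts_exceeded"): ConnectionState.CLOSED,
}


def next_state(state, trigger):
    """Return the state reached from `state` on `trigger`, or raise ValueError."""
    try:
        return KEY_TRANSITIONS[(state, trigger)]
    except KeyError:
        raise ValueError(f"no transition from {state.name} on {trigger!r}") from None
```

Keeping the transitions in a single dictionary makes illegal moves fail loudly, for example any trigger applied to the terminal CLOSED state.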

Exponential Backoff

The recovery mechanism uses exponential backoff with jitter to avoid thundering herd problems when multiple clients reconnect simultaneously:

delay = min(max_delay, base_delay * 2^(attempt - 1))
jitter = delay * jitter_factor * random(-1, 1)
final_delay = max(0.1, delay + jitter)

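With the default configuration (base_delay=1.0, max_delay=60.0, jitter_factor=0.3), the delays are as follows. Jitter is applied after the cap, so the final delay can exceed max_delay by up to 30%: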

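| Attempt | Base Delay | With Jitter (±30%) |
| --- | --- | --- |
| 1 | 1.0s | 0.7s – 1.3s |
| 2 | 2.0s | 1.4s – 2.6s |
| 3 | 4.0s | 2.8s – 5.2s |
| 4 | 8.0s | 5.6s – 10.4s |
| 5 | 16.0s | 11.2s – 20.8s |
| 6 | 32.0s | 22.4s – 41.6s |
| 7+ | 60.0s (capped) | 42.0s – 78.0s |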
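
As a rough check of the formula and the table, the calculation can be written out in Python. This is a minimal sketch, not the server's or client's actual code: the function names backoff_delay and delay_bounds and the rng parameter are assumptions made for this example, while the formula and defaults are taken from the text above.

```python
import random


def backoff_delay(attempt, base_delay=1.0, max_delay=60.0,
                  jitter_factor=0.3, rng=random):
    """Delay in seconds before reconnection attempt `attempt` (1-based)."""
    # Exponential growth, capped at max_delay.
    delay = min(max_delay, base_delay * 2 ** (attempt - 1))
    # Symmetric jitter of up to +/- jitter_factor of the capped delay.
    jitter = delay * jitter_factor * rng.uniform(-1.0, 1.0)
    # Never wait less than 0.1 seconds.
    return max(0.1, delay + jitter)


def delay_bounds(attempt, base_delay=1.0, max_delay=60.0, jitter_factor=0.3):
    """Smallest and largest delay that backoff_delay can return."""
    delay = min(max_delay, base_delay * 2 ** (attempt - 1))
    low = max(0.1, delay * (1 - jitter_factor))
    high = max(0.1, delay * (1 + jitter_factor))
    return low, high


if __name__ == "__main__":
    for attempt in range(1, 8):
        low, high = delay_bounds(attempt)
        print(f"attempt {attempt}: {low:.1f}s to {high:.1f}s")
```

Passing a seeded random.Random instance as rng makes the delays reproducible in tests, while the default uses the shared module-level generator.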

Configuration

Recovery behavior is configurable via JSON files and environment variables. Configuration is loaded and merged in precedence order: environment variables > project config (.jaato/client.json) > user config (~/.jaato/client.json) > built-in defaults.

| Option | Default | Description |
| --- | --- | --- |
| enabled | true | Enable automatic reconnection |
| max_attempts | 10 | Maximum reconnection attempts before giving up |
| base_delay | 1.0 | Initial backoff delay in seconds |
| max_delay | 60.0 | Maximum backoff delay (caps exponential growth) |
| jitter_factor | 0.3 | Random jitter range (0.3 = ±30%) |
| connection_timeout | 5.0 | Timeout for each connection attempt |
| reattach_session | true | Auto-reattach to previous session after reconnect |
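
The precedence order can be illustrated with a layered merge. This sketch makes several assumptions that the text above does not confirm: each layer is treated as a flat dictionary keyed by the option names in the table, the environment layer is passed in already parsed (the environment variable names are not given here), and merge_config is a hypothetical helper name.

```python
# Built-in defaults, copied from the options table.
DEFAULTS = {
    "enabled": True,
    "max_attempts": 10,
    "base_delay": 1.0,
    "max_delay": 60.0,
    "jitter_factor": 0.3,
    "connection_timeout": 5.0,
    "reattach_session": True,
}


def merge_config(user=None, project=None, env=None):
    """Merge layers so that env > project > user > built-in defaults."""
    merged = dict(DEFAULTS)
    # Apply layers from lowest to highest precedence; later layers win.
    for layer in (user, project, env):
        for key, value in (layer or {}).items():
            if key in DEFAULTS:  # ignore unknown keys in this sketch
                merged[key] = value
    return merged


if __name__ == "__main__":
    config = merge_config(
        user={"max_attempts": 5, "base_delay": 2.0},
        project={"max_attempts": 20},
        env={"base_delay": 0.5},
    )
    # project overrides user for max_attempts; env overrides user for base_delay
    print(config["max_attempts"], config["base_delay"])
```

Options that no layer sets, such as max_delay in the usage above, fall through to the built-in defaults.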

Session Preservation

After successful reconnection, the client sends a session.attach command with the stored session ID. The server loads the session from disk (if evicted from memory), sends a SessionInfoEvent with full state, and the client continues normal operation.

What is preserved

  • Session ID for reattachment
  • Conversation history (persisted on server disk)
  • Tool states (managed by server)

What is lost

  • Active IPC connection (replaced by new connection)
  • In-flight requests (pending permission responses, etc.)
  • Real-time event stream (restarted after reconnect)
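
On the client side, the split between preserved and lost state might be tracked as follows. This is a hedged sketch only: RecoveryContext, its fields, and the dictionary returned by attach_command are hypothetical, and the actual wire format of session.attach is not specified in this section.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class RecoveryContext:
    """Client-side state that spans reconnects (hypothetical structure)."""
    session_id: Optional[str] = None                       # survives reconnects
    pending: Dict[str, str] = field(default_factory=dict)  # request id -> description

    def on_connection_lost(self) -> List[str]:
        """Drop in-flight requests and return their ids for reporting."""
        dropped = list(self.pending)
        self.pending.clear()
        return dropped

    def attach_command(self, reattach_session: bool = True) -> Optional[dict]:
        """Build a reattach message, or None if there is nothing to reattach.

        The message shape is an assumption made for illustration.
        """
        if not reattach_session or self.session_id is None:
            return None
        return {"command": "session.attach", "session_id": self.session_id}
```

The session ID is the only thing the client needs to keep, since conversation history and tool states live on the server; anything still pending when the connection dropped has to be reissued or reported to the user.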

Error Classification

The recovery mechanism classifies errors to determine whether to retry:

| Category | Examples | Action |
| --- | --- | --- |
| Transient (will retry) | ConnectionRefusedError, ConnectionResetError, asyncio.TimeoutError | Retry with backoff |
| Permanent (no retry) | FileNotFoundError (socket deleted), "Permission denied", "Authentication failed" | Transition to CLOSED |
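
A classifier along these lines could look like the sketch below. It is not the actual recovery code: is_retryable is a hypothetical name, matching the quoted messages by case-insensitive substring is an assumption, and so is treating any error outside the table as permanent.

```python
import asyncio

# Exception types listed as transient in the table above.
TRANSIENT_TYPES = (ConnectionRefusedError, ConnectionResetError, asyncio.TimeoutError)

# Message fragments listed as permanent; substring matching is an assumption.
PERMANENT_MESSAGES = ("permission denied", "authentication failed")


def is_retryable(exc: BaseException) -> bool:
    """Return True for transient errors (retry with backoff), else False."""
    # A missing socket file is treated as permanent.
    if isinstance(exc, FileNotFoundError):
        return False
    message = str(exc).lower()
    if any(marker in message for marker in PERMANENT_MESSAGES):
        return False
    if isinstance(exc, TRANSIENT_TYPES):
        return True
    # Assumption: errors not covered by the table are not retried.
    return False
```

Permanent checks run first so that an error carrying a "Permission denied" message is never retried, whatever its exception type.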