Session Limits

Session limits protect you from runaway agents. They cap total steps, tool calls, and cost, and they detect loops, catching problems like the $47K recursive loop before they start.

The problem

Four LangChain agents looped for 11 days. Each individual call completed in under 200 ms and stayed under token limits. Monitoring said "SYSTEM NOMINAL." Total cost: $47,000.

Per-call validation can't catch loops. Each call is valid. The problem is the cumulative count — and no one was counting.

Defining session limits

Add session_limits to your session.yaml:

# session.yaml
schema_version: "1.0"
agent: my-agent

session_limits:
  max_steps: 20                        # Max LLM calls
  max_tool_calls: 50                   # Max tool calls across all steps
  max_cost_per_session: 10.00          # Dollar cap
  max_calls_per_tool:
    issue_refund: 1                    # Only 1 refund per session
    send_email: 5                      # Max 5 emails
  loop_detection:
    window: 5                          # Look at last 5 steps
    threshold: 3                       # Same tool+args 3x = blocked
  circuit_breaker:
    consecutive_blocks: 5              # Auto-kill after 5 consecutive blocks
    consecutive_errors: 3              # Auto-kill after 3 consecutive errors

session.yaml must compile successfully

Session limits defined in session.yaml take effect only if the file compiles successfully. If your session.yaml has a compilation error (e.g., invalid phases or undeclared transition targets), the limits are not silently ignored: create() throws ReplayConfigError and the session is blocked.

If you see replay_compile_error in your diagnostics, fix the compilation error first. Limits, phases, loop detection, and policy all depend on successful compilation.

Step and tool call limits

`max_steps`

Maximum number of LLM calls in the session. Checked before each call (Stage 2). Once the limit is reached, the next call is blocked with session_limit_exceeded.

session_limits:
  max_steps: 20

Counting: Uses totalStepCount which is monotonic — it counts committed steps and never decreases, even if older steps are evicted from memory for long sessions.
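
The monotonic counting described above can be illustrated with a small sketch. This is not the engine's implementation: the StepLog class and its maxRetained parameter are hypothetical names used only to show a counter that keeps growing while older steps are evicted.

```typescript
// Hypothetical model of a step log with eviction; only totalStepCount
// is a name taken from this page.
class StepLog {
  totalStepCount = 0;                  // monotonic: never decreases
  private steps: string[] = [];        // steps currently held in memory
  private readonly maxRetained: number;

  constructor(maxRetained: number) {
    this.maxRetained = maxRetained;
  }

  commit(step: string): void {
    this.steps.push(step);
    this.totalStepCount += 1;
    // Evict the oldest step from memory; the counter is left untouched.
    if (this.steps.length > this.maxRetained) this.steps.shift();
  }

  get retained(): number {
    return this.steps.length;
  }
}

const log = new StepLog(3);
for (let i = 1; i <= 5; i++) log.commit(`step-${i}`);
console.log(log.totalStepCount, log.retained); // 5 3
```

Because the limit compares against the counter rather than the number of retained steps, eviction cannot "refund" steps to a long-running session.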

`max_tool_calls`

Maximum total tool calls across all steps. A single LLM response can contain multiple tool calls — this counts each one.

session_limits:
  max_tool_calls: 50

`max_tool_calls_mode`

Controls what happens when max_tool_calls is reached. Default is "block".

| Mode | Behavior |
|------|----------|
| `block` | Hard-block the session. No more LLM calls. (default) |
| `narrow` | Narrow the tool set to only tools with remaining `max_calls_per_tool` budget. If none remain, block. |

session_limits:
  max_tool_calls: 15
  max_tool_calls_mode: narrow
  max_calls_per_tool:
    collect_forensic_image: 3
    containment_scan: 2

When to use narrow: Multi-phase workflows where the LLM fires multiple tools per turn, consuming the global budget faster than expected. Without narrow, tools reserved for later phases (like forensic collection) become unreachable once the global cap is hit — even if they've never been called.

How it works:

When totalToolCalls >= max_tool_calls, instead of blocking, the engine filters the visible tool set to only tools that have an explicit max_calls_per_tool entry with remaining budget
Tools without a max_calls_per_tool entry are excluded (no explicit budget = not reachable past the cap)
Stage 3 per-tool limits still apply after the LLM responds
If max_steps or max_cost_per_session is also exceeded, the session blocks regardless of mode
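
The filtering step above can be sketched roughly as follows. This is an illustrative model under stated assumptions, not the engine's code: the function name narrowToolSet, its parameter shapes, and the search_logs tool are hypothetical.

```typescript
// Hypothetical sketch of narrow-mode filtering once max_tool_calls is reached.
type Counts = Record<string, number>;

function narrowToolSet(
  visibleTools: string[],
  maxCallsPerTool: Counts, // session_limits.max_calls_per_tool
  toolCallCounts: Counts,  // calls made so far, per tool
): string[] {
  return visibleTools.filter((tool) => {
    const limit: number | undefined = maxCallsPerTool[tool];
    // No explicit per-tool budget: not reachable past the cap.
    if (limit === undefined) return false;
    // Keep the tool only while it has budget left.
    return (toolCallCounts[tool] ?? 0) < limit;
  });
}

const remaining = narrowToolSet(
  ["search_logs", "collect_forensic_image", "containment_scan"],
  { collect_forensic_image: 3, containment_scan: 2 },
  { search_logs: 15, containment_scan: 2 },
);
console.log(remaining); // ["collect_forensic_image"]
```

An empty result corresponds to the "if none remain, block" case.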

max_tool_calls becomes a soft cap in narrow mode

In narrow mode, total tool calls can exceed max_tool_calls by the sum of remaining per-tool budgets. In the example above, the real ceiling is 15 + 3 + 2 = 20 in the worst case. Set max_tool_calls with this overshoot in mind.
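
The worst-case arithmetic can be checked with a short helper. The function worstCaseToolCalls and its callsBeforeCap parameter are hypothetical and exist only to make the calculation explicit.

```typescript
// Hypothetical helper: upper bound on total tool calls in narrow mode.
function worstCaseToolCalls(
  maxToolCalls: number,
  maxCallsPerTool: Record<string, number>,
  callsBeforeCap: Record<string, number> = {}, // per-tool calls already made when the cap is hit
): number {
  let overshoot = 0;
  for (const [tool, limit] of Object.entries(maxCallsPerTool)) {
    // Only the unused part of each per-tool budget can be spent past the cap.
    overshoot += Math.max(0, limit - (callsBeforeCap[tool] ?? 0));
  }
  return maxToolCalls + overshoot;
}

const perTool = { collect_forensic_image: 3, containment_scan: 2 };
console.log(worstCaseToolCalls(15, perTool));                          // 20
console.log(worstCaseToolCalls(15, perTool, { containment_scan: 2 })); // 18
```

The ceiling of 20 applies only when none of the budgeted tools were called before the cap was reached; any earlier use of them lowers it.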

`max_calls_per_tool`

Per-tool call limits. Different tools can have different limits.

session_limits:
  max_calls_per_tool:
    issue_refund: 1          # At most 1 refund allowed
    send_email: 5            # Max 5 emails
    search_orders: 10        # Max 10 searches

Tools not listed have no per-tool limit. The issue_refund: 1 limit is a common pattern for non-idempotent operations that must run at most once per session (complements forbids_after).

Cost limits

`max_cost_per_session`

Dollar cap for the entire session. Computed from token usage reported by the provider.

session_limits:
  max_cost_per_session: 10.00

How it works:

Cost is tracked in actualCost — updated immediately after every LLM call, including blocked and retried calls
This is a soft cap — the call that pushes past the threshold already ran (it was billed). The next call is blocked.
actualCost includes all calls. totalCost only includes committed steps. Limits check actualCost.
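
The soft-cap behavior can be seen in a small simulation. The CostTracker class and the flat $4.00 per-call cost are assumptions made for illustration; only actualCost and max_cost_per_session come from this page.

```typescript
// Hypothetical model: the pre-check sees only cost that was already billed.
class CostTracker {
  actualCost = 0;
  private readonly maxCostPerSession: number;

  constructor(maxCostPerSession: number) {
    this.maxCostPerSession = maxCostPerSession;
  }

  // Runs before the LLM call.
  preCheck(): boolean {
    return this.actualCost < this.maxCostPerSession;
  }

  // Runs right after the LLM call, whatever its outcome.
  recordCall(cost: number): void {
    this.actualCost += cost;
  }
}

const tracker = new CostTracker(10.0);
const outcomes: string[] = [];
for (let i = 0; i < 4; i++) {
  if (!tracker.preCheck()) {
    outcomes.push("blocked");
    continue;
  }
  tracker.recordCall(4.0); // assumed cost of one call
  outcomes.push("ran");
}
console.log(outcomes, tracker.actualCost); // ["ran", "ran", "ran", "blocked"] 12
```

The third call starts at $8.00, passes the pre-check, and pushes the session to $12.00. Only the fourth call is blocked, so leave headroom below your real budget.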

Check it at runtime:

const state = session.getState();
console.log(`Cost so far: $${state.actualCost.toFixed(4)}`);

Loop detection

Catches repeated identical calls that individually look healthy.

session_limits:
  loop_detection:
    window: 5              # Look at the last 5 steps
    threshold: 3           # If same (tool, arguments) appears 3 times → block

How it works:

After each LLM call, extract (tool_name, arguments_hash) tuples
Look at the last window steps
Count occurrences of each tuple
If any tuple appears threshold or more times → block with loop_detected

Example: Agent calls search_orders({ query: "pending" }) three times in a row. Each call is valid. But the third call is blocked because the same tool with the same arguments appeared 3 times in the last 5 steps.
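
The four steps above, applied to this example, can be sketched as follows. This is a minimal model, not the engine's implementation: detectLoop and callKey are hypothetical names, and a key-sorted JSON string stands in for the arguments hash (it sorts top-level keys only).

```typescript
interface ToolCall {
  tool: string;
  args: Record<string, unknown>;
}

// Stand-in for (tool_name, arguments_hash): identical calls produce identical keys.
function callKey(call: ToolCall): string {
  const sortedArgs = Object.keys(call.args)
    .sort()
    .map((k) => [k, call.args[k]]);
  return `${call.tool}:${JSON.stringify(sortedArgs)}`;
}

// Returns the repeated key if a loop is detected, otherwise null.
// Each step is the list of tool calls from one LLM response; assumes window >= 1.
function detectLoop(steps: ToolCall[][], window: number, threshold: number): string | null {
  const counts = new Map<string, number>();
  for (const step of steps.slice(-window)) {
    for (const call of step) {
      const key = callKey(call);
      const seen = (counts.get(key) ?? 0) + 1;
      counts.set(key, seen);
      if (seen >= threshold) return key;
    }
  }
  return null;
}

const pendingSearch: ToolCall = { tool: "search_orders", args: { query: "pending" } };
const stepHistory: ToolCall[][] = [[pendingSearch], [pendingSearch], [pendingSearch]];

console.log(detectLoop(stepHistory.slice(0, 2), 5, 3)); // null (only 2 occurrences)
console.log(detectLoop(stepHistory, 5, 3) !== null);    // true (3rd occurrence)
```

Changing a single argument value produces a different key, which is why near-identical loops slip through.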

What it catches:

Recursive retry loops (the $47K incident)
Stuck agents repeating the same action
Models that ignore "no results found" and keep searching

What it doesn't catch:

Loops with slightly different arguments (different hash)
Loops with different tool names (different tuple)
Slow loops outside the window

Circuit breaker

Auto-kills the session after too many consecutive failures. Prevents cascading failure when something is fundamentally wrong.

session_limits:
  circuit_breaker:
    consecutive_blocks: 5         # 5 blocked calls in a row → auto-kill
    consecutive_errors: 3         # 3 internal errors in a row → auto-kill

How it works:

consecutiveBlockCount increments on each blocked call, resets to 0 on any non-block
consecutiveErrorCount increments on each internal error, resets to 0 on any non-error
When either threshold is hit, session.kill() is called automatically
After auto-kill, all subsequent calls throw ReplayKillError
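
The counter behavior above can be modeled in a few lines. The CircuitBreaker class and the Outcome type are hypothetical; the sketch sets a killed flag where the real session would call session.kill().

```typescript
type Outcome = "allowed" | "blocked" | "error";

// Hypothetical model of the two consecutive-failure counters.
class CircuitBreaker {
  consecutiveBlockCount = 0;
  consecutiveErrorCount = 0;
  killed = false;
  private readonly maxBlocks: number;
  private readonly maxErrors: number;

  constructor(maxBlocks: number, maxErrors: number) {
    this.maxBlocks = maxBlocks;
    this.maxErrors = maxErrors;
  }

  record(outcome: Outcome): void {
    // Each counter resets on any outcome other than the one it tracks.
    this.consecutiveBlockCount = outcome === "blocked" ? this.consecutiveBlockCount + 1 : 0;
    this.consecutiveErrorCount = outcome === "error" ? this.consecutiveErrorCount + 1 : 0;
    if (
      this.consecutiveBlockCount >= this.maxBlocks ||
      this.consecutiveErrorCount >= this.maxErrors
    ) {
      this.killed = true; // stands in for session.kill()
    }
  }
}

const breaker = new CircuitBreaker(5, 3);
const snapshots: boolean[] = [];
for (let i = 0; i < 4; i++) breaker.record("blocked");
breaker.record("allowed");                  // resets the block counter
for (let i = 0; i < 4; i++) breaker.record("blocked");
snapshots.push(breaker.killed);             // still alive after 4 in a row
breaker.record("blocked");                  // 5th consecutive block
snapshots.push(breaker.killed);
console.log(snapshots); // [false, true]
```

Eight blocks in total do not trip the breaker here because one allowed call interrupts the streak; only consecutive failures count.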

Why this matters: Without a circuit breaker, a misconfigured agent can burn through your LLM budget retrying blocked calls forever. The circuit breaker stops it.

How limits interact

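Limits are checked in the order below. The first four checks run during Stage 2 (Pre-check), before the LLM call; the remaining three run after the LLM responds, as noted: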

Kill check — is the session already killed?
Step limit — totalStepCount >= max_steps?
Tool call limit — totalToolCalls >= max_tool_calls? (if max_tool_calls_mode: narrow, narrows instead of blocking)
Cost limit — actualCost >= max_cost_per_session?
Per-tool limit — (checked after LLM response, before gate)
Loop detection — (checked after LLM response, before gate)
Circuit breaker — (checked after each decision outcome)

If any check fails, the call is blocked with session_limit_exceeded (or loop_detected). The exception is max_tool_calls_mode: narrow, which narrows the tool set instead of blocking (see max_tool_calls_mode).
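
The pre-call portion of this ordering can be sketched as a single function. This is an assumption-laden model rather than the engine's code: preCheck, its Decision return type, and the two interfaces are hypothetical, and the sketch returns "killed" where the real session throws ReplayKillError.

```typescript
interface SessionLimits {
  max_steps?: number;
  max_tool_calls?: number;
  max_tool_calls_mode?: "block" | "narrow";
  max_cost_per_session?: number;
}

interface SessionState {
  killed: boolean;
  totalStepCount: number;
  totalToolCalls: number;
  actualCost: number;
}

type Decision = "allow" | "narrow" | "killed" | "session_limit_exceeded";

function preCheck(state: SessionState, limits: SessionLimits): Decision {
  // 1. Kill check
  if (state.killed) return "killed";
  // 2. Step limit
  if (limits.max_steps !== undefined && state.totalStepCount >= limits.max_steps) {
    return "session_limit_exceeded";
  }
  // 3. Tool call limit: block, or remember to narrow
  let narrow = false;
  if (limits.max_tool_calls !== undefined && state.totalToolCalls >= limits.max_tool_calls) {
    if (limits.max_tool_calls_mode !== "narrow") return "session_limit_exceeded";
    narrow = true;
  }
  // 4. Cost limit: blocks regardless of mode
  if (limits.max_cost_per_session !== undefined && state.actualCost >= limits.max_cost_per_session) {
    return "session_limit_exceeded";
  }
  return narrow ? "narrow" : "allow";
}

const limits: SessionLimits = {
  max_steps: 10,
  max_tool_calls: 15,
  max_tool_calls_mode: "narrow",
  max_cost_per_session: 1.0,
};
const base: SessionState = { killed: false, totalStepCount: 3, totalToolCalls: 15, actualCost: 0.4 };

console.log(preCheck(base, limits));                       // "narrow"
console.log(preCheck({ ...base, actualCost: 1.0 }, limits)); // "session_limit_exceeded"
```

Per-tool limits, loop detection, and the circuit breaker are left out because they need the LLM response or the decision outcome.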

Checking limits at runtime

const state = session.getState();

console.log("Steps:", state.totalStepCount);         // Committed steps
console.log("Tool calls:", state.totalToolCalls);     // Total tool calls
console.log("Cost:", state.actualCost);               // All LLM calls (including blocked)
console.log("Per-tool:", state.toolCallCounts);       // { issue_refund: 1, search: 3 }
console.log("Blocks:", state.totalBlockCount);        // Total blocked calls
console.log("Consecutive blocks:", state.consecutiveBlockCount);
console.log("Killed:", state.killed);                 // true if auto-killed
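
One possible use of these fields is to compute how much headroom a session has left. The remainingBudget helper below is hypothetical and takes plain objects, so it makes no assumption about the session API beyond the state field names shown above.

```typescript
interface LimitState {
  totalStepCount: number;
  totalToolCalls: number;
  actualCost: number;
}

interface Caps {
  max_steps: number;
  max_tool_calls: number;
  max_cost_per_session: number;
}

// Hypothetical helper: remaining headroom per limit, never negative.
function remainingBudget(state: LimitState, caps: Caps) {
  return {
    steps: Math.max(0, caps.max_steps - state.totalStepCount),
    toolCalls: Math.max(0, caps.max_tool_calls - state.totalToolCalls),
    cost: Math.max(0, caps.max_cost_per_session - state.actualCost),
  };
}

const budgetLeft = remainingBudget(
  { totalStepCount: 12, totalToolCalls: 31, actualCost: 3.25 },
  { max_steps: 20, max_tool_calls: 50, max_cost_per_session: 5.0 },
);
console.log(budgetLeft); // { steps: 8, toolCalls: 19, cost: 1.75 }
```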

Recommended defaults

For most agents:

session_limits:
  max_steps: 20
  max_tool_calls: 50
  max_cost_per_session: 5.00
  loop_detection:
    window: 5
    threshold: 3
  circuit_breaker:
    consecutive_blocks: 5
    consecutive_errors: 3

For high-risk agents (payments, infrastructure):

session_limits:
  max_steps: 10
  max_tool_calls: 15
  max_cost_per_session: 1.00
  max_calls_per_tool:
    process_payment: 1
    delete_resource: 1
  loop_detection:
    window: 3
    threshold: 2
  circuit_breaker:
    consecutive_blocks: 3
    consecutive_errors: 2

Next steps

Kill Switch — manual emergency stop
Phases & Transitions — combine with phase control
Preconditions & Ordering — enforce tool ordering
