Session Limits
Session limits protect you from runaway agents. They cap total steps, cost, and tool calls, and they detect loops, catching the $47K recursive loop problem before it starts.
The problem
Four LangChain agents looped for 11 days. Each individual call was under 200ms, under token limits. Monitoring said "SYSTEM NOMINAL." Total cost: $47,000.
Per-call validation can't catch loops. Each call is valid. The problem is the cumulative count — and no one was counting.
Defining session limits
Add session_limits to your session.yaml:
# session.yaml
schema_version: "1.0"
agent: my-agent
session_limits:
  max_steps: 20 # Max LLM calls
  max_tool_calls: 50 # Max tool calls across all steps
  max_cost_per_session: 10.00 # Dollar cap
  max_calls_per_tool:
    issue_refund: 1 # Only 1 refund per session
    send_email: 5 # Max 5 emails
  loop_detection:
    window: 5 # Look at last 5 steps
    threshold: 3 # Same tool+args 3x = blocked
  circuit_breaker:
    consecutive_blocks: 5 # Auto-kill after 5 consecutive blocks
    consecutive_errors: 3 # Auto-kill after 3 consecutive errors
Session limits defined in session.yaml only take effect if the file compiles successfully. If your session.yaml has a compilation error (e.g., invalid phases or undeclared transition targets), the limits are not silently ignored: the session is blocked and create() throws ReplayConfigError.
If you see replay_compile_error in your diagnostics, fix the compilation error first. Limits, phases, loop detection, and policy all depend on successful compilation.
Step and tool call limits
max_steps
Maximum number of LLM calls in the session. Checked before each call (Stage 2). If exceeded, the call is blocked with session_limit_exceeded.
session_limits:
  max_steps: 20
Counting: Uses totalStepCount which is monotonic — it counts committed steps and never decreases, even if older steps are evicted from memory for long sessions.
max_tool_calls
Maximum total tool calls across all steps. A single LLM response can contain multiple tool calls — this counts each one.
session_limits:
  max_tool_calls: 50
max_tool_calls_mode
Controls what happens when max_tool_calls is reached. Default is "block".
| Mode | Behavior |
|---|---|
| block | Hard-block the session. No more LLM calls. (default) |
| narrow | Narrow the tool set to only tools with remaining max_calls_per_tool budget. If none remain, block. |
session_limits:
  max_tool_calls: 15
  max_tool_calls_mode: narrow
  max_calls_per_tool:
    collect_forensic_image: 3
    containment_scan: 2
When to use narrow: Multi-phase workflows where the LLM fires multiple tools per turn, consuming the global budget faster than expected. Without narrow, tools reserved for later phases (like forensic collection) become unreachable once the global cap is hit — even if they've never been called.
How it works:
- When totalToolCalls >= max_tool_calls, instead of blocking, the engine filters the visible tool set to only tools that have an explicit max_calls_per_tool entry with remaining budget
- Tools without a max_calls_per_tool entry are excluded (no explicit budget = not reachable past the cap)
- Stage 3 per-tool limits still apply after the LLM responds
- If max_steps or max_cost_per_session is also exceeded, the session blocks regardless of mode
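The filtering rule above can be sketched in TypeScript. This is a minimal illustration under stated assumptions, not the engine's implementation: `narrowToolSet`, `NarrowInput`, `NarrowResult`, and their field names are hypothetical. Only `max_tool_calls`, `max_calls_per_tool`, and `totalToolCalls` come from the text.

```typescript
// Hypothetical shapes for illustration; not the library's real types.
interface NarrowInput {
  visibleTools: string[]; // tools the LLM can currently see
  totalToolCalls: number; // tool calls made so far in the session
  maxToolCalls: number; // session_limits.max_tool_calls
  maxCallsPerTool: { [tool: string]: number | undefined }; // max_calls_per_tool
  toolCallCounts: { [tool: string]: number | undefined }; // calls so far, per tool
}

type NarrowResult =
  | { action: "allow"; tools: string[] }
  | { action: "narrow"; tools: string[] }
  | { action: "block" };

function narrowToolSet(input: NarrowInput): NarrowResult {
  // Below the global cap: nothing to narrow.
  if (input.totalToolCalls < input.maxToolCalls) {
    return { action: "allow", tools: input.visibleTools };
  }
  // At or past the cap: keep only tools that have an explicit per-tool
  // budget with calls remaining.
  const remaining = input.visibleTools.filter((tool) => {
    const limit = input.maxCallsPerTool[tool];
    if (limit === undefined) return false; // no explicit budget = excluded
    return (input.toolCallCounts[tool] ?? 0) < limit;
  });
  // If no tool has budget left, narrow mode falls back to blocking.
  return remaining.length > 0
    ? { action: "narrow", tools: remaining }
    : { action: "block" };
}
```

With the configuration above, a session that has spent all 15 calls, including both containment_scan calls, would be narrowed to collect_forensic_image alone.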
max_tool_calls becomes a soft cap in narrow mode
In narrow mode, total tool calls can exceed max_tool_calls by the sum of remaining per-tool budgets. In the example above, the real ceiling is 15 + 3 + 2 = 20 in the worst case. Set max_tool_calls with this overshoot in mind.
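To size max_tool_calls with the overshoot in mind, the worst-case ceiling can be computed directly. The helper below is a hypothetical sketch (`worstCaseToolCalls` is not part of the library). It assumes none of the per-tool budgets has been used by the time the global cap is reached.

```typescript
// Worst-case total tool calls in narrow mode: the global cap plus every
// per-tool budget, assuming no per-tool budget was used before the cap.
function worstCaseToolCalls(
  maxToolCalls: number,
  maxCallsPerTool: Record<string, number>,
): number {
  const perToolSum = Object.values(maxCallsPerTool).reduce((a, b) => a + b, 0);
  return maxToolCalls + perToolSum;
}

const ceiling = worstCaseToolCalls(15, {
  collect_forensic_image: 3,
  containment_scan: 2,
});
console.log(ceiling); // 20
```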
max_calls_per_tool
Per-tool call limits. Different tools can have different limits.
session_limits:
  max_calls_per_tool:
    issue_refund: 1 # Exactly 1 refund allowed
    send_email: 5 # Max 5 emails
    search_orders: 10 # Max 10 searches
Tools not listed have no per-tool limit. The issue_refund: 1 limit is a common pattern for operations that must run at most once per session, such as refunds and other non-idempotent side effects (complements forbids_after).
Cost limits
max_cost_per_session
Dollar cap for the entire session. Computed from token usage reported by the provider.
session_limits:
  max_cost_per_session: 10.00
How it works:
- Cost is tracked in actualCost, updated immediately after every LLM call, including blocked and retried calls
- This is a soft cap: the call that pushes past the threshold already ran (it was billed). The next call is blocked.
- actualCost includes all calls. totalCost only includes committed steps. Limits check actualCost.
Check it at runtime:
const state = session.getState();
console.log(`Cost so far: $${state.actualCost.toFixed(4)}`);
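The soft-cap accounting described above can be sketched as a small tracker. This is an illustration only: `CostTracker`, `canCall`, and `record` are hypothetical names, and the real engine may structure this differently. The two fields mirror actualCost and totalCost from the text.

```typescript
// Hypothetical tracker; field names mirror the text (actualCost, totalCost).
class CostTracker {
  actualCost = 0; // every billed call, including blocked and retried ones
  totalCost = 0; // committed steps only
  private readonly maxCostPerSession: number;

  constructor(maxCostPerSession: number) {
    this.maxCostPerSession = maxCostPerSession;
  }

  // Pre-check before each LLM call: the limit compares against actualCost.
  canCall(): boolean {
    return this.actualCost < this.maxCostPerSession;
  }

  // Record a finished call. `committed` is false for blocked or retried calls.
  record(cost: number, committed: boolean): void {
    this.actualCost += cost; // always billed
    if (committed) this.totalCost += cost;
  }
}
```

Because the check runs before the call and the cost is only known afterwards, the call that crosses the threshold still completes. Only the following call is refused.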
Loop detection
Catches repeated identical calls that individually look healthy.
session_limits:
  loop_detection:
    window: 5 # Look at the last 5 steps
    threshold: 3 # If same (tool, arguments) appears 3 times → block
How it works:
- After each LLM call, extract (tool_name, arguments_hash) tuples
- Look at the last window steps
- Count occurrences of each tuple
- If any tuple appears threshold or more times → block with loop_detected
Example: Agent calls search_orders({ query: "pending" }) three times in a row. Each call is valid. But the third call is blocked because the same tool with the same arguments appeared 3 times in the last 5 steps.
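The windowed counting described above can be sketched as follows. This is a minimal illustration, not the engine's code: `LoopDetector`, `ToolCall`, and `callKey` are hypothetical names, and the JSON text of the arguments stands in for arguments_hash (an assumption that relies on stable key order).

```typescript
interface ToolCall {
  name: string;
  args: unknown;
}

// Stand-in for (tool_name, arguments_hash): tool name plus JSON text of the
// arguments. A real hash would canonicalize key order first.
function callKey(call: ToolCall): string {
  return `${call.name}:${JSON.stringify(call.args)}`;
}

class LoopDetector {
  private steps: string[][] = []; // one array of call keys per step
  private readonly window: number;
  private readonly threshold: number;

  constructor(window: number, threshold: number) {
    this.window = window;
    this.threshold = threshold;
  }

  // Record one step's tool calls; returns true if a loop is detected.
  check(calls: ToolCall[]): boolean {
    this.steps.push(calls.map(callKey));
    if (this.steps.length > this.window) this.steps.shift(); // keep last `window` steps

    const counts = new Map<string, number>();
    for (const step of this.steps) {
      for (const key of step) {
        const n = (counts.get(key) ?? 0) + 1;
        if (n >= this.threshold) return true; // same tuple seen `threshold` times
        counts.set(key, n);
      }
    }
    return false;
  }
}

const detector = new LoopDetector(5, 3);
const pending = { name: "search_orders", args: { query: "pending" } };
detector.check([pending]); // false
detector.check([pending]); // false
const looped = detector.check([pending]); // true: third identical call in the window
```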
What it catches:
- Recursive retry loops (the $47K incident)
- Stuck agents repeating the same action
- Models that ignore "no results found" and keep searching
What it doesn't catch:
- Loops with slightly different arguments (different hash)
- Loops with different tool names (different tuple)
- Slow loops outside the window
Circuit breaker
Auto-kills the session after too many consecutive failures. Prevents cascading failure when something is fundamentally wrong.
session_limits:
  circuit_breaker:
    consecutive_blocks: 5 # 5 blocked calls in a row → auto-kill
    consecutive_errors: 3 # 3 internal errors in a row → auto-kill
How it works:
- consecutiveBlockCount increments on each blocked call, resets to 0 on any non-block
- consecutiveErrorCount increments on each internal error, resets to 0 on any non-error
- When either threshold is hit, session.kill() is called automatically
- After auto-kill, all subsequent calls throw ReplayKillError
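The counter behavior above can be sketched like this. It is an illustration under assumptions: `CircuitBreaker`, `Outcome`, and `record` are hypothetical names, and setting a `killed` flag stands in for the automatic session.kill() call.

```typescript
type Outcome = "allowed" | "blocked" | "error";

class CircuitBreaker {
  consecutiveBlockCount = 0;
  consecutiveErrorCount = 0;
  killed = false;
  private readonly maxBlocks: number; // circuit_breaker.consecutive_blocks
  private readonly maxErrors: number; // circuit_breaker.consecutive_errors

  constructor(maxBlocks: number, maxErrors: number) {
    this.maxBlocks = maxBlocks;
    this.maxErrors = maxErrors;
  }

  // Update both counters after each decision outcome.
  record(outcome: Outcome): void {
    // Each counter increments on its own outcome and resets on anything else.
    this.consecutiveBlockCount =
      outcome === "blocked" ? this.consecutiveBlockCount + 1 : 0;
    this.consecutiveErrorCount =
      outcome === "error" ? this.consecutiveErrorCount + 1 : 0;

    if (
      this.consecutiveBlockCount >= this.maxBlocks ||
      this.consecutiveErrorCount >= this.maxErrors
    ) {
      this.killed = true; // stand-in for the automatic session.kill()
    }
  }
}
```

Note that only consecutive failures count: a single successful call between two runs of blocks resets the counter, so an agent that recovers is not killed.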
Why this matters: Without a circuit breaker, a misconfigured agent can burn through your LLM budget retrying blocked calls forever. The circuit breaker stops it.
How limits interact
Limits are checked in this order. The first four checks run during Stage 2 (Pre-check), before the LLM call; the last three run later, at the points noted:
- Kill check: is the session already killed?
- Step limit: totalStepCount >= max_steps?
- Tool call limit: totalToolCalls >= max_tool_calls? (if max_tool_calls_mode: narrow, narrows instead of blocking)
- Cost limit: actualCost >= max_cost_per_session?
- Per-tool limit (checked after LLM response, before gate)
- Loop detection (checked after LLM response, before gate)
- Circuit breaker (checked after each decision outcome)
If any check fails, the call is blocked with session_limit_exceeded (or loop_detected). The exception is max_tool_calls_mode: narrow, which narrows the tool set instead of blocking (see max_tool_calls_mode).
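The pre-call portion of this ordering (kill, step, tool call, cost) can be sketched as a single function. This is a hypothetical illustration: `preCheck` and its types are not the library's API, and a returned string stands in for the errors and block reasons the real engine produces.

```typescript
// Hypothetical shapes; field names follow the text above.
interface PreCheckState {
  killed: boolean;
  totalStepCount: number;
  totalToolCalls: number;
  actualCost: number;
}

interface PreCheckLimits {
  max_steps?: number;
  max_tool_calls?: number;
  max_tool_calls_mode?: "block" | "narrow";
  max_cost_per_session?: number;
}

type PreCheckResult = "ok" | "narrow" | "killed" | "session_limit_exceeded";

function preCheck(state: PreCheckState, limits: PreCheckLimits): PreCheckResult {
  // 1. Kill check (the text says the engine throws ReplayKillError here).
  if (state.killed) return "killed";

  // 2. Step limit.
  if (limits.max_steps !== undefined && state.totalStepCount >= limits.max_steps) {
    return "session_limit_exceeded";
  }

  // 3. Tool call limit: block, or remember to narrow and keep checking.
  let narrow = false;
  if (limits.max_tool_calls !== undefined && state.totalToolCalls >= limits.max_tool_calls) {
    if (limits.max_tool_calls_mode !== "narrow") return "session_limit_exceeded";
    narrow = true;
  }

  // 4. Cost limit: blocks regardless of max_tool_calls_mode.
  if (
    limits.max_cost_per_session !== undefined &&
    state.actualCost >= limits.max_cost_per_session
  ) {
    return "session_limit_exceeded";
  }

  return narrow ? "narrow" : "ok";
}
```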
Checking limits at runtime
const state = session.getState();
console.log("Steps:", state.totalStepCount); // Committed steps
console.log("Tool calls:", state.totalToolCalls); // Total tool calls
console.log("Cost:", state.actualCost); // All LLM calls (including blocked)
console.log("Per-tool:", state.toolCallCounts); // { issue_refund: 1, search: 3 }
console.log("Blocks:", state.totalBlockCount); // Total blocked calls
console.log("Consecutive blocks:", state.consecutiveBlockCount);
console.log("Killed:", state.killed); // true if auto-killed
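These counters can also be turned into a remaining-budget summary, for example to log how close a session is to its limits. The helper below is a hypothetical sketch: `remainingBudget` and both interfaces are illustrative, and it assumes the state object exposes the fields shown above.

```typescript
// Subset of the state fields shown above (assumed shape).
interface StateSnapshot {
  totalStepCount: number;
  totalToolCalls: number;
  actualCost: number;
}

// Subset of session_limits, copied from your own session.yaml.
interface LimitConfig {
  max_steps: number;
  max_tool_calls: number;
  max_cost_per_session: number;
}

// Remaining headroom for each limit, clamped at zero.
function remainingBudget(state: StateSnapshot, limits: LimitConfig) {
  return {
    steps: Math.max(0, limits.max_steps - state.totalStepCount),
    toolCalls: Math.max(0, limits.max_tool_calls - state.totalToolCalls),
    cost: Math.max(0, limits.max_cost_per_session - state.actualCost),
  };
}
```

In practice the first argument would be the value returned by session.getState().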
Recommended defaults
For most agents:
session_limits:
  max_steps: 20
  max_tool_calls: 50
  max_cost_per_session: 5.00
  loop_detection:
    window: 5
    threshold: 3
  circuit_breaker:
    consecutive_blocks: 5
    consecutive_errors: 3
For high-risk agents (payments, infrastructure):
session_limits:
  max_steps: 10
  max_tool_calls: 15
  max_cost_per_session: 1.00
  max_calls_per_tool:
    process_payment: 1
    delete_resource: 1
  loop_detection:
    window: 3
    threshold: 2
  circuit_breaker:
    consecutive_blocks: 3
    consecutive_errors: 2
Next steps
- Kill Switch — manual emergency stop
- Phases & Transitions — combine with phase control
- Preconditions & Ordering — enforce tool ordering