Replay Runtime Pipeline

Every call through session.client passes through a 7-stage enforcement pipeline. Here's what happens at each stage and why.

This is a technical runtime doc. The primary product story is still zero-config governance plus Governance Studio.

The pipeline

Your code calls session.client.chat.completions.create()
                          │
                    ┌─────▼─────┐
             Stage 1│  NARROW   │  Remove tools the LLM shouldn't see
                    └─────┬─────┘
                    ┌─────▼─────┐
             Stage 2│ PRE-CHECK │  Check session limits before spending tokens
                    └─────┬─────┘
                    ┌─────▼─────┐
             Stage 3│ LLM CALL  │  Send to OpenAI / Anthropic (untouched)
                    └─────┬─────┘
                    ┌─────▼─────┐
             Stage 4│ VALIDATE  │  Check response against contracts
                    └─────┬─────┘
                    ┌─────▼─────┐
             Stage 5│   GATE    │  Block or strip illegal tool calls
                    └─────┬─────┘
                    ┌─────▼─────┐
             Stage 6│ FINALIZE  │  Update session state, advance phase
                    └─────┬─────┘
                    ┌─────▼─────┐
             Stage 7│ CAPTURE   │  Record decision for observability
                    └─────┬─────┘
                          │
            Response returned to your code

Stage 1: Narrow

When: Before the LLM call
What: Removes tools from the request that the model shouldn't see

The narrowing stage filters the tool list based on:

Phase restrictions — tools not valid in the current phase are removed
Preconditions — tools whose prerequisites haven't been met are removed
Forbidden tools — tools blocked by a prior forbids_after are removed
Unmatched tools — tools with no contract (when unmatchedPolicy: "block")
Policy — tools denied by the principal's authorization rules
Manual filter — tools excluded by session.narrow()

Why this matters: The model cannot request a tool that is absent from the request. Narrowing is more effective than validating after the fact, because the model never considers the illegal option in the first place.

Callback: Use onNarrow to see what was removed and why.
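The filtering rules above can be sketched as a pure function over the tool list. This is a minimal illustration, not the SDK's implementation: the types `NarrowContract`, `NarrowState`, and `NarrowResult`, the field names `phases`, `requires`, `completedTools`, `policyDenied`, and `manualExcluded`, and the order in which the rules are evaluated are all assumptions made for the example.

```typescript
// Hypothetical shapes for illustration only; these are not the SDK's types.
type NarrowReason =
  | "phase" | "precondition" | "forbidden" | "unmatched" | "policy" | "manual";

interface NarrowContract {
  phases?: string[];   // phases where the tool is valid; omitted means any phase
  requires?: string[]; // tools that must already have run
}

interface NarrowState {
  currentPhase: string;
  completedTools: Set<string>; // tools whose calls already succeeded
  forbiddenTools: Set<string>; // filled in by earlier forbids_after rules
  policyDenied: Set<string>;   // tools the principal may not use
  manualExcluded: Set<string>; // tools excluded by a manual filter
  unmatchedPolicy: "block" | "allow";
}

interface NarrowResult {
  kept: string[];
  removed: { tool: string; reason: NarrowReason }[];
}

// Returns the first reason a tool should be hidden, or null to keep it.
function removalReason(
  tool: string,
  contract: NarrowContract | undefined,
  state: NarrowState,
): NarrowReason | null {
  if (contract?.phases && contract.phases.indexOf(state.currentPhase) === -1) return "phase";
  if (contract?.requires?.some((dep) => !state.completedTools.has(dep))) return "precondition";
  if (state.forbiddenTools.has(tool)) return "forbidden";
  if (!contract && state.unmatchedPolicy === "block") return "unmatched";
  if (state.policyDenied.has(tool)) return "policy";
  if (state.manualExcluded.has(tool)) return "manual";
  return null;
}

// Splits the requested tools into those the model may see and those removed.
function narrowTools(
  tools: string[],
  contracts: Map<string, NarrowContract>,
  state: NarrowState,
): NarrowResult {
  const result: NarrowResult = { kept: [], removed: [] };
  for (const tool of tools) {
    const reason = removalReason(tool, contracts.get(tool), state);
    if (reason === null) result.kept.push(tool);
    else result.removed.push({ tool, reason });
  }
  return result;
}
```

In this sketch the `removed` list carries one reason per tool, which is the kind of information a narrowing callback would plausibly receive.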

Stage 2: Pre-check

When: Before the LLM call (after narrowing)
What: Checks session-level limits

Checked in order:

Is the session killed?
Has max_steps been reached?
Has max_cost_per_session been exceeded?

If any check fails, the call is blocked before spending tokens on an LLM request.
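A minimal sketch of the ordered checks, assuming a session snapshot with `killed`, `totalStepCount`, and `totalCost` fields. Only `max_steps` and `max_cost_per_session` come from the text above; the type names and the reason strings are hypothetical.

```typescript
// Hypothetical shapes; only max_steps and max_cost_per_session come from the docs.
interface PreCheckLimits {
  max_steps: number;
  max_cost_per_session: number;
}

interface PreCheckSession {
  killed: boolean;
  totalStepCount: number;
  totalCost: number;
}

type PreCheckResult =
  | { ok: true }
  | { ok: false; reason: "session_killed" | "max_steps_reached" | "max_cost_exceeded" };

// Runs the checks in the documented order and stops at the first failure.
function preCheck(session: PreCheckSession, limits: PreCheckLimits): PreCheckResult {
  if (session.killed) return { ok: false, reason: "session_killed" };
  // "reached" is read as >=, "exceeded" as strictly >.
  if (session.totalStepCount >= limits.max_steps) {
    return { ok: false, reason: "max_steps_reached" };
  }
  if (session.totalCost > limits.max_cost_per_session) {
    return { ok: false, reason: "max_cost_exceeded" };
  }
  return { ok: true };
}
```

Because the function returns on the first failure, a killed session is reported as killed even if it has also run out of steps or budget.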

Stage 3: LLM call

When: After pre-check passes
What: Sends the (narrowed) request to OpenAI or Anthropic

The SDK passes the request to the provider untouched (except for the narrowed tool list). No prompt injection, no system message modification, no argument rewriting. The LLM sees exactly what you sent, minus the removed tools.

Stage 4: Validate

When: After the LLM responds
What: Checks the response against contracts

For each tool call in the response:

Contract match — does a contract exist for this tool?
Assertions — do input/output invariants pass?
Argument values — do runtime argument checks pass? (gte, lte, regex, etc.)
Response format — is finish_reason correct? Are tool calls actually present?
Phase transition — is the proposed transition legal?
Preconditions — are cross-step dependencies satisfied (with actual arguments)?
Forbidden tools — is this tool in the forbidden set?
Per-tool limits — has max_calls_per_tool been reached?
Loop detection — same tool+args repeated beyond threshold?
Policy — does the principal have authorization?

Each failed check produces a block reason (e.g., argument_value_mismatch, precondition_not_met).
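As one concrete case, the argument-value check might look like the sketch below. The operator names `gte`, `lte`, and `regex` and the block reason `argument_value_mismatch` come from the text above; the rule shape `ArgRule`, the result shape, and the function name `checkArguments` are assumptions for illustration, and the real contract format may differ.

```typescript
// Hypothetical rule shape; only the operator names come from the docs.
interface ArgRule {
  gte?: number;   // value must be a number >= gte
  lte?: number;   // value must be a number <= lte
  regex?: string; // value must be a string matching this pattern
}

interface ArgCheckResult {
  blocked: boolean;
  reason: string | null;
  details: string[];
}

// Evaluates every rule against the tool call's arguments and collects failures.
function checkArguments(
  args: Record<string, unknown>,
  rules: Record<string, ArgRule>,
): ArgCheckResult {
  const details: string[] = [];
  for (const name of Object.keys(rules)) {
    const rule = rules[name];
    const value = args[name];
    if (rule.gte !== undefined && !(typeof value === "number" && value >= rule.gte)) {
      details.push(`${name} must be >= ${rule.gte}`);
    }
    if (rule.lte !== undefined && !(typeof value === "number" && value <= rule.lte)) {
      details.push(`${name} must be <= ${rule.lte}`);
    }
    if (
      rule.regex !== undefined &&
      !(typeof value === "string" && new RegExp(rule.regex).test(value))
    ) {
      details.push(`${name} must match ${rule.regex}`);
    }
  }
  const blocked = details.length > 0;
  return { blocked, reason: blocked ? "argument_value_mismatch" : null, details };
}
```

A missing or wrongly typed argument fails every rule declared for it, so the sketch treats absence as a mismatch rather than silently passing.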

Stage 5: Gate

When: After validation
What: Decides what to do with blocked tool calls

The gate mode controls behavior:

Gate	Behavior
`reject_all`	Throw `ReplayContractError` if any call is blocked
`strip_partial`	Remove blocked calls, return valid ones; throw if ALL blocked
`strip_blocked`	Remove blocked calls; synthesize text-only response if ALL blocked

In mode: "shadow", this stage computes the decision but does not apply it — the original response is returned unmodified. That mode still exists in the runtime API, but it is not the primary public product path.
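The three gate modes in the table can be sketched as a single decision function. This is an illustration under assumptions: `CheckedCall`, `GateOutcome`, and `applyGate` are hypothetical names, `SketchContractError` stands in for `ReplayContractError`, and the synthesized text is a placeholder.

```typescript
type GateMode = "reject_all" | "strip_partial" | "strip_blocked";

// A tool call plus the block reasons produced by validation (empty = allowed).
interface CheckedCall {
  name: string;
  blockReasons: string[];
}

type GateOutcome =
  | { kind: "tool_calls"; calls: CheckedCall[] }
  | { kind: "text_only"; text: string };

// Stand-in for ReplayContractError, whose real constructor is not shown here.
class SketchContractError extends Error {}

function applyGate(mode: GateMode, calls: CheckedCall[]): GateOutcome {
  const allowed = calls.filter((c) => c.blockReasons.length === 0);
  const blocked = calls.filter((c) => c.blockReasons.length > 0);
  if (blocked.length === 0) return { kind: "tool_calls", calls: allowed };

  const summary = blocked
    .map((c) => `${c.name}: ${c.blockReasons.join(", ")}`)
    .join("; ");

  // reject_all: any blocked call fails the whole response.
  if (mode === "reject_all") throw new SketchContractError(summary);
  // Both strip modes return the surviving calls when at least one is valid.
  if (allowed.length > 0) return { kind: "tool_calls", calls: allowed };
  // Everything was blocked: strip_partial throws, strip_blocked degrades to text.
  if (mode === "strip_partial") throw new SketchContractError(summary);
  return { kind: "text_only", text: "All requested tool calls were blocked." };
}
```

The two strip modes differ only when every call is blocked, which is why the sketch shares the partial-strip path between them.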

Stage 6: Finalize

When: After the gate decision
What: Updates session state

Advances currentPhase if a tool's advances_to triggered a transition
Increments totalStepCount, totalToolCalls, toolCallCounts
Adds tools to forbiddenTools if forbids_after fired
Records satisfied preconditions for downstream tools
Updates cost tracking (totalCost, actualCost)
Resets or increments consecutiveBlockCount
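A sketch of these updates as an immutable state transition. The state field names follow the list above; the contract shape is an assumption (in particular, `forbids_after` is modeled as a list of tools that become forbidden once the tool has run), and `consecutiveBlockCount` is assumed to increment when the step had a blocked call and reset otherwise. Cost tracking and precondition bookkeeping are left out for brevity.

```typescript
// Assumed contract shape for the example; the real format may differ.
interface FinalizeContract {
  advances_to?: string;     // phase entered after this tool runs
  forbids_after?: string[]; // assumed: tools forbidden once this tool has run
}

interface FinalizeState {
  currentPhase: string;
  totalStepCount: number;
  totalToolCalls: number;
  toolCallCounts: Record<string, number>;
  forbiddenTools: string[];
  consecutiveBlockCount: number;
}

// Returns a new state; the input state is not mutated.
function finalizeStep(
  state: FinalizeState,
  allowedCalls: string[],
  contracts: Record<string, FinalizeContract>,
  stepWasBlocked: boolean,
): FinalizeState {
  const next: FinalizeState = {
    ...state,
    toolCallCounts: { ...state.toolCallCounts },
    forbiddenTools: [...state.forbiddenTools],
    totalStepCount: state.totalStepCount + 1,
    totalToolCalls: state.totalToolCalls + allowedCalls.length,
    consecutiveBlockCount: stepWasBlocked ? state.consecutiveBlockCount + 1 : 0,
  };
  for (const tool of allowedCalls) {
    next.toolCallCounts[tool] = (next.toolCallCounts[tool] || 0) + 1;
    const contract = contracts[tool];
    if (!contract) continue;
    if (contract.advances_to) next.currentPhase = contract.advances_to;
    for (const forbidden of contract.forbids_after || []) {
      if (next.forbiddenTools.indexOf(forbidden) === -1) next.forbiddenTools.push(forbidden);
    }
  }
  return next;
}
```

Returning a fresh object keeps the previous state available for comparison, which is convenient when recording the phase transition later.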

In Govern mode, this is where the final server round-trip happens: the receipt is submitted and the server makes the authoritative commit decision.

Stage 7: Capture

When: After finalization
What: Records the enforcement decision

Every call produces a capture record containing:

The enforcement decision (allow/block) with reasons
What was removed by narrowing (counterfactual)
What was blocked by the gate (counterfactual)
Current phase and phase transition
Performance timing (guard_overhead_ms)

Captures pass through SecurityGate redaction before storage — API keys, tokens, and PII are scrubbed automatically.
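To make the record and the redaction step concrete, here is a rough sketch. The `CaptureRecord` field names other than `guard_overhead_ms` are guesses based on the list above, and the regular expressions are simple stand-ins: the actual SecurityGate rules are not described in this document.

```typescript
// Hypothetical record shape; only guard_overhead_ms is named in the docs.
interface CaptureRecord {
  decision: "allow" | "block";
  reasons: string[];
  removedByNarrowing: string[];
  blockedByGate: string[];
  phase: string;
  phaseTransition: string | null;
  guard_overhead_ms: number;
}

// Illustrative patterns only, not SecurityGate's real rule set.
const REDACTION_PATTERNS: RegExp[] = [
  /sk-[A-Za-z0-9_-]{16,}/g,                          // API-key-like strings
  /Bearer\s+[A-Za-z0-9._~+\/-]+=*/g,                 // bearer tokens
  /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g, // email addresses
];

// Replaces every match of every pattern with a fixed marker.
function redactText(text: string): string {
  return REDACTION_PATTERNS.reduce(
    (acc, pattern) => acc.replace(pattern, "[REDACTED]"),
    text,
  );
}

// Returns a copy of the record with its free-text reasons scrubbed.
function redactCapture(record: CaptureRecord): CaptureRecord {
  return { ...record, reasons: record.reasons.map(redactText) };
}
```

The sketch scrubs only the free-text `reasons` field; a fuller version would walk every string in the record before it is stored.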

Govern mode adds three server round-trips

When connected to the Vesanor server (Govern mode), three additional round-trips (preflight, proposal, and receipt) wrap stages 3-6:

         ┌───────────────┐
         │   PREFLIGHT   │  Register request with server, get prepared_request_id
         └───────┬───────┘
         ┌───────▼───────┐
         │   LLM CALL    │  (same as Stage 3)
         └───────┬───────┘
         ┌───────▼───────┐
         │   PROPOSAL    │  Submit response to server for evaluation
         └───────┬───────┘
         ┌───────▼───────┐
         │   EXECUTION   │  Run tool executor (if tools provided)
         └───────┬───────┘
         ┌───────▼───────┐
         │    RECEIPT    │  Submit execution evidence, server commits
         └───────────────┘

This creates durable, authoritative state that survives process crashes. See Govern Mode for details.
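The ordering of the round-trips can be sketched with a hypothetical server interface. Nothing here reflects the real server API: `GovernServerSketch`, its method names and signatures, and `governedCallSketch` are assumptions. Only the step order and the `prepared_request_id` field come from the diagram above.

```typescript
// Hypothetical client interface; the real server API is not described here.
interface GovernServerSketch {
  preflight(request: string): Promise<{ prepared_request_id: string }>;
  proposal(preparedRequestId: string, response: string): Promise<void>;
  receipt(preparedRequestId: string, evidence: string): Promise<void>;
}

// Runs the five steps in order and returns a trace of what happened.
async function governedCallSketch(
  server: GovernServerSketch,
  request: string,
  callLlm: (request: string) => Promise<string>,
  runTools: (response: string) => Promise<string>,
): Promise<string[]> {
  const trace: string[] = [];

  const { prepared_request_id } = await server.preflight(request); // round-trip 1
  trace.push("preflight");

  const response = await callLlm(request); // same as Stage 3
  trace.push("llm_call");

  await server.proposal(prepared_request_id, response); // round-trip 2
  trace.push("proposal");

  const evidence = await runTools(response); // local tool execution
  trace.push("execution");

  await server.receipt(prepared_request_id, evidence); // round-trip 3
  trace.push("receipt");

  return trace;
}
```

The sketch threads the same `prepared_request_id` through proposal and receipt, which is one plausible way for a server to tie the three round-trips to a single request.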

What the pipeline does NOT do

No prompt modification — your messages are sent verbatim
No response rewriting — tool call arguments are never modified (blocked calls are removed, not edited)
No LLM in the governance path — all decisions are deterministic contract evaluation
No extra network calls in Protect mode (all enforcement runs in-process; the LLM request in Stage 3 is the only outbound call)

Next steps

Quickstart — see the pipeline in action
Govern Mode — enable server-backed state
Runtime States — understand health/authority outcomes
Approval Model — understand what the runtime attaches to
