Security & Evidence
What gets captured, what gets redacted, and what Replay's evidence does and does not prove in the replay() pipeline.
Capture redaction
Every capture passes through SecurityGate redaction before being stored or transmitted. The following patterns are automatically scrubbed:
| Pattern | Example | Redacted to |
|---|---|---|
| OpenAI API keys | sk-proj-abc123... | [REDACTED:openai_key] |
| Anthropic API keys | sk-ant-abc123... | [REDACTED:anthropic_key] |
| Vesanor API keys | vsn_abc123... | [REDACTED:vesanor_key] |
| Bearer tokens | Bearer eyJ... | [REDACTED:bearer] |
| Email addresses | user@example.com | [REDACTED:email] |
| PEM keys | -----BEGIN PRIVATE KEY----- | [REDACTED:pem_key] |
| Connection strings | postgresql://user:pass@host | [REDACTED:connection_string] |
| API key headers | x-api-key: abc123 | [REDACTED:api_key_header] |
Redaction happens before storage — secrets never reach the capture buffer, the network, or the server.
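As a rough illustration of how pattern-based scrubbing like this can work, here is a minimal sketch. The labels mirror the table above, but the regular expressions, the `REDACTION_PATTERNS` list, and the `redact` / `redactDeep` helpers are assumptions made for this example; they are not SecurityGate's actual pattern manifest or API.

```typescript
// Hypothetical pattern list: labels follow the table above, but these
// regular expressions are illustrative guesses, not SecurityGate's own.
interface RedactionPattern {
  label: string;
  regex: RegExp;
}

const REDACTION_PATTERNS: RedactionPattern[] = [
  // More specific prefixes run first so "sk-ant-" is not claimed by "sk-".
  { label: "anthropic_key", regex: /sk-ant-[A-Za-z0-9_-]+/g },
  { label: "openai_key", regex: /sk-[A-Za-z0-9_-]+/g },
  { label: "vesanor_key", regex: /vsn_[A-Za-z0-9_-]+/g },
  { label: "bearer", regex: /Bearer\s+[A-Za-z0-9._~+\/-]+=*/g },
  // Connection strings run before emails so "pass@host.tld" is not split off.
  { label: "connection_string", regex: /[a-z][a-z0-9+]*:\/\/[^\s:@\/]+:[^\s@\/]+@[^\s"']+/gi },
  { label: "email", regex: /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g },
];

// Replace every match of every pattern with its [REDACTED:label] marker.
function redact(text: string): string {
  return REDACTION_PATTERNS.reduce(
    (acc, { label, regex }) => acc.replace(regex, `[REDACTED:${label}]`),
    text,
  );
}

// Walk strings, arrays, and plain objects so nested values are scrubbed too.
function redactDeep(value: unknown): unknown {
  if (typeof value === "string") return redact(value);
  if (Array.isArray(value)) return value.map(redactDeep);
  if (value !== null && typeof value === "object") {
    const src = value as Record<string, unknown>;
    const out: Record<string, unknown> = {};
    for (const key of Object.keys(src)) out[key] = redactDeep(src[key]);
    return out;
  }
  return value;
}
```

Running a function like `redactDeep` over the request and response before anything is buffered is what keeps the raw values out of every later stage.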
SDK-side redaction
The SDK has its own redaction implementation that runs in-process before captures are buffered. It reads the same shared pattern manifest as the server-side SecurityGate, so both sides scrub the same patterns.
What about tool arguments?
Tool call arguments are included in captures, but they pass through redaction first. If your tool arguments contain secrets (API keys, passwords, tokens), SecurityGate scrubs the ones that match the patterns above.
Best practice: Don't pass secrets as tool arguments. Use environment variables or secure credential stores in your tool executors instead.
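One way to follow this practice is to have the tool executor look up its own credentials, so the model-produced arguments carry only business data. The sketch below assumes a hypothetical `CredentialProvider` function and a hypothetical `issue_refund` executor; neither is part of the Replay SDK, and the provider could be backed by an environment variable or a secrets manager.

```typescript
// Hypothetical credential lookup; in practice this might read an environment
// variable or call a secrets manager. Nothing here is part of the Replay SDK.
type CredentialProvider = (name: string) => string | undefined;

interface RefundArgs {
  order_id: string;
  amount_cents: number;
}

interface RefundResult {
  ok: boolean;
  order_id: string;
}

// The executor closes over the provider, so the secret never appears in the
// tool arguments the model produces, and therefore never in a capture.
function makeRefundExecutor(getCredential: CredentialProvider) {
  return async function issueRefund(args: RefundArgs): Promise<RefundResult> {
    const apiKey = getCredential("PAYMENTS_API_KEY");
    if (!apiKey) throw new Error("PAYMENTS_API_KEY is not configured");
    // A real executor would call the payment provider here using apiKey.
    return { ok: true, order_id: args.order_id };
  };
}
```

The tool result is also built without the credential, so nothing secret flows back into the conversation either.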
Principal redaction
If you supply a principal identity to replay(), it's used internally for policy evaluation but never exposed externally:
const session = replay(client, {
  principal: {
    user_id: "agent-001",
    department: "finance",
    secret_token: "sk-proj-REAL_SECRET", // Will be redacted
  },
});

// getState() redacts the principal
const state = session.getState();
console.log(state.principal); // null (always null in public snapshots)
Why null? getState() returns a redacted snapshot safe to log, serialize, or display. The principal may contain sensitive identity data. Internally, policy evaluation uses the original value.
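The split between an internal principal and a redacted public snapshot can be pictured with a small sketch. The `PrincipalSessionSketch` class, its `isInDepartment` check, and the `state_version` field on the snapshot are assumptions made for illustration; only the behavior that `principal` is always `null` in the snapshot comes from the text above.

```typescript
type Principal = Record<string, string>;

interface PublicSnapshot {
  state_version: number;
  principal: null; // always null in public snapshots
}

// Illustrative stand-in for a session object; not the SDK's real class.
class PrincipalSessionSketch {
  private readonly principal: Principal | null;
  private stateVersion = 0;

  constructor(principal?: Principal) {
    this.principal = principal ?? null;
  }

  // Internal policy checks read the original, unredacted principal.
  isInDepartment(department: string): boolean {
    return this.principal?.department === department;
  }

  // The public snapshot never carries the principal, so it is safe to log.
  getState(): PublicSnapshot {
    return { state_version: this.stateVersion, principal: null };
  }
}
```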
Evidence classes
Each tool contract declares its evidence requirements — how much proof is needed before the enforcement pipeline considers a step authoritative.
evidence_class
Describes the kind of evidence available:
| Class | Meaning | Example |
|---|---|---|
| local_transaction | Tool executes locally with observable results | Database query, file read |
| ack_only | Execution is acknowledged but not independently verified | API call to external service |
| unverifiable | No way to verify execution happened | Fire-and-forget webhook |
commit_requirement
When authoritative session state advances:
| Requirement | When state advances |
|---|---|
| acknowledged | After an execution receipt is recorded in the governed pipeline |
| none | Tool is recorded but doesn't advance authoritative state |
The distinction matters in Govern mode. A tool with commit_requirement: acknowledged only advances session state after the server records a governed execution receipt. A tool with commit_requirement: none is tracked for audit purposes but doesn't affect the authoritative session record.
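The rule can be sketched as a small pure function. The `ToolContractSketch` and `SessionRecordSketch` shapes and the `recordToolStep` helper are assumptions for illustration; the two field names and their allowed values are taken from the tables above.

```typescript
type EvidenceClass = "local_transaction" | "ack_only" | "unverifiable";
type CommitRequirement = "acknowledged" | "none";

// Hypothetical contract and session shapes, reduced to what the rule needs.
interface ToolContractSketch {
  tool: string;
  evidence_class: EvidenceClass;
  commit_requirement: CommitRequirement;
}

interface SessionRecordSketch {
  state_version: number;
  audit_log: string[];
}

// Every step is logged for audit; only an acknowledged step with a recorded
// execution receipt advances the authoritative state version.
function recordToolStep(
  session: SessionRecordSketch,
  contract: ToolContractSketch,
  receiptRecorded: boolean,
): SessionRecordSketch {
  const advances = contract.commit_requirement === "acknowledged" && receiptRecorded;
  return {
    state_version: advances ? session.state_version + 1 : session.state_version,
    audit_log: [...session.audit_log, contract.tool],
  };
}
```

In this sketch a tool with `commit_requirement: "none"` still lands in `audit_log`, which matches the "tracked for audit purposes" behavior described above.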
What Replay evidence proves
On the governed path, Replay can durably show that:
- A wrapped request was evaluated by Replay's policy engine
- The governed session was in a specific state version at decision time
- Replay allowed, blocked, or paused the call for a specific reason
- A wrapped tool path produced a governed execution receipt
This is useful for workflow review, debugging, approvals, and reconstructing what Replay believed happened.
What Replay evidence does not prove
Replay evidence is not the same thing as an independent external audit log. By itself it does not prove that:
- The application never bypassed the wrapper
- The external system accepted or completed the operation
- The final state of the external system matches Replay's session state
- Replay replaced IAM, sandboxing, or API-level business-rule enforcement
What gets captured
Every call through session.client produces a capture record:
{
  // Standard capture fields (same as observe())
  provider: "openai",
  model: "gpt-4o-mini",
  request: { /* redacted request */ },
  response: { /* redacted response */ },

  // Replay-specific fields
  replay: {
    session_id: "sess_abc123",
    step_index: 2,
    mode: "enforce",
    decision: { action: "allow", tool_calls: [...] },

    // Governance context
    contract_hashes: ["sha256:abc..."],
    state_version: 3,
    commit_tier: "strong", // "strong" | "compat"
    phase: "customer_identified",
    phase_transition: "eligibility_checked",

    // What was prevented and why
    counterfactual: {
      tools_removed: [
        { tool: "issue_refund", reason: "wrong_phase" },
        { tool: "delete_all", reason: "no_contract" }
      ],
      calls_blocked: []
    },

    // Performance
    guard_overhead_ms: 0.4,

    // Narrowing details
    narrowing: {
      allowed: [{ name: "check_eligibility" }],
      removed: [{ tool: "issue_refund", reason: "wrong_phase" }]
    },

    // Shadow mode only
    shadow_delta: null // ShadowDelta | undefined
  }
}
Counterfactual capture
The counterfactual field records what was prevented and why — not just what happened. This is unique to Vesanor:
- tools_removed: tools the LLM never saw (narrowing)
- calls_blocked: tool calls the LLM proposed that were blocked (gating)
Workflow value: Counterfactual capture shows what Replay removed or blocked and why. It is useful for review, debugging, and explaining governed workflow decisions, but it is not by itself proof of final real-world side effects.
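For review or debugging, the counterfactual field can be flattened into readable lines. This is a minimal sketch: `summarizeCounterfactual` is a hypothetical helper, and the shape of `calls_blocked` entries is assumed to match `tools_removed` because the example record above only shows an empty array.

```typescript
interface RemovedTool {
  tool: string;
  reason: string;
}

// Assumption: blocked calls carry the same { tool, reason } shape.
interface BlockedCall {
  tool: string;
  reason: string;
}

interface Counterfactual {
  tools_removed: RemovedTool[];
  calls_blocked: BlockedCall[];
}

// One line per prevented action, narrowing first and gating second.
function summarizeCounterfactual(cf: Counterfactual): string[] {
  return [
    ...cf.tools_removed.map((t) => `removed ${t.tool} (${t.reason})`),
    ...cf.calls_blocked.map((c) => `blocked ${c.tool} (${c.reason})`),
  ];
}
```

Applied to the example capture above, this yields one line for `issue_refund` and one for `delete_all`, both from narrowing.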
Bypass detection
In Govern mode, the SDK detects direct calls to the original client:
// This triggers bypass detection
const response = await originalClient.chat.completions.create({
  model: "gpt-4o-mini",
  messages,
  tools,
});
When detected:
- A replay_bypass_detected diagnostic event is emitted
- Session is marked compromised on the server (via the reportBypass endpoint)
- Future authoritative writes are rejected for this session
- The call still goes through (TypeScript can't revoke object references)
Bypass is authority revocation, not prevention. The SDK can't prevent you from using the original client. What it can do is mark the session as compromised so it can no longer make authoritative claims.
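The revocation behavior can be modeled in a few lines. This sketch keeps everything in one in-memory class for illustration, whereas the text above describes the compromised flag living on the server; `BypassSessionSketch`, its `commit` method, and the `events` array are hypothetical names, not SDK API.

```typescript
interface CommitResult {
  accepted: boolean;
  state_version: number;
}

// Illustrative in-memory model; the real flag is held server-side.
class BypassSessionSketch {
  private compromised = false;
  private stateVersion = 0;
  readonly events: string[] = [];

  // Called when a direct call to the original client is observed.
  reportBypass(): void {
    this.compromised = true;
    this.events.push("replay_bypass_detected");
  }

  // Authoritative writes are refused once the session is compromised.
  commit(): CommitResult {
    if (this.compromised) {
      return { accepted: false, state_version: this.stateVersion };
    }
    this.stateVersion += 1;
    return { accepted: true, state_version: this.stateVersion };
  }
}
```

Nothing in the sketch stops the bypassing call itself; it only ensures that later commits from the same session are rejected.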
Compliance relevance
| Requirement | Vesanor feature |
|---|---|
| EU AI Act Article 14 — human oversight during use | session.kill() intervention capability |
| EU AI Act Article 12 — automatic event recording | Capture pipeline with full decision context |
| EU AI Act Article 19 — 6-month minimum log retention | Server-side capture storage with configurable retention |
| GDPR Article 22 — meaningful information about automated decisions | Counterfactual capture (what was prevented and why) |
| SOC 2 — privileged actions attributable to individuals | principal identity + governed execution receipts |
| NIST AI Agent Standards — runtime policy enforcement | Deterministic contract evaluation, not LLM-based |
These are ways Replay artifacts can support a broader control environment. They are not a claim that Replay alone satisfies any compliance regime.
Next steps
- Govern Mode — server-backed governed sessions with durable evidence
- Contract Cookbook — evidence_class and commit_requirement explained
- API Reference — capture format details