Workflow Governance

When a system uses multiple agents that hand off work to each other, single-session governance is not enough. Workflow governance coordinates multiple sessions under one durable workflow_id with explicit handoffs, shared resource protection, and cross-session budget limits.

This is an advanced runtime capability. It is not the primary public onboarding path, but it is part of the current repo.

Workflow governance rides on the Govern runtime path. In practice that means you need:

  • a Vesanor apiKey
  • wrapped state-bearing tools so the session can become govern-level
  • for a root workflow session, a compiled workflow artifact from workflow.yaml (discovered from contractsDir or passed with workflowYamlPath)

When you need workflow governance

Single-session replay() is sufficient when:

  • one agent handles the entire task
  • there is no delegation to other agents
  • there are no shared resources between concurrent processes

Workflow governance matters when:

  • an orchestrator delegates work to specialist agents
  • multiple agents might act on the same resource
  • you need to kill an entire agent tree at once
  • cross-agent budgets matter

The core idea

Key principles:

  • the session remains the unit of authority
  • the workflow coordinates handoffs, resources, and shared budgets
  • workflow governance does not merge mutable state from multiple agents into one giant shared session

That distinction is what keeps workflow coordination bounded and explainable.


Root and child sessions

Root session

The orchestrator starts a workflow by creating a root session. Root creation is the advanced path because the SDK needs a compiled workflow artifact:

const workflowId = "wf_pr42_review";

const session = replay(client, {
  agent: "orchestrator",
  mode: "enforce",
  apiKey: process.env.VESANOR_API_KEY,
  contractsDir: "./governance/orchestrator",
  tools: {
    review_pull_request: reviewPullRequest,
  },
  workflow: {
    type: "root",
    workflowId,
    role: "orchestrator",
  },
});

If no workflow.yaml can be discovered or loaded for a root session, the SDK does not attach the session to a durable workflow.

Handoff

After completing some work, the parent offers a handoff:

const parentSessionId = session.getState().sessionId;

const ticket = await session.handoff({
  toRole: "code-scanner",
  handoffId: "handoff-pr42-scan",
  summary: { task: "Review PR #42", priority: "high" },
});

session.handoff() returns the handoff identifier and sequencing metadata. Your orchestration layer should already know the workflowId and parent session ID it is coordinating.

Child session

A child process claims the handoff by attaching to the workflow:

const childSession = replay(childClient, {
  agent: "code-scanner",
  mode: "enforce",
  apiKey: process.env.VESANOR_API_KEY,
  contractsDir: "./governance/code-scanner",
  tools: {
    scan_diff: scanDiff,
  },
  workflow: {
    type: "child",
    workflowId,
    role: "code-scanner",
    parentSessionId,
    handoffId: ticket.handoffId,
  },
});

Single-claim semantics still apply: once one child claims a handoff, competing claims fail.
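
One way to picture the single-claim rule is as a compare-and-set on the handoff's owner. The sketch below is a minimal in-memory model, not the SDK's or control plane's implementation; `HandoffRegistry`, `offer`, `claim`, and `ownerOf` are hypothetical names introduced only for illustration.

```typescript
// Illustrative only: an in-memory model of single-claim semantics.
// HandoffRegistry and its methods are hypothetical, not part of the SDK.
class HandoffRegistry {
  // handoffId -> session ID of the claiming child, or null while unclaimed
  private owners = new Map<string, string | null>();

  offer(handoffId: string): void {
    if (this.owners.has(handoffId)) {
      throw new Error(`handoff ${handoffId} already offered`);
    }
    this.owners.set(handoffId, null);
  }

  // Returns true for the first claimant and false for every later one.
  claim(handoffId: string, childSessionId: string): boolean {
    if (!this.owners.has(handoffId)) return false; // never offered
    if (this.owners.get(handoffId) !== null) return false; // already claimed
    this.owners.set(handoffId, childSessionId);
    return true;
  }

  ownerOf(handoffId: string): string | null {
    return this.owners.get(handoffId) ?? null;
  }
}
```

In this model the second of two competing children simply gets `false` back. How the real runtime reports a failed claim is not shown here.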


Handoff lifecycle

offered -> claimed -> in_progress -> completed

Meaning:

  • offered — parent offered the handoff
  • claimed — a child attached and took ownership
  • in_progress — the child produced its first authoritative committed step
  • completed — the child finished

If a child claims a handoff but does not make progress, the handoff can be reclaimed and re-offered.

Reclaim fails after progress. Once the child has made authoritative progress, the child owns that handoff path.
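
The lifecycle and the reclaim rule can be read as a small state machine. The following is a sketch under stated assumptions rather than the runtime's implementation: `advance` and `reclaim` are hypothetical helpers, and reclaim plus re-offer are collapsed into a single step back to `offered` for brevity.

```typescript
// Illustrative model of the handoff lifecycle; not the runtime's implementation.
type HandoffStatus = "offered" | "claimed" | "in_progress" | "completed";

// Forward transitions taken from the lifecycle above.
const NEXT: Record<HandoffStatus, HandoffStatus | null> = {
  offered: "claimed",
  claimed: "in_progress",
  in_progress: "completed",
  completed: null,
};

// Move a handoff one step forward, or throw if it is already completed.
function advance(status: HandoffStatus): HandoffStatus {
  const next = NEXT[status];
  if (next === null) throw new Error(`no transition out of ${status}`);
  return next;
}

// Reclaim is only valid while the child has claimed but not yet committed a step.
// Assumption: reclaim and re-offer are modeled as one move back to "offered".
function reclaim(status: HandoffStatus): HandoffStatus {
  if (status !== "claimed") {
    throw new Error(`cannot reclaim a handoff in state ${status}`);
  }
  return "offered";
}
```

The only backward edge is `claimed` to `offered`, which matches the rule that the child owns the handoff path once it reaches `in_progress`.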


Shared resources

Shared resources prevent conflicts when multiple workflow sessions act on the same entity.

Supported coordination modes in the current runtime model:

  • exclusive_pending
  • single_writer
  • serial_only

Examples of the conflicts this prevents:

  • one deployment environment should not have two unresolved deployment steps at once
  • one change request should not have two concurrent mutating owners
  • one migration should not have two authoritative commits racing each other

Workflow resource checks are evaluated against compiled workflow/session governance artifacts on the control plane.
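
As a rough illustration of the first example, a guard that admits at most one unresolved step per resource could look like the sketch below. This is one reading of the idea, not the control plane's logic, and it does not define the exact semantics of `exclusive_pending`, `single_writer`, or `serial_only`. `ResourceGuard`, `tryBegin`, and `resolve` are hypothetical names.

```typescript
// Illustrative only: allows at most one unresolved step per resource.
// ResourceGuard and its methods are hypothetical names, not SDK APIs.
class ResourceGuard {
  // resourceId -> session ID that holds the unresolved step
  private pending = new Map<string, string>();

  // Admit a step only if the resource has no unresolved step.
  tryBegin(resourceId: string, sessionId: string): boolean {
    if (this.pending.has(resourceId)) return false;
    this.pending.set(resourceId, sessionId);
    return true;
  }

  // Only the holding session can resolve its step and free the resource.
  resolve(resourceId: string, sessionId: string): boolean {
    if (this.pending.get(resourceId) !== sessionId) return false;
    this.pending.delete(resourceId);
    return true;
  }
}
```

A second session that targets the same deployment environment is refused until the first session's step resolves.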


Workflow limits

Workflow budgets apply across all sessions in the workflow, not just one session.

Examples of workflow-level limits:

  • maximum session count
  • maximum active session count
  • maximum total step count
  • maximum total cost
  • maximum open handoff count

When a limit is exceeded, the workflow path is blocked before more work is admitted.
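
An admission check across these limits might be sketched as follows. The field names are hypothetical and are not the workflow.yaml schema; the point is only that usage is aggregated across the whole workflow and compared against each limit before new work is admitted.

```typescript
// Illustrative only: field names are hypothetical, not the workflow.yaml schema.
interface WorkflowLimits {
  maxSessions: number;
  maxActiveSessions: number;
  maxTotalSteps: number;
  maxTotalCost: number;
  maxOpenHandoffs: number;
}

// Usage aggregated across every session in the workflow.
interface WorkflowUsage {
  sessions: number;
  activeSessions: number;
  totalSteps: number;
  totalCost: number;
  openHandoffs: number;
}

// Returns the name of every exceeded limit; an empty array means work can be admitted.
function exceededLimits(usage: WorkflowUsage, limits: WorkflowLimits): string[] {
  const checks: Array<[string, number, number]> = [
    ["maxSessions", usage.sessions, limits.maxSessions],
    ["maxActiveSessions", usage.activeSessions, limits.maxActiveSessions],
    ["maxTotalSteps", usage.totalSteps, limits.maxTotalSteps],
    ["maxTotalCost", usage.totalCost, limits.maxTotalCost],
    ["maxOpenHandoffs", usage.openHandoffs, limits.maxOpenHandoffs],
  ];
  return checks.filter(([, used, max]) => used > max).map(([name]) => name);
}
```

In this sketch the caller passes projected usage (current usage plus the work about to be admitted), so a non-empty result blocks the request before it runs.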


Kill cascade

Workflow governance adds kill semantics above the session level.

Session kill

Kills one session.

Subtree kill

Kills one session and its descendants.

Workflow kill

Kills every active session in the workflow and rejects future handoff claims.

This is durable control-plane state, not a best-effort UI hint.
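
The three kill scopes can be modeled on a simple parent/child session tree. This is an in-memory sketch for intuition only; the real state is durable and lives on the control plane. `WorkflowTree` and all of its method names are hypothetical.

```typescript
// Illustrative only: an in-memory session tree with the three kill scopes.
interface SessionNode {
  id: string;
  parentId: string | null;
  killed: boolean;
}

class WorkflowTree {
  private sessions = new Map<string, SessionNode>();
  private workflowKilled = false;

  add(id: string, parentId: string | null = null): void {
    this.sessions.set(id, { id, parentId, killed: false });
  }

  // Session kill: one session only.
  killSession(id: string): void {
    const node = this.sessions.get(id);
    if (node) node.killed = true;
  }

  // Subtree kill: the session and every descendant.
  killSubtree(id: string): void {
    this.killSession(id);
    this.sessions.forEach((node) => {
      if (node.parentId === id) this.killSubtree(node.id);
    });
  }

  // Workflow kill: every session, and no further handoff claims.
  killWorkflow(): void {
    this.workflowKilled = true;
    this.sessions.forEach((node) => {
      node.killed = true;
    });
  }

  canClaimHandoff(): boolean {
    return !this.workflowKilled;
  }

  isKilled(id: string): boolean {
    return this.sessions.get(id)?.killed ?? false;
  }
}
```

Killing the subtree rooted at a specialist leaves the orchestrator and its other children running, while a workflow kill also closes the door on late claimants.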


What child sessions inherit

A child session starts with its own mutable session state. Coordination is explicit.

Dependencies flow through:

  • handoff summaries
  • artifact references
  • workflow-scoped resource bindings

The important boundary is that workflow governance coordinates sessions. It does not turn multiple agents into one shared mutable session.


Runtime surfaces

The runtime exposes workflow state and handoff coordination through:

  • session.getWorkflowState()
  • session.handoff(...)
  • GET /api/v1/replay/workflows/:workflow_id
  • POST /api/v1/replay/sessions/:session_id/handoffs
  • POST /api/v1/replay/workflows/:workflow_id/handoffs/:handoff_id/reclaim
  • POST /api/v1/replay/workflows/:workflow_id/handoffs/:handoff_id/complete
  • POST /api/v1/replay/workflows/:workflow_id/kill

These are the current repo surfaces that matter most for workflow coordination.
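
For callers that talk to the HTTP routes directly, a small helper that fills in the path parameters can cut down on typos. The sketch below only builds method and path pairs from the routes listed above; `workflowRoutes` and its property names are hypothetical, and the base URL, authentication, and request bodies are deliberately left out because they are not described here.

```typescript
// Illustrative only: builds method/path pairs for the listed routes.
// Helper names are hypothetical; only the methods and paths come from the list.
type Method = "GET" | "POST";

interface Route {
  method: Method;
  path: string;
}

const enc = encodeURIComponent;
const base = "/api/v1/replay";

const workflowRoutes = {
  getWorkflow: (workflowId: string): Route => ({
    method: "GET",
    path: `${base}/workflows/${enc(workflowId)}`,
  }),
  postSessionHandoff: (sessionId: string): Route => ({
    method: "POST",
    path: `${base}/sessions/${enc(sessionId)}/handoffs`,
  }),
  reclaimHandoff: (workflowId: string, handoffId: string): Route => ({
    method: "POST",
    path: `${base}/workflows/${enc(workflowId)}/handoffs/${enc(handoffId)}/reclaim`,
  }),
  completeHandoff: (workflowId: string, handoffId: string): Route => ({
    method: "POST",
    path: `${base}/workflows/${enc(workflowId)}/handoffs/${enc(handoffId)}/complete`,
  }),
  killWorkflow: (workflowId: string): Route => ({
    method: "POST",
    path: `${base}/workflows/${enc(workflowId)}/kill`,
  }),
};
```

Encoding each identifier keeps an unexpected character in a workflow or handoff ID from changing the shape of the path.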