
What is Vesanor?

Vesanor is an LLM function-calling reliability platform. It validates, monitors, and governs the tool calls your AI agents make — across providers like OpenAI and Anthropic — so that when a model silently changes behavior, your app doesn't break.


The problem

LLM-powered agents work by calling tools: looking up customers, issuing refunds, deploying services, querying databases. These tool calls are your agent's actions in the real world — and they're the most dangerous part of your AI stack.

The problem is that this behavior can change or fail without warning:

  • Silent model updates — your provider ships a new version and your agent starts skipping a required tool, or calling tools in the wrong order
  • Cross-provider drift — you switch from OpenAI to Anthropic and discover they format arguments differently, or make different tool choices for the same prompt
  • No regression tests — traditional unit tests don't cover LLM behavior. You can't write expect(model.response).toBe(...) because responses are non-deterministic
  • Runtime surprises — your agent calls delete_account when it should have called suspend_account, and you only find out from a customer complaint

There's no way to know if your agent's tool-calling behavior is correct, consistent, or safe — unless you have a system that checks every call against a contract.
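
To make "checking every call against a contract" concrete, here is a minimal sketch of a structural check on tool calls rather than on response text. Everything in it is an assumption made for illustration: the `ToolCall` and `Contract` shapes and the `checkContract` function are hypothetical names, not part of the Vesanor SDK.

```typescript
// Hypothetical types for illustration only; not the Vesanor SDK.
interface ToolCall {
  name: string;
  args: Record<string, unknown>;
}

interface Contract {
  requiredTools: string[];              // tools that must appear at least once
  forbiddenTools: string[];             // tools that must never appear
  mustPrecede: Array<[string, string]>; // [a, b]: a must be called before b
}

// Returns a list of violations; an empty list means the calls conform.
function checkContract(calls: ToolCall[], contract: Contract): string[] {
  const violations: string[] = [];
  const names = calls.map((c) => c.name);

  for (const tool of contract.requiredTools) {
    if (!names.includes(tool)) violations.push(`missing required tool: ${tool}`);
  }
  for (const tool of contract.forbiddenTools) {
    if (names.includes(tool)) violations.push(`forbidden tool called: ${tool}`);
  }
  for (const [first, second] of contract.mustPrecede) {
    const i = names.indexOf(first);
    const j = names.indexOf(second);
    // Violation only when `second` ran without `first` having run before it.
    if (j !== -1 && (i === -1 || i > j)) {
      violations.push(`${second} called before ${first}`);
    }
  }
  return violations;
}
```

Because the check looks only at which tools were called and in what order, it stays meaningful even when the model's wording differs from run to run.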


What Vesanor does

Vesanor gives you a Studio-first governance workflow for agent tool use.

At a high level, it does three things:

  • builds a governed view of how your agent is behaving
  • lets you review and approve that behavior in Governance Studio
  • enforces approved runtime behavior through replay()

Studio-first governance

For most users, the main path is:

  • replay(client, { apiKey })
  • Governance Studio
  • zero-config governance

With this path, you don't have to author and maintain large local YAML packs.

Runtime workflow governance

Control what your agent can do on governed workflow paths in production:

  • Block illegal tool calls before they execute — not after
  • Phase-based state machine — the model only sees tools valid in its current workflow phase
  • Preconditions — "you must call verify_identity before issue_refund"
  • Session limits — cost caps, call limits, loop detection, circuit breakers
  • Kill switch — immediately halt any session
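
The phase and precondition ideas above can be sketched as a small state machine. This is a simplified illustration under stated assumptions, not how Vesanor implements them: the `Phase` values, the `rules` table, and the `WorkflowSession` class are all hypothetical.

```typescript
// Hypothetical workflow definition; names and shape are assumptions.
type Phase = "identify" | "resolve" | "done";

interface ToolRule {
  phases: Phase[];     // phases in which the tool is visible to the model
  requires?: string[]; // tools that must already have been called
  next?: Phase;        // phase to enter after the call is authorized
}

const rules: Record<string, ToolRule> = {
  lookup_customer: { phases: ["identify"], next: "resolve" },
  verify_identity: { phases: ["identify", "resolve"] },
  issue_refund: { phases: ["resolve"], requires: ["verify_identity"], next: "done" },
};

class WorkflowSession {
  phase: Phase = "identify";
  private called = new Set<string>();

  // Tools the model would be offered in the current phase.
  visibleTools(): string[] {
    return Object.entries(rules)
      .filter(([, rule]) => rule.phases.includes(this.phase))
      .map(([tool]) => tool);
  }

  // Runs before execution and throws rather than letting an illegal call run.
  authorize(tool: string): void {
    const rule: ToolRule | undefined = rules[tool];
    if (!rule || !rule.phases.includes(this.phase)) {
      throw new Error(`blocked: ${tool} is not allowed in phase ${this.phase}`);
    }
    const missing = (rule.requires ?? []).filter((r) => !this.called.has(r));
    if (missing.length > 0) {
      throw new Error(`blocked: ${tool} requires ${missing.join(", ")}`);
    }
    this.called.add(tool);
    if (rule.next) this.phase = rule.next;
  }
}
```

In this sketch, `issue_refund` is rejected in the `identify` phase because it is not visible there, and rejected again in the `resolve` phase until `verify_identity` has been called.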

Operational visibility

Vesanor also gives you operational surfaces around governed behavior:

  • References / baselines — trust posture and drift
  • Runs — runtime and evaluation history
  • Fingerprinted failures — stable failure identities across repeated regressions

Who is Vesanor for?

Engineering teams shipping LLM agents to production. If your AI calls tools — APIs, databases, external services — and you need confidence that it does the right thing reliably, Vesanor is for you.

Common use cases:

  • Support agents that look up customers, check eligibility, and issue refunds
  • DevOps agents that deploy services, check health, and create incident tickets
  • Compliance agents that classify documents, run checks, and generate reports
  • Security agents that triage alerts, scan systems, and isolate threats

If your agent's tool calls have real-world consequences and follow repeatable workflows, you need contracts.


What makes Vesanor different

Contracts, not vibes

Most LLM testing is qualitative — "does this response look right?" Vesanor uses explicit structural rules around tools, phases, limits, and workflow state. Those rules may still compile down to contract-like runtime artifacts, but the primary product surface is Studio, not hand-authored YAML.

Studio-first review

You review behavior in Governance Studio: workspace drafts, judgments, impact preview, approval preview, and conformance findings. Approval then freezes the runtime artifact used by zero-config governance.

Provider-agnostic

Vesanor is designed to govern tool behavior across providers. The underlying enforcement model stays structured even when providers differ in request and response shape.
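
As an illustration of the shape differences involved, the sketch below maps tool calls from both providers into one neutral structure. The provider shapes follow the public APIs (OpenAI Chat Completions returns `tool_calls` whose `function.arguments` is a JSON string; Anthropic Messages returns `tool_use` content blocks whose `input` is an object). The `NormalizedToolCall` type and both function names are assumptions for this example and say nothing about Vesanor's internal model.

```typescript
// Provider-neutral shape (hypothetical; chosen for this example).
interface NormalizedToolCall {
  id: string;
  name: string;
  args: Record<string, unknown>;
}

// OpenAI Chat Completions: arguments arrive as a JSON-encoded string.
interface OpenAIToolCall {
  id: string;
  type: "function";
  function: { name: string; arguments: string };
}

// Anthropic Messages: tool calls are `tool_use` blocks with an object `input`.
type AnthropicBlock =
  | { type: "text"; text: string }
  | { type: "tool_use"; id: string; name: string; input: Record<string, unknown> };

function fromOpenAI(calls: OpenAIToolCall[]): NormalizedToolCall[] {
  return calls.map((c) => ({
    id: c.id,
    name: c.function.name,
    // Throws on malformed JSON; a real system would surface that as a failure.
    args: JSON.parse(c.function.arguments) as Record<string, unknown>,
  }));
}

function fromAnthropic(blocks: AnthropicBlock[]): NormalizedToolCall[] {
  const out: NormalizedToolCall[] = [];
  for (const b of blocks) {
    // Text blocks are skipped; only tool_use blocks represent tool calls.
    if (b.type === "tool_use") out.push({ id: b.id, name: b.name, args: b.input });
  }
  return out;
}
```

Once both providers produce the same structure, the same structural rules can be applied regardless of which model made the call.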

Two-lane CI

A local-pack CI path is still available in the repo, but it is no longer the primary workflow. The primary path is Studio-backed runtime governance plus the operational review surfaces.

Runtime workflow governance

replay() catches runtime surprises. It wraps your LLM client and enforces approved workflow behavior on every call: blocking illegal tool calls, enforcing workflow phases, and providing a kill switch for runaway agents. It offers three protection levels: Monitor, Protect, and Govern.
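
The general idea of wrapping a call path so that limits and a kill switch apply before anything executes can be sketched as follows. This is not the replay() implementation and does not model the Monitor, Protect, and Govern levels; `SessionGuard`, `Limits`, and `guarded` are hypothetical names used only to show the pattern.

```typescript
// Hypothetical guard; not the Vesanor replay() implementation.
interface Limits {
  maxCalls: number;   // total tool calls allowed in the session
  maxRepeats: number; // identical consecutive calls allowed (loop detection)
}

class SessionGuard {
  private limits: Limits;
  private calls = 0;
  private lastKey = "";
  private repeats = 0;
  private killed = false;

  constructor(limits: Limits) {
    this.limits = limits;
  }

  // Kill switch: every later check fails immediately.
  kill(): void {
    this.killed = true;
  }

  // Called before a tool executes; throws if the call must not proceed.
  check(tool: string, args: unknown): void {
    if (this.killed) throw new Error("session killed");
    if (this.calls >= this.limits.maxCalls) throw new Error("call limit reached");
    const key = `${tool}:${JSON.stringify(args)}`;
    this.repeats = key === this.lastKey ? this.repeats + 1 : 1;
    this.lastKey = key;
    if (this.repeats > this.limits.maxRepeats) {
      throw new Error(`loop detected on ${tool}`);
    }
    this.calls += 1;
  }
}

// Wraps a tool function so the guard runs before every invocation.
function guarded<A, R>(guard: SessionGuard, tool: string, fn: (args: A) => R): (args: A) => R {
  return (args: A) => {
    guard.check(tool, args);
    return fn(args);
  };
}
```

The design point is that the check happens before the tool body runs, so a blocked call never reaches the underlying API or database.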

Fingerprinted failures

Every failure gets a short hash (fingerprint) that stays stable when the same issue recurs. Instead of a wall of logs, you see unique failure patterns with trend lines. Regressions are immediately obvious.
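
One common way to get a stable short hash is to strip volatile details from a failure and hash what remains. The sketch below shows that general technique under stated assumptions; the `Failure` fields, the normalization rule, and the 12-character length are all hypothetical, and Vesanor's actual fingerprinting scheme may differ.

```typescript
import { createHash } from "node:crypto";

// Hypothetical failure record; field names are assumptions for this sketch.
interface Failure {
  tool: string;
  rule: string;      // e.g. "precondition" or "phase"
  message: string;   // may contain volatile details such as ids
  timestamp: string; // deliberately excluded from the fingerprint
}

// Replace digit runs so that recurring failures with different ids hash alike.
function normalize(message: string): string {
  return message.toLowerCase().replace(/\d+/g, "#").trim();
}

// Short, stable identity: same tool, rule, and normalized message give the same hash.
function fingerprint(f: Failure): string {
  const stable = [f.tool, f.rule, normalize(f.message)].join("|");
  return createHash("sha256").update(stable).digest("hex").slice(0, 12);
}
```

Two occurrences of the same failure at different times, or for different customer ids, collapse into one fingerprint, which is what makes counting and trending them possible.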


How the pieces fit together

                Your LLM Agent
                       |
                   replay()
                       |
           +-----------+-----------+
           |                       |
           v                       v
   Governance Studio      Runtime Enforcement
   (review, judgment,     (phases, limits,
   preview, approval)     blocking, kill)
                       |
                       v
             Operational Surfaces
      (runs, references, shadow, models)

  1. Integrate — wrap your client with replay()
  2. Review — inspect the workspace draft in Governance Studio
  3. Approve — freeze the runtime artifact for that workspace/environment
  4. Enforce — let the SDK attach to approved governance at runtime
  5. Operate — use Runs and References to monitor drift and regressions
