What is Vesanor?
Vesanor is an LLM function-calling reliability platform. It validates, monitors, and governs the tool calls your AI agents make — across providers like OpenAI and Anthropic — so that when a model silently changes behavior, your app doesn't break.
The problem
LLM-powered agents work by calling tools: looking up customers, issuing refunds, deploying services, querying databases. These tool calls are your agent's actions in the real world — and they're the most dangerous part of your AI stack.
The problem is that models change without warning:
- Silent model updates — your provider ships a new version and your agent starts skipping a required tool, or calling tools in the wrong order
- Cross-provider drift — you switch from OpenAI to Anthropic and discover they format arguments differently, or make different tool choices for the same prompt
- No regression tests — traditional unit tests don't cover LLM behavior. You can't write `expect(model.response).toBe(...)` because responses are non-deterministic
- Runtime surprises — your agent calls `delete_account` when it should have called `suspend_account`, and you only find out from a customer complaint
There's no way to know if your agent's tool-calling behavior is correct, consistent, or safe — unless you have a system that checks every call against a contract.
What Vesanor does
Vesanor gives you contracts — declarative YAML rules that define what your agent should do. Then it enforces them at two levels:
CI-time validation
Catch regressions before they ship:
- Auto-generate contracts from real traffic — wrap your client with `observe()` and Vesanor builds contracts from what the model actually does
- Run contracts in CI — deterministic, offline, free. Recorded fixtures replay in milliseconds with no API calls
- Two-lane CI — Lane A blocks merges (recorded fixtures, deterministic). Lane B runs live models for advisory evidence
- Track every failure — each failure gets a fingerprint that stays the same when the same issue recurs. See trends, not noise
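For a sense of what a contract looks like, here is a hypothetical sketch. The field names (`name`, `rules`, `require_order`, `forbid`, `args`) are invented for this page and are not Vesanor's documented schema; they only illustrate the kind of declarative assertions described above.

```yaml
# Hypothetical sketch: field names are illustrative,
# not Vesanor's documented contract schema.
name: refund_flow
model: gpt-4o
rules:
  - require_order: [verify_identity, issue_refund]   # ordering constraint
  - forbid: delete_account                           # must never be called
  - tool: issue_refund
    args:
      amount: { max: 500 }                           # argument bound
```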
Runtime workflow governance
Control what your agent can do on governed workflow paths in production:
- Block illegal tool calls before they execute — not after
- Phase-based state machine — the model only sees tools valid in its current workflow phase
- Preconditions — "you must call `verify_identity` before `issue_refund`"
- Session limits — cost caps, call limits, loop detection, circuit breakers
- Kill switch — immediately halt any session
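The mechanics above can be sketched in a few lines. The `Session` class, phase names, and tool names below are invented for this page; this is an illustration of phase-gated tools, preconditions, and a call limit, not Vesanor's implementation or API.

```typescript
// Illustrative sketch of phase-based governance (not Vesanor's code).
type Phase = "triage" | "verified" | "resolution";

// Each phase exposes only the tools valid in it.
const allowedTools: Record<Phase, string[]> = {
  triage: ["lookup_customer", "verify_identity"],
  verified: ["check_eligibility", "issue_refund"],
  resolution: ["close_ticket"],
};

// Preconditions: a tool that must have been called earlier in the session.
const preconditions: Record<string, string> = {
  issue_refund: "verify_identity",
};

class Session {
  phase: Phase = "triage";
  history: string[] = [];
  calls = 0;
  readonly maxCalls = 20; // session call limit (circuit breaker)

  // Runs BEFORE the tool executes; throws to block an illegal call.
  guard(tool: string): void {
    if (++this.calls > this.maxCalls) throw new Error("call limit exceeded");
    if (!allowedTools[this.phase].includes(tool))
      throw new Error(`"${tool}" not allowed in phase "${this.phase}"`);
    const pre = preconditions[tool];
    if (pre && !this.history.includes(pre))
      throw new Error(`"${tool}" requires "${pre}" first`);
    this.history.push(tool);
    if (tool === "verify_identity") this.phase = "verified"; // phase transition
  }
}
```

The key design point is that the guard runs before execution: an illegal `issue_refund` is rejected at the gate rather than detected in an audit log afterwards.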
Who is Vesanor for?
Engineering teams shipping LLM agents to production. If your AI calls tools — APIs, databases, external services — and you need confidence that it does the right thing reliably, Vesanor is for you.
Common use cases:
- Support agents that look up customers, check eligibility, and issue refunds
- DevOps agents that deploy services, check health, and create incident tickets
- Compliance agents that classify documents, run checks, and generate reports
- Security agents that triage alerts, scan systems, and isolate threats
If your agent's tool calls have real-world consequences and follow repeatable workflows, you need contracts.
What makes Vesanor different
Contracts, not vibes
Most LLM testing is qualitative — "does this response look right?" Vesanor tests are deterministic and structural: did the model call the right tools, with the right arguments, in the right order? Contracts are YAML files with precise assertions. No fuzzy matching, no subjective scoring.
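To illustrate what "structural" means here, an ordering assertion reduces to a deterministic subsequence check over the recorded tool calls. The helper below is written for this page, not part of Vesanor:

```typescript
// Illustrative structural check: were the expected tools called,
// in the expected relative order? Deterministic, no fuzzy matching.
interface ToolCall {
  name: string;
  args: Record<string, unknown>;
}

// True if `expected` appears as a subsequence of the recorded calls.
function calledInOrder(calls: ToolCall[], expected: string[]): boolean {
  let i = 0;
  for (const call of calls) {
    if (call.name === expected[i]) i++;
  }
  return i === expected.length;
}
```

Given the same recorded calls, the check always returns the same verdict, which is what makes it usable as a merge gate.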
Observe, then enforce
You don't have to write contracts from scratch. `observe()` wraps your existing client, captures real tool calls, and auto-generates contracts on the dashboard. Review what was generated, tighten the rules, and promote to enforcement. The loop is: observe → promote → enforce.
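A minimal sketch of the passive-capture idea follows. The wrapper below is invented for this page and simplified to a synchronous call; it is not the real `observe()` implementation, only an illustration of "wrap, record, never interfere":

```typescript
// Illustrative passive-capture wrapper (not the real observe()).
interface ToolCall {
  name: string;
  args: Record<string, unknown>;
}
interface ChatResult {
  toolCalls: ToolCall[];
}
type ChatFn = (prompt: string) => ChatResult;

// Returns a drop-in replacement for `chat` that records every tool
// call it sees. Capture is passive: nothing is blocked or altered.
function observe(chat: ChatFn, log: ToolCall[]): ChatFn {
  return (prompt) => {
    const result = chat(prompt);
    log.push(...result.toolCalls); // record for contract generation
    return result;
  };
}
```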
Provider-agnostic
Write contracts once, run them against OpenAI, Anthropic, or any future provider. The same YAML, the same assertions, the same golden fixtures. Vesanor normalizes provider-specific response formats so your tests are portable.
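To illustrate the normalization step: OpenAI returns tool calls whose `arguments` field is a JSON string, while Anthropic returns `tool_use` content blocks whose `input` is already an object. A sketch of mapping both into one shape (the `Normalized` type and function names are invented for this example):

```typescript
// Illustrative provider normalization. The input shapes follow each
// provider's public response format; the common shape is invented here.
interface Normalized {
  name: string;
  args: Record<string, unknown>;
}

// OpenAI: message.tool_calls[].function.{name, arguments: JSON string}
interface OpenAIToolCall {
  function: { name: string; arguments: string };
}

// Anthropic: content blocks of type "tool_use" with {name, input: object}
interface AnthropicBlock {
  type: string;
  name?: string;
  input?: Record<string, unknown>;
}

function fromOpenAI(calls: OpenAIToolCall[]): Normalized[] {
  return calls.map((c) => ({
    name: c.function.name,
    args: JSON.parse(c.function.arguments), // arguments arrive as a string
  }));
}

function fromAnthropic(blocks: AnthropicBlock[]): Normalized[] {
  return blocks
    .filter((b) => b.type === "tool_use") // ignore text blocks
    .map((b) => ({ name: b.name!, args: b.input! }));
}
```

Once both providers reduce to the same shape, a single set of assertions (and a single set of golden fixtures) covers both.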
Two-lane CI
Your merge gate runs against recorded fixtures — deterministic, offline, zero API cost. Live model testing runs in a separate advisory lane that never blocks merges. You get safety and evidence without coupling your CI to third-party API availability.
Runtime workflow governance
CI catches regressions. `replay()` catches runtime surprises. It wraps your LLM client and enforces contracts on every call — blocking illegal tool calls, enforcing workflow phases, and providing a kill switch for runaway agents. Three protection levels: Monitor (observe), Protect (enforce locally), Govern (durable server-backed state and evidence).
Fingerprinted failures
Every failure gets a short hash (fingerprint) that stays stable when the same issue recurs. Instead of a wall of logs, you see unique failure patterns with trend lines. Regressions are immediately obvious.
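The idea can be sketched as hashing only the stable shape of a failure while dropping volatile fields such as timestamps or request IDs, so recurrences collapse to one fingerprint. The field names and hash construction below are an illustration of the concept, not Vesanor's actual algorithm:

```typescript
import { createHash } from "node:crypto";

// Illustrative failure fingerprint (not Vesanor's actual algorithm).
interface Failure {
  contract: string;
  rule: string;
  tool: string;
  timestamp: string; // volatile: excluded from the fingerprint
}

// Hash only the fields that identify the failure pattern, so the
// same issue recurring on different days yields the same short hash.
function fingerprint(f: Failure): string {
  const stable = `${f.contract}|${f.rule}|${f.tool}`;
  return createHash("sha256").update(stable).digest("hex").slice(0, 12);
}
```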
How the pieces fit together
            Your LLM Agent
                   |
        +----------+-----------+
        |                      |
    observe()              replay()
(passive capture)   (workflow governance)
        |                      |
        v                      v
  Auto-generate        Enforce contracts
    contracts            on every call
        |                      |
        +----------+-----------+
                   |
               Dashboard
      (review, promote, monitor)
                   |
        +----------+-----------+
        |                      |
     CI Gate               Baselines
(recorded fixtures,    (drift detection,
  merge-blocking)        fingerprints)
- Observe — wrap your client with `observe()` or run `vesanor observe` to capture tool calls and auto-generate contracts
- Review — inspect auto-generated contracts on the dashboard, tweak assertions, check coverage
- Promote — turn draft contracts into enforced truth contracts
- Test in CI — run contracts against recorded fixtures (Lane A, merge-blocking) and live models (Lane B, advisory)
- Govern at runtime — wrap your client with `replay()` to block illegal tool calls before they execute
- Monitor — track baselines, detect drift, see failure trends on the dashboard
Next steps
Get started fast:
- Quickstart — first test in 60 seconds, or wrap your real app in 5 minutes
- How Vesanor Works — key concepts before diving into code
Go deeper:
- SDK Integration — `observe()`, `validate()`, and health monitoring
- Replay Quickstart — workflow governance in 5 minutes
- CI Integration — add to your pipeline