Evidence first

See CAUM catch a real structural loop.

Start here if you are new to CAUM. Inspect real reports and vaults, then run your own trace or start Live Meter. CAUM observes structure only: no semantic truth scoring, no hallucination claim, no agent blocking.

observe-only · zero-semantic strict mode · T4 review-only · T5 critical structural evidence
Latest Production Validation

Run: April 30, 2026 against the live Railway API.
Agents: 5 real non-destructive validation agents (website smoke, backend tests, JSON ingestion, privacy probe, and controlled retry loop).
Signal: Controlled retry loop reached T5 and emitted live loop events. Non-control T4 results remain review-only.
Boundary: Private canary leaked: false. allowed_to_block: false. semantic truth judged: false.
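
The boundary flags above lend themselves to an automated check. The following is a minimal sketch, not CAUM's actual report schema: it assumes the flags arrive as a flat dictionary, and the key names `private_canary_leaked` and `semantic_truth_judged` are hypothetical stand-ins for the labels shown above (only `allowed_to_block` appears verbatim on this page).

```python
# Hypothetical key names; only "allowed_to_block" is taken from the page itself.
REQUIRED_FALSE = (
    "private_canary_leaked",
    "allowed_to_block",
    "semantic_truth_judged",
)


def boundary_violations(boundary: dict) -> list:
    """Return the flags that are missing or not exactly False."""
    return [key for key in REQUIRED_FALSE if boundary.get(key) is not False]


# Illustrative sample mirroring the validation run described above.
sample = {
    "private_canary_leaked": False,
    "allowed_to_block": False,
    "semantic_truth_judged": False,
}

print(boundary_violations(sample))
```

Treating a missing flag as a violation is a deliberate choice in this sketch: an observe-only guarantee should be stated explicitly rather than inferred from an absent field.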

What CAUM can defend publicly.

These numbers are conservative. They are not a hallucination score, not a failure prediction, and not a public waste rate.

1,350

Public traces

Replay corpus used for conservative structural evidence and regression checks.

54,100

Structural events

Event shapes, counters, hashes, status, phases, and cycle evidence.

71

Hard alerts

5.3% of traces reached the conservative public hard-alert boundary.

20

Critical T5

1.5% of traces showed critical structural evidence.

285

Strong exact cycles

21.1% had exact-cycle coverage at or above 0.50.

$40.58

Direct exposed cost

Observed cost field total, not a claimed savings number.
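
The percentages above can be reproduced from the published counts. This short sketch uses only the figures on this page; the dictionary keys and the `share` helper are illustrative names, not part of CAUM.

```python
# Counts as published on this page.
TOTAL_TRACES = 1350

counts = {
    "hard_alerts": 71,
    "critical_t5": 20,
    "strong_exact_cycles": 285,
}


def share(count: int, total: int = TOTAL_TRACES) -> float:
    """Percentage of traces, rounded to one decimal place."""
    return round(100 * count / total, 1)


shares = {name: share(n) for name, n in counts.items()}

for name, pct in shares.items():
    print(f"{name}: {pct}%")
```

Each result is a share of the 1,350 public traces, not of the 54,100 structural events.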

Inspect the receipts before you trust the pitch.

These sample reports are the clearest way to understand the product: a PDF for human review and a vault for audit evidence.

Dev Agent Code Review

Code-agent trace with structural review evidence and vault package.

Support Ticket Agent

Customer-support-style workflow with structural cost exposure fields.

RAG Document QA

Retrieval workflow receipt without judging semantic correctness.

Workflow Automation Agent

Automation-style trace with task grouping and review artifacts.

Now run CAUM on your own agent work.

Use PDF Receipt for after-the-run evidence. Use Live Meter when you want structural monitoring while the agent runs.