Shadow Pilot Program

Find structural loops and wasted agent compute before teams repeat expensive runs.

CAUM runs beside your agents, groups work by business task, and produces an observation-only report covering token waste, cost at risk, exact structural cycles, and vault evidence, all with zero semantic access.

The demo kit below was generated through the live CAUM production flow on April 29, 2026. The data is structural telemetry only: event types, tools, phases, counters, costs, latency, and structural hashes.

5/5 live demo reports completed
1.38M token-counted units observed
$2.0235 structural cost at risk
$0.7082 conservative savings signal

Five workflows. Same shadow observer.

Each report includes a PDF and a vault. The vault contains the Merkle manifest, canonical input, structural cycles, Pilot Meter artifact, and verification guide.

Developer Agent Code Review

Steps: 200
Alerted tasks: 1/2
Cost at risk: $0.4135
Savings signal: $0.1447

Support Ticket Agent

Steps: 190
Alerted tasks: 1/2
Cost at risk: $0.3129
Savings signal: $0.1095

RAG Document QA

Steps: 204
Alerted tasks: 1/2
Cost at risk: $0.5390
Savings signal: $0.1887

Report Generation Agent

Steps: 200
Alerted tasks: 1/2
Cost at risk: $0.4350
Savings signal: $0.1522

Workflow Automation Agent

Steps: 192
Alerted tasks: 1/2
Cost at risk: $0.3231
Savings signal: $0.1131
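
The headline totals can be cross-checked against the five workflow cards with plain arithmetic. The sketch below only re-adds the published figures; the dictionary layout and the names REPORTS and totals are illustrative assumptions, not part of any CAUM artifact. The final ratio is an observation about these numbers, not a documented CAUM formula.

```python
from decimal import Decimal

# Per-workflow figures copied from the five demo cards.
# Dollar amounts are kept as strings so Decimal adds them exactly.
REPORTS = {
    "Developer Agent Code Review": {"steps": 200, "cost_at_risk": "0.4135", "savings": "0.1447"},
    "Support Ticket Agent": {"steps": 190, "cost_at_risk": "0.3129", "savings": "0.1095"},
    "RAG Document QA": {"steps": 204, "cost_at_risk": "0.5390", "savings": "0.1887"},
    "Report Generation Agent": {"steps": 200, "cost_at_risk": "0.4350", "savings": "0.1522"},
    "Workflow Automation Agent": {"steps": 192, "cost_at_risk": "0.3231", "savings": "0.1131"},
}


def totals(reports):
    """Return (total steps, total cost at risk, total savings signal)."""
    steps = sum(r["steps"] for r in reports.values())
    cost = sum((Decimal(r["cost_at_risk"]) for r in reports.values()), Decimal("0"))
    savings = sum((Decimal(r["savings"]) for r in reports.values()), Decimal("0"))
    return steps, cost, savings


steps, cost, savings = totals(REPORTS)
# Share of cost at risk that the savings signal represents, rounded to 2 places.
ratio = (savings / cost).quantize(Decimal("0.01"))

print(steps, cost, savings, ratio)  # 986 2.0235 0.7082 0.35
```

The sums match the $2.0235 and $0.7082 tiles, and the savings signal comes to roughly 35% of the cost at risk across the five demo workflows.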

A pilot should prove money, not just telemetry.

CAUM is strongest when every event belongs to a business task: ticket, pull request, document, report, lead, or background job.

01. Collect structure

Send task IDs, event types, tool names, phases, status, tokens, latency, cost counters, and optional structural hashes.

02. Run shadow mode

Observe without blocking agents. CAUM separates core work from telemetry and control-plane noise.

03. Review alerts

Find exact structural cycles, repeated deadwork, burn without progress, and task-level cost exposure.
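
To make "exact structural cycle" concrete, here is a minimal sketch under one assumed definition: a block of consecutive structural hashes that repeats back to back, unchanged, at least a minimum number of times. This is not CAUM's detection logic, which the page does not describe; the function name find_exact_cycles and its parameters are hypothetical.

```python
def find_exact_cycles(hashes, max_period=8, min_repeats=3):
    """Scan a list of per-step structural hashes for back-to-back repeats.

    Returns a list of (start_index, period, repeats) tuples, where the block
    hashes[start:start + period] occurs `repeats` times in a row.
    """
    cycles = []
    n = len(hashes)
    i = 0
    while i < n:
        best = None
        for period in range(1, max_period + 1):
            # Not enough steps left to hold min_repeats copies of this period.
            if i + period * min_repeats > n:
                break
            block = hashes[i:i + period]
            repeats = 1
            # Count how many times the block repeats immediately after itself.
            while hashes[i + repeats * period:i + (repeats + 1) * period] == block:
                repeats += 1
            covered = period * repeats
            if repeats >= min_repeats and (best is None or covered > best[1] * best[2]):
                best = (i, period, repeats)
        if best is not None:
            cycles.append(best)
            i += best[1] * best[2]  # skip past the whole repeated run
        else:
            i += 1
    return cycles


# Hypothetical trace: each string stands in for one step's structural hash.
trace = ["plan", "read", "test", "read", "test", "read", "test", "write"]
print(find_exact_cycles(trace))  # [(1, 2, 3)]
```

In this example the two-step block "read", "test" runs three times in a row starting at step 1, so four of the eight steps repeat work already done once. Only hashes are compared, so no prompt or payload content is needed to find the loop.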

04. Compare weeks

Track cost per task, p95 cost, completion rate, quality pass rate, alert rate, and savings opportunity.
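
A week-over-week comparison of these metrics could be computed from per-task summaries along the following lines. This is a sketch, not CAUM's report format: the TaskSummary fields and the week_metrics name are hypothetical, p95 uses the nearest-rank method, and rates are taken over all tasks in the week. Savings opportunity is left out because the page does not define how it is derived.

```python
from dataclasses import dataclass
from statistics import fmean


@dataclass
class TaskSummary:
    # Hypothetical per-task record; field names are assumptions.
    task_id: str
    cost_usd: float
    completed: bool
    quality_passed: bool
    alerted: bool


def week_metrics(tasks):
    """Aggregate one week of task summaries into the tracked metrics."""
    if not tasks:
        raise ValueError("need at least one task")
    n = len(tasks)
    costs = sorted(t.cost_usd for t in tasks)
    # Nearest-rank p95: the value at rank ceil(0.95 * n), using integer math.
    rank = (95 * n + 99) // 100
    return {
        "cost_per_task": fmean(costs),
        "p95_cost": costs[rank - 1],
        "completion_rate": sum(t.completed for t in tasks) / n,
        "quality_pass_rate": sum(t.quality_passed for t in tasks) / n,
        "alert_rate": sum(t.alerted for t in tasks) / n,
    }


# Invented sample data for two weeks, for illustration only.
week_1 = [
    TaskSummary("t1", 0.10, True, True, False),
    TaskSummary("t2", 0.20, True, True, False),
    TaskSummary("t3", 0.30, True, False, False),
    TaskSummary("t4", 0.40, False, False, True),
]
week_2 = [
    TaskSummary("t5", 0.10, True, True, False),
    TaskSummary("t6", 0.20, True, True, False),
    TaskSummary("t7", 0.20, True, True, False),
    TaskSummary("t8", 0.30, True, False, False),
]

m1, m2 = week_metrics(week_1), week_metrics(week_2)
for name in m1:
    print(f"{name}: {m1[name]:.4f} -> {m2[name]:.4f}")
```

Reporting p95 next to the mean matters because a few looping tasks can dominate spend while the average cost per task still looks normal.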

What CAUM needs

  • Task or session identifier
  • Event type, tool, phase, and status
  • Token, cost, and latency counters when available
  • Optional state IDs or structural hashes
  • Enough tasks to compare normal work against alerted work

What CAUM does not need

  • No prompts
  • No completions
  • No source code
  • No customer messages
  • No private files or business payloads
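
One way to enforce the two lists above on the sending side is an allowlist applied before any event leaves your system. The sketch below is an illustration only: the field names, ALLOWED_FIELDS, REQUIRED_FIELDS, and to_structural_event are hypothetical and do not reflect a published CAUM schema or client API.

```python
# Structural fields an integration might forward (assumed names).
ALLOWED_FIELDS = frozenset({
    "task_id", "event_type", "tool", "phase", "status",
    "tokens", "cost_usd", "latency_ms", "state_id", "structural_hash",
})
# Fields assumed to be mandatory for grouping events by task.
REQUIRED_FIELDS = frozenset({"task_id", "event_type"})


def to_structural_event(raw):
    """Keep only allowlisted structural fields; drop everything else."""
    event = {key: value for key, value in raw.items() if key in ALLOWED_FIELDS}
    missing = REQUIRED_FIELDS - event.keys()
    if missing:
        raise ValueError(f"missing required fields: {sorted(missing)}")
    return event


# Hypothetical raw agent event that still carries semantic payloads.
raw_event = {
    "task_id": "ticket-1042",
    "event_type": "tool_call",
    "tool": "search",
    "phase": "retrieve",
    "status": "ok",
    "tokens": 812,
    "latency_ms": 430,
    "prompt": "customer text that must never leave the system",
    "completion": "model output that must never leave the system",
}

print(to_structural_event(raw_event))
```

An allowlist fails closed: a new payload field added to the agent later is dropped by default instead of being forwarded by accident.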

Start with 500 real tasks.

The clean first claim is simple: CAUM observed real agent work in shadow mode, flagged a small percentage of structurally wasteful tasks, and gave the team a measurable cost-reduction target.
