Documentation

Everything you need to observe your AI agents with CAUM.

Quick Start

Get CAUM observing your agents in under 60 seconds.

1. Install

pip install caum

2. Get your API key

Register at caum.systems/pilot or through the API:

curl -X POST https://caum-observation-production.up.railway.app/v1/register \
  -H "Content-Type: application/json" \
  -d '{"email": "you@company.com", "company": "Your Company"}'

3. Start observing

Option A: Zero-code wrapper (recommended)

Wrap your existing Anthropic or OpenAI client. Your agent runs exactly the same. CAUM observes automatically.

# Anthropic
import caum, anthropic

caum.init("caum_live_YOUR_KEY")
client = caum.wrap_anthropic(anthropic.Anthropic())

# Use client exactly as before. CAUM observes every tool call.
response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,  # required by the Anthropic Messages API
    messages=[...],
    tools=[...],
)

# OpenAI
import caum, openai

caum.init("caum_live_YOUR_KEY")
client = caum.wrap_openai(openai.OpenAI())

# Use client exactly as before.
response = client.chat.completions.create(
    model="gpt-5",
    messages=[...],
    tools=[...],
)

Option B: Manual observation

Call caum.observe() after each tool call in your agent:

import caum

caum.init("caum_live_YOUR_KEY")

caum.observe("session-1", tool="bash", content="pytest tests/")
caum.observe("session-1", tool="edit", content="Fixed the bug")

result = caum.end("session-1")
print(result["tier"])  # T1-T5

Option C: Raw API (curl)

curl -X POST https://caum-observation-production.up.railway.app/v1/step \
  -H "Authorization: Bearer caum_live_YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{"session_id": "task-1", "step": 1, "tool": "bash", "content": "pytest tests/"}'

Response:

{
  "regime": "EXPLORER",
  "failure_risk": 0.23,
  "uds": 0.77,
  "tier": "T2",
  "advisory": null,
  "merkle_root": "a7c3e1f9...",
  "alert_fired": false,
  "observation": "passive"
}
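
What you do with these fields is up to you. The sketch below shows one possible client-side policy. It is not part of CAUM: the helper name needs_attention and the default T3 threshold are assumptions for illustration, and only the field names come from the response above.

```python
# Hypothetical client-side policy; CAUM itself only observes and reports.
TIER_ORDER = ["T1", "T2", "T3", "T4", "T5"]  # T1 is best, T5 is worst


def needs_attention(step_response: dict, worst_ok_tier: str = "T3") -> bool:
    """Return True when a step is worse than the caller's chosen threshold."""
    tier = step_response.get("tier")
    if tier not in TIER_ORDER:
        return False  # unknown or missing tier: do not act on it
    too_low = TIER_ORDER.index(tier) > TIER_ORDER.index(worst_ok_tier)
    return too_low or bool(step_response.get("alert_fired"))


sample = {
    "regime": "EXPLORER",
    "failure_risk": 0.23,
    "uds": 0.77,
    "tier": "T2",
    "advisory": None,
    "alert_fired": False,
}
print(needs_attention(sample))  # False
```

A T2 step with no alert stays below the threshold, so the agent loop would simply continue.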

4. Finalize when done

result = caum.end("session-1", resolved=True)
# or via curl:
# POST /v1/session/session-1/end

CAUM is passive.

CAUM observes and notifies; it never controls agent execution. If CAUM goes down, your agent keeps running. Any action you take in response to an advisory is at your discretion.
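
To keep that property in your own integration, you can fail open around each observation call. The helper below is a minimal sketch: observe_safely is a hypothetical name, not part of the caum package. It accepts any callable, so it assumes nothing about how caum.observe() reports errors.

```python
import logging

logger = logging.getLogger("caum_client")


def observe_safely(observe_fn, *args, **kwargs):
    """Call an observation function; on any failure, log it and return None."""
    try:
        return observe_fn(*args, **kwargs)
    except Exception as exc:  # deliberate catch-all: observation must never break the agent
        logger.warning("CAUM observation skipped: %s", exc)
        return None


def failing_observe(session_id, tool, content):
    """Stand-in for an observation call made while the service is unreachable."""
    raise ConnectionError("observation service unreachable")


result = observe_safely(failing_observe, "session-1", tool="bash", content="pytest tests/")
print(result)  # None
```

In a real agent loop you would pass your actual observation function in place of the stand-in and treat a None result as "no assessment available".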

What CAUM detects

| Regime | Meaning | Your agent is... |
| --- | --- | --- |
| EXPLORER | Healthy | Using diverse tools, making progress |
| GRIND | Degraded | Repeating similar actions |
| LOOP | Critical | Stuck in a repetitive cycle |
| REASONING_LOOP | Critical | Thinking in circles, not acting |
| STAGNATION | Degraded | Same tool, same result, over and over |

API Reference

Base URL: https://caum-observation-production.up.railway.app

| Method | Endpoint | Auth | Description |
| --- | --- | --- | --- |
| POST | /v1/register | None | Create account, get API key |
| POST | /v1/step | Bearer | Observe a tool call (core endpoint) |
| GET | /v1/sessions | Bearer | List all sessions |
| GET | /v1/session/{id} | Bearer | Session detail with all steps |
| POST | /v1/session/{id}/end | Bearer | Finalize session, get full report |
| POST | /v1/alerts | Bearer | Create alert rules |
| GET | /v1/alerts/history | Bearer | List fired alerts |
| GET | /v1/usage | Bearer | Usage stats and trial status |
| POST | /v1/checkout | Bearer | Stripe subscription checkout |

POST /v1/step — Request Body

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| session_id | string | Yes | Your identifier for this session |
| step | integer | Yes | Step number (1-indexed) |
| tool | string | Yes | Tool name (bash, code_editor, etc.) |
| content | string | No | Tool output summary (improves detection) |
| agent_name | string | No | Agent identifier |
| team | string | No | Team that owns the agent |
| task | string | No | Task description |
| tokens_used | integer | No | LLM tokens consumed |
| cost_usd | float | No | Cost of this step |
| state_hash | string | No | SHA-256 of tool output |
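
The fields above can be assembled and checked locally before anything is sent. This is a sketch under assumptions: build_step_payload is a hypothetical helper, and hashing the full content string for state_hash is an assumption based on the field description, not confirmed server behavior.

```python
import hashlib

# Optional fields from the request body table (content and state_hash are handled separately).
OPTIONAL_FIELDS = {"agent_name", "team", "task", "tokens_used", "cost_usd"}


def build_step_payload(session_id: str, step: int, tool: str, content: str = "", **optional) -> dict:
    """Build a /v1/step request body, rejecting missing or unknown fields."""
    if not session_id or not tool:
        raise ValueError("session_id and tool are required")
    if step < 1:
        raise ValueError("step is 1-indexed")
    unknown = set(optional) - OPTIONAL_FIELDS
    if unknown:
        raise ValueError(f"unknown fields: {sorted(unknown)}")

    payload = {"session_id": session_id, "step": step, "tool": tool}
    if content:
        payload["content"] = content
        # Assumption: state_hash is the hex SHA-256 digest of the same text.
        payload["state_hash"] = hashlib.sha256(content.encode("utf-8")).hexdigest()
    payload.update(optional)
    return payload


payload = build_step_payload("task-1", 1, "bash", "pytest tests/", cost_usd=0.03)
print(sorted(payload))
```

Validating locally catches a zero-based step counter or a misspelled field name before it reaches the API.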

Python SDK

A minimal wrapper around the raw API, using requests, for any Python agent:

import requests, hashlib

class CaumObserver:
    def __init__(self, api_key, session_id, agent_name=""):
        self.api_key = api_key
        self.session_id = session_id
        self.agent_name = agent_name
        self.step = 0
        self.base = "https://caum-observation-production.up.railway.app"
        self.headers = {
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        }

    def observe(self, tool, content, cost=0):
        """Call after each tool use. Returns regime assessment."""
        self.step += 1
        resp = requests.post(f"{self.base}/v1/step", headers=self.headers, json={
            "session_id": self.session_id,
            "step": self.step,
            "tool": tool,
            "content": content[:500],  # send a short summary, not the full output
            "agent_name": self.agent_name,
            "cost_usd": cost,
            # Truncated SHA-256 prefix of the full tool output
            "state_hash": hashlib.sha256(content.encode()).hexdigest()[:16],
        }, timeout=10)  # bound the wait so a slow API cannot stall the agent
        return resp.json()

    def end(self):
        """Finalize session. Call when agent completes."""
        resp = requests.post(
            f"{self.base}/v1/session/{self.session_id}/end",
            headers=self.headers,
            timeout=10,
        )
        return resp.json()

# Usage (named observer so it does not shadow the caum package)
observer = CaumObserver("caum_live_YOUR_KEY", "fix-auth-bug", "Claude Sonnet")

# After each tool call in your agent loop:
result = observer.observe("bash", "pytest tests/ -v -- 5 passed", cost=0.03)
print(result["regime"])  # e.g. EXPLORER
print(result["uds"])     # e.g. 0.88

# When done:
report = observer.end()
print(report["tier"])    # e.g. T2

Webhooks

Configure alert rules to receive webhooks when CAUM detects degradation:

curl -X POST https://caum-observation-production.up.railway.app/v1/alerts \
  -H "Authorization: Bearer caum_live_YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '[{
    "condition": "tier>=T4",
    "action_type": "webhook",
    "destination": "https://your-server.com/caum-alert"
  }]'

Webhook Payload

{
  "event": "caum.advisory",
  "agent": "Claude Sonnet",
  "team": "Backend",
  "task": "Fix auth service bug",
  "session_id": "fix-auth-bug",
  "step": 14,
  "regime": "LOOP",
  "uds": 0.22,
  "tier": "T5",
  "total_steps": 14,
  "total_cost": 0.42,
  "message": "CAUM Advisory: LOOP at step 14, UDS=0.22, Tier=T5",
  "dashboard_url": "https://caum.systems/dashboard/fix-auth-bug",
  "timestamp": "2026-04-03T..."
}
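
On the receiving side, a webhook endpoint only needs to accept a JSON POST. The sketch below uses Python's standard http.server. The names summarize_advisory, AdvisoryHandler, and serve, the port, and the 204/400 status codes are illustrative choices, not something CAUM requires. It serves plain HTTP, so it assumes TLS is terminated in front of it (for example by a reverse proxy). The demo values are made up.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

REQUIRED_KEYS = ("agent", "session_id", "step", "regime", "uds", "tier")


def summarize_advisory(raw_body: bytes) -> str:
    """Parse a webhook body and return a one-line summary; raise ValueError if malformed."""
    payload = json.loads(raw_body)  # json.JSONDecodeError is a ValueError subclass
    if not isinstance(payload, dict) or payload.get("event") != "caum.advisory":
        raise ValueError("not a caum.advisory event")
    missing = [key for key in REQUIRED_KEYS if key not in payload]
    if missing:
        raise ValueError(f"missing fields: {missing}")
    return (f"{payload['agent']} / {payload['session_id']}: {payload['regime']} "
            f"at step {payload['step']} (tier {payload['tier']}, UDS {payload['uds']})")


class AdvisoryHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        try:
            print(summarize_advisory(self.rfile.read(length)))
            self.send_response(204)  # acknowledged, no response body
        except ValueError:
            self.send_response(400)  # body was not a usable advisory
        self.end_headers()


def serve(port: int = 8080) -> None:
    """Blocking; call this from your own entry point."""
    HTTPServer(("", port), AdvisoryHandler).serve_forever()


# Demo with made-up values, without starting the server:
body = json.dumps({
    "event": "caum.advisory", "agent": "Claude Sonnet", "session_id": "demo-session",
    "step": 9, "regime": "GRIND", "uds": 0.45, "tier": "T4",
}).encode()
print(summarize_advisory(body))
```

Keeping the parsing in a plain function makes it easy to test without opening a socket.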

Supported Alert Types

| Type | Destination | Format |
| --- | --- | --- |
| webhook | Any HTTPS URL | JSON POST |
| slack | Slack webhook URL | Block Kit message |
| email | Email address | HTML email via Resend |

Alert Conditions

| Condition | Triggers when |
| --- | --- |
| tier>=T4 | Session enters T4 or T5 |
| tier>=T3 | Session enters T3, T4, or T5 |
| tier=T5 | Session enters T5 only |
| regime=LOOP | Agent enters LOOP regime |
| regime=GRIND | Agent enters GRIND regime |
| uds<0.3 | UDS drops below 0.3 |
| uds<0.5 | UDS drops below 0.5 |
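
To reason about which rules a given step would match, the condition strings above can be evaluated locally. This is an illustrative sketch, not CAUM's implementation: condition_matches is a hypothetical helper, it supports only the operators shown in the table, and it checks a single snapshot rather than detecting the moment a session enters a state.

```python
import re

# field, operator, value; ">=" is listed first so it wins over "=".
_CONDITION = re.compile(r"^(tier|regime|uds)(>=|<|=)(.+)$")


def condition_matches(condition: str, state: dict) -> bool:
    """Return True if the snapshot in state satisfies the condition string."""
    match = _CONDITION.match(condition.replace(" ", ""))
    if not match:
        raise ValueError(f"unsupported condition: {condition}")
    field, op, value = match.groups()

    if field == "tier" and op in (">=", "="):
        current, target = int(state["tier"][1:]), int(value[1:])  # "T4" -> 4
        return current >= target if op == ">=" else current == target
    if field == "regime" and op == "=":
        return state["regime"] == value
    if field == "uds" and op == "<":
        return state["uds"] < float(value)
    raise ValueError(f"unsupported condition: {condition}")


# Made-up snapshot for illustration.
state = {"tier": "T4", "regime": "LOOP", "uds": 0.35}
for rule in ("tier>=T4", "tier=T5", "regime=LOOP", "uds<0.3", "uds<0.5"):
    print(rule, condition_matches(rule, state))
```

For this snapshot, tier>=T4, regime=LOOP, and uds<0.5 match, while tier=T5 and uds<0.3 do not.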

Regime Definitions

CAUM classifies each step into a behavioral regime:

| Regime | Meaning | Healthy? |
| --- | --- | --- |
| WARMING_UP | Agent is reading files and understanding the codebase | Yes |
| EXPLORER | Agent is making diverse, productive tool calls | Yes |
| EXPANSION | Agent is building new code or features | Yes |
| LINEAR_PROGRESS | Steady forward movement on the task | Yes |
| STABLE | Consistent work pattern, no anomalies | Yes |
| GRIND | Repetitive tool use, low diversity, may be stuck | Warning |
| LOOP | Agent is repeating the same actions | No |
| STAGNATION | Agent has stopped making progress | No |
| FUNNEL_TRAP | Agent is narrowing into a dead end | No |

Tier Definitions

UDS (Unified Detection Score) maps to tiers:

| Tier | UDS Range | Meaning |
| --- | --- | --- |
| T1 | 0.85 - 1.00 | Excellent. Diverse tools, real substance, productive work. |
| T2 | 0.70 - 0.84 | Good. Genuine work with minor structural issues. |
| T3 | 0.50 - 0.69 | Marginal. Some substance but tool monotony or short session. |
| T4 | 0.30 - 0.49 | Degraded. Low substance, repetitive patterns, possible loop. |
| T5 | 0.00 - 0.29 | Critical. Gaming, looping, or no productive work detected. |

UDS is computed as: UDS = 0.30 * TCR + 0.40 * ESR + 0.30 * (SCI * ESR)
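
As a worked example, the formula and the tier ranges can be combined in a few lines. TCR, ESR, and SCI are not defined in this section, so the sketch assumes each is a score between 0 and 1. It also assumes the lower bound of each tier is inclusive, so a value such as 0.845 falls into T2. The input values are made up.

```python
def compute_uds(tcr: float, esr: float, sci: float) -> float:
    """UDS = 0.30 * TCR + 0.40 * ESR + 0.30 * (SCI * ESR), inputs assumed to be in [0, 1]."""
    for name, value in (("tcr", tcr), ("esr", esr), ("sci", sci)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1")
    return 0.30 * tcr + 0.40 * esr + 0.30 * (sci * esr)


def tier_for(uds: float) -> str:
    """Map a UDS value to a tier, treating each lower bound as inclusive (assumption)."""
    for tier, lower_bound in (("T1", 0.85), ("T2", 0.70), ("T3", 0.50), ("T4", 0.30)):
        if uds >= lower_bound:
            return tier
    return "T5"


# 0.30 * 0.8 + 0.40 * 0.75 + 0.30 * (0.8 * 0.75) = 0.24 + 0.30 + 0.18 = 0.72
uds = compute_uds(tcr=0.8, esr=0.75, sci=0.8)
print(round(uds, 2), tier_for(uds))  # 0.72 T2
```

Because ESR appears in two of the three terms, a low ESR pulls the score down even when TCR and SCI are high.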

Authentication

All endpoints except /v1/register require a Bearer token:

Authorization: Bearer caum_live_YOUR_API_KEY

API keys are generated on registration and start with caum_live_.
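
To keep the key out of source code, you can read it from the environment and build the headers once. This is a sketch: the environment variable name CAUM_API_KEY and the helper auth_headers are assumptions, not names defined by CAUM.

```python
import os


def auth_headers(env_var: str = "CAUM_API_KEY") -> dict:
    """Build request headers from an API key stored in the environment."""
    key = os.environ.get(env_var, "")
    if not key.startswith("caum_live_"):
        raise RuntimeError(f"{env_var} is missing or does not look like a CAUM key")
    return {"Authorization": f"Bearer {key}", "Content-Type": "application/json"}


os.environ["CAUM_API_KEY"] = "caum_live_YOUR_API_KEY"  # placeholder, set here only for the demo
print(auth_headers()["Authorization"])  # Bearer caum_live_YOUR_API_KEY
```

Checking the prefix early turns a misconfigured deployment into a clear startup error instead of a failed request later.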

Trial Limits

When the trial expires, the API returns HTTP 402 with an upgrade link:

{
  "error": "trial_expired",
  "message": "Trial expired. Upgrade at caum.systems/pilot/",
  "upgrade_url": "https://caum.systems/pilot/"
}
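
A client can detect this case and surface the upgrade link instead of treating the body as a normal result. The sketch below works on a status code and a parsed body, so it is independent of the HTTP library you use. TrialExpired and check_trial are hypothetical names.

```python
class TrialExpired(Exception):
    """Raised when the API reports an expired trial (HTTP 402)."""

    def __init__(self, message: str, upgrade_url):
        super().__init__(message)
        self.upgrade_url = upgrade_url


def check_trial(status_code: int, body: dict) -> dict:
    """Return body unchanged, or raise TrialExpired for a 402 trial_expired response."""
    if status_code == 402 and body.get("error") == "trial_expired":
        raise TrialExpired(body.get("message", "Trial expired"), body.get("upgrade_url"))
    return body


expired = {
    "error": "trial_expired",
    "message": "Trial expired. Upgrade at caum.systems/pilot/",
    "upgrade_url": "https://caum.systems/pilot/",
}
try:
    check_trial(402, expired)
except TrialExpired as exc:
    print(exc.upgrade_url)  # https://caum.systems/pilot/
```

Since CAUM is passive, an agent would typically log this and keep running without observation rather than stop.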