
Cost Attribution

rig-conductor tracks per-agent, per-repo LLM costs by accumulating TOKEN_USAGE events emitted by agents after each API call.

Endpoints

| Endpoint | Time scope | Source |
|---|---|---|
| GET /api/usage (no days) | All time | Marten projection |
| GET /api/usage?days=N | Rolling N days | Raw events |
| GET /api/costs/summary?days=N | Rolling N days | Raw events |
| GET /api/costs/daily?days=N | Rolling N days | Raw events |

Rule: to compare /api/usage with /api/costs/summary, always pass the same ?days=N to both. Without it, /api/usage returns all-time totals, which are naturally larger than any windowed summary.
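
As a minimal sketch of the rule above, a small helper can make the window explicit on every request so the two endpoints are never compared across different scopes. The base URL and the helper name `windowed_url` are hypothetical; only the paths and the `days` / `tenantId` query parameters come from this page.

```python
from urllib.parse import urlencode


def windowed_url(base_url, path, days, tenant_id=None):
    """Build an endpoint URL that always carries an explicit ?days= window."""
    params = {"days": days}
    if tenant_id is not None:
        # Optional tenant filter, as described in the Multi-tenancy section.
        params["tenantId"] = tenant_id
    return f"{base_url}{path}?{urlencode(params)}"


BASE_URL = "http://localhost:5000"  # hypothetical host, not taken from the docs

# Same window on both sides, so the totals are comparable.
usage_url = windowed_url(BASE_URL, "/api/usage", days=7)
summary_url = windowed_url(BASE_URL, "/api/costs/summary", days=7)
```

Requiring `days` as a positional argument means a caller cannot accidentally fall through to the all-time projection path.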

Multi-tenancy (rc#1460)

Cost is attributed per tenant from each event's server-resolved tenant_id header (the multi-tenancy keystone, rc#1459). Tenant #0 is invotek.

  • ?tenantId= filter — optional on GET /api/costs/summary, /api/costs/issue, /api/costs/daily, and /api/usage. It restricts results to events whose display tenant (header, coalesced) equals the value, case-insensitively. Omitting it is byte-for-byte the pre-1460 behavior. /public/cost-summary stays tenant-blind (never leak tenant structure externally).
  • Backfill display — a pre-keystone (header-less) TOKEN_USAGE event coalesces to invotek for display, so ?tenantId=invotek includes legacy dashecorp spend. This reuses MartenEventStore.ReadTenantId.
  • All-time per-tenant usage: GET /api/usage with a tenantId and no days scans raw events (not the stored TokenUsageProjection, which is tenant-blind in P0). The no-tenant all-time path still uses the fast projection. ⚠️ Perf: this is an O(N) type-filtered scan of all token_usage events with no event-type index (the same shape as the existing windowed summary/daily scans). That is fine at P0's modest volume, but prefer a ?days= window for large ranges. Phase 1 gives the projection a tenant dimension so all-time per-tenant reads stay on the fast path.
  • tenant=unknown is NOT a queryable tenant. Genuinely unattributable spend surfaces via the alarm below, not via ?tenantId=unknown (avoids implying "unknown" is a real tenant).
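
The display coalesce and the case-insensitive filter described in the bullets above can be sketched roughly as follows. This is an illustration in Python, not the conductor's code: the function names are hypothetical, and treating a whitespace-only header as blank is an assumption.

```python
DEFAULT_TENANT = "invotek"  # tenant #0, per the text above


def display_tenant(header):
    """Coalesce an absent or blank tenant_id header to the default tenant.

    Assumption: whitespace-only headers count as blank.
    """
    if header is None or header.strip() == "":
        return DEFAULT_TENANT
    return header


def matches_tenant(header, tenant_id):
    """Apply the optional ?tenantId= filter; None means no filter at all."""
    if tenant_id is None:
        return True
    # Case-insensitive comparison against the display (coalesced) tenant.
    return display_tenant(header).casefold() == tenant_id.casefold()
```

With this shape, a header-less legacy event passes a `tenantId=invotek` filter, while omitting the filter keeps every event.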

tenant=unknown alarm (TenantUnknownCostAlerter)

A background service posts to Discord #admin (DISCORD_ADMIN_WEBHOOK_URL) when a new TOKEN_USAGE event has real cost but no attributable tenant — i.e. its tenant_id header is absent/empty (TenantAttribution.IsUnknown), which is the inverse of the display coalesce. A literal "invotek" header is attributed and never alarms; only an absent/blank header does. This guards against unattributable spend silently becoming margin leak.

  • Skip-to-tip on cold start: HeadersEnabled was only turned on at rc#1459, so every pre-keystone event is header-less. The alerter therefore starts at the current event tip and only ever alarms on new (post-keystone) unattributed events. Post-keystone, every append stamps the header, so an absent header means a real bug or bypass.
  • Gated on EffectiveCost > 0 (zero-cost idle events never alarm) and deduped per event Sequence (Valkey 24h key + in-memory fallback). No-op when the webhook env is unset.
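
The gating rules above can be sketched as a single decision function. This is a simplified Python illustration under stated assumptions: the class and method names are hypothetical, only the in-memory dedupe fallback is modelled (the Valkey 24h key is omitted), and whitespace-only headers are assumed to count as blank.

```python
def is_unknown(header):
    """True when the tenant_id header is absent or blank."""
    return header is None or header.strip() == ""


class UnknownTenantGate:
    """Decides whether one TOKEN_USAGE event should raise the alarm."""

    def __init__(self, tip_sequence, webhook_url):
        self.tip = tip_sequence          # event tip captured at cold start
        self.webhook_url = webhook_url   # empty or None disables alerting
        self.seen = set()                # in-memory dedupe, keyed by Sequence

    def should_alarm(self, sequence, header, effective_cost):
        if not self.webhook_url:
            return False                 # no-op when the webhook is unset
        if sequence <= self.tip:
            return False                 # skip-to-tip: ignore older events
        if not is_unknown(header):
            return False                 # attributed, e.g. a literal "invotek"
        if effective_cost <= 0:
            return False                 # zero-cost idle events never alarm
        if sequence in self.seen:
            return False                 # already alarmed for this event
        self.seen.add(sequence)
        return True
```

The dedupe check comes last so that events which never qualify do not occupy space in the seen set.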

Cost Formula

For each TOKEN_USAGE event, the effective cost is computed by MartenCostQuery.EffectiveCost:

if all token counts = 0:
    cost = $0          # no LLM call occurred

elif cacheReadTokens > 0 OR cacheCreationTokens > 0:
    cost = AnthropicPricing.ComputeCost(model, input, output, cacheRead, cacheCreate)

else:
    cost = event.CostUsd   # trust agent-reported value (backward compat)
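
For readers who prefer runnable code, the same three-way decision can be sketched in Python. This is an illustration of the rule, not the MartenCostQuery source: the function name and the `compute_cost` callback are hypothetical, and events are assumed to be dictionaries keyed by the TOKEN_USAGE field names.

```python
def effective_cost(event, compute_cost):
    """Return the cost to attribute to one TOKEN_USAGE event (a dict)."""
    inp = event.get("inputTokens", 0)
    out = event.get("outputTokens", 0)
    cache_read = event.get("cacheReadTokens", 0)        # optional, default 0
    cache_create = event.get("cacheCreationTokens", 0)  # optional, default 0

    # Case 1: no tokens at all means no LLM call occurred.
    if inp == 0 and out == 0 and cache_read == 0 and cache_create == 0:
        return 0.0

    # Case 2: cache tokens present, so recompute from the price table.
    if cache_read > 0 or cache_create > 0:
        return compute_cost(event["model"], inp, out, cache_read, cache_create)

    # Case 3: trust the agent-reported value (backward compatibility).
    return event.get("costUsd", 0.0)
```

Passing the pricing function in as a parameter keeps the branch logic testable without a real price table.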

Anthropic Pricing Table

Maintained in ConductorE.Core/UseCases/AnthropicPricing.cs. Prices in USD/M tokens:

| Model family | Input | Output | Cache read | Cache create |
|---|---|---|---|---|
| claude-opus-4-5 | $15.00 | $75.00 | $1.50 | $18.75 |
| claude-sonnet-4-5 / claude-3-5-sonnet | $3.00 | $15.00 | $0.30 | $3.75 |
| claude-haiku-4-5 / claude-3-5-haiku | $0.80 | $4.00 | $0.08 | $1.00 |
| claude-3-opus | $15.00 | $75.00 | $1.50 | $18.75 |
| claude-3-haiku | $0.25 | $1.25 | $0.03 | $0.30 |

Unknown models fall back to claude-sonnet-4-5 pricing.
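
To make the arithmetic concrete, here is a minimal Python sketch of a per-million-token cost calculation using the rates listed above. It is not the AnthropicPricing.cs implementation: the names `PRICES` and `compute_cost` are illustrative, and the lookup is assumed to be an exact match with the documented fallback (prefix matching is not modelled here).

```python
# (input, output, cache read, cache create) in USD per million tokens.
PRICES = {
    "claude-opus-4-5": (15.00, 75.00, 1.50, 18.75),
    "claude-sonnet-4-5": (3.00, 15.00, 0.30, 3.75),
    "claude-3-5-sonnet": (3.00, 15.00, 0.30, 3.75),
    "claude-haiku-4-5": (0.80, 4.00, 0.08, 1.00),
    "claude-3-5-haiku": (0.80, 4.00, 0.08, 1.00),
    "claude-3-opus": (15.00, 75.00, 1.50, 18.75),
    "claude-3-haiku": (0.25, 1.25, 0.03, 0.30),
}
FALLBACK_MODEL = "claude-sonnet-4-5"  # used for unknown models


def compute_cost(model, input_tokens, output_tokens, cache_read=0, cache_create=0):
    """Sum tokens times rate for each token kind, then scale from per-million."""
    rates = PRICES.get(model, PRICES[FALLBACK_MODEL])
    counts = (input_tokens, output_tokens, cache_read, cache_create)
    return sum(n * rate for n, rate in zip(counts, rates)) / 1_000_000
```

As a sanity check, one million tokens of each kind on claude-sonnet-4-5 comes to 3.00 + 15.00 + 0.30 + 3.75 = $22.05.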

TOKEN_USAGE Event Fields

{
  "type": "TOKEN_USAGE",
  "agentId": "dev-e-dotnet",
  "repo": "dashecorp/rig-conductor",
  "issueNumber": 148,
  "model": "claude-sonnet-4-5",
  "inputTokens": 10,
  "outputTokens": 4994,
  "cacheReadTokens": 160855,
  "cacheCreationTokens": 28927,
  "costUsd": 0.242194,
  "category": "work"
}

cacheReadTokens and cacheCreationTokens are optional (default 0). When either is greater than 0, the conductor recomputes cost from the price table, overriding costUsd.

category is one of "work" (default), "idle", or "overhead". It controls which bucket the cost appears in on /api/costs/summary.
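
A rough Python sketch of how the category field could drive bucketing, with the documented default of "work" applied when the field is missing. The function name `bucket_costs` and the `cost_of` callback are hypothetical; the actual response shape of /api/costs/summary is not shown on this page and is not assumed here.

```python
from collections import defaultdict


def bucket_costs(events, cost_of):
    """Sum per-event cost into one total per category.

    events  -- iterable of TOKEN_USAGE dicts
    cost_of -- callable returning the cost to attribute to one event
    """
    totals = defaultdict(float)
    for event in events:
        category = event.get("category", "work")  # "work" is the default
        totals[category] += cost_of(event)
    return dict(totals)


sample_events = [
    {"category": "work", "costUsd": 0.25},
    {"category": "idle", "costUsd": 0.5},
    {"costUsd": 0.75},  # no category, so it lands in "work"
]
totals = bucket_costs(sample_events, lambda e: e["costUsd"])
```

In practice `cost_of` would be the effective-cost calculation rather than the raw costUsd used in this sample.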

Three Bugs Fixed in #148

Bug 1 — Endpoint disagreement

Before: /api/usage read from an all-time Marten projection; /api/costs/summary queried raw events filtered by the days window. Same agent, different totals.

After: /api/usage?days=N queries the same raw event stream as /api/costs/summary?days=N. Both produce identical totals for the same window.

Bug 2 — Cache tokens ignored

Before: Agent-reported costUsd excluded cache token costs. A review with 160 k cache-read tokens was reported as $0.24 when the true cost was ~$0.56.

After: When cacheReadTokens > 0 || cacheCreationTokens > 0, the conductor recomputes cost using AnthropicPricing.ComputeCost, which adds cache pricing on top of input/output.

Bug 3 — Phantom idle cost

Before: TOKEN_USAGE events with inputTokens=0, outputTokens=0 but non-zero costUsd (heartbeat overhead attributed to agents) contributed to idleCostUsd.

After: Any event where all four token counts are zero contributes $0, regardless of the reported costUsd.

Adding a New Model

Edit AnthropicPricing.cs and add an entry to the Prices dictionary:

["claude-new-model-20270101"] = new(inputPer1M, outputPer1M, cacheReadPer1M, cacheCreatePer1M),

The Resolve method will also match on prefix, so "claude-new-model" is automatically covered.
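
The lookup order can be pictured with the Python sketch below. It is only an illustration: this page does not spell out the direction of the prefix match, so the sketch assumes either string may be a prefix of the other, and the function name, argument shape, and the rates in the usage example are all hypothetical.

```python
def resolve(model, prices, fallback="claude-sonnet-4-5"):
    """Return the rates for a model: exact match, then prefix, then fallback."""
    if model in prices:
        return prices[model]
    if model:  # guard: an empty string is a prefix of every key
        for key, rates in prices.items():
            # Assumption: match when either name is a prefix of the other.
            if key.startswith(model) or model.startswith(key):
                return rates
    return prices[fallback]


# Placeholder rates for illustration only.
example_prices = {
    "claude-sonnet-4-5": (3.00, 15.00, 0.30, 3.75),
    "claude-new-model-20270101": (1.00, 2.00, 0.10, 1.25),
}
short_name_rates = resolve("claude-new-model", example_prices)
```

Under these assumptions the undated name resolves to the dated entry, and anything unrecognized still falls back to claude-sonnet-4-5 pricing.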