
Agent Secrets Broker — Autonomous Secret Lifecycle for LLM Agents

TL;DR

LLM agents that handle secrets via prompt, tool argument, or log entry are compromised by design: prompt injection, transcript storage, and log shipping all become exfiltration vectors. The secrets broker applies capability-based mediation (Hardy 1988; Miller 2006): the LLM is the planner that operates on opaque references (bw:item/prod-db-password); the broker is the courier that handles plaintext. The agent never sees the bytes. This document specifies the tool surface, destination grammar, policy model, audit schema, and a 4-week implementation plan. It does not duplicate security.md (supply chain: Sigstore/SLSA/Kyverno); it covers the complementary runtime-lifecycle layer.

Motivation

Anthropic's April 2026 third-party-tool policy change metered subscription-OAuth usage, pushing more agentic work onto hybrid local/cloud inference where the rig cannot assume that transcript storage is under Dashecorp's control. Three failure modes drove this design:

| Failure mode | Mechanism | Risk |
| --- | --- | --- |
| Prompt-in-plaintext | Agent receives API_KEY=sk-abc123 in tool output; key is logged to transcript | Full compromise on transcript exfil |
| Tool-argument leak | deploy_secret(value="sk-abc123") appears in structured tool call log | Log aggregation → attacker |
| Rotation paralysis | No agent-driven rotation path; secrets age indefinitely | Long exposure window on compromise |

The broker pattern eliminates all three: the agent requests operations by reference, the broker executes them against the backing store, and plaintext never crosses the LLM boundary.

Complementary scope: security.md covers the supply-chain layer (Sigstore, SLSA, Kyverno admission). This whitepaper covers the runtime lifecycle layer — what happens after a container is admitted and needs a secret to function.

The threat model

graph TB
    classDef threat fill:#ffcccc,color:#000
    classDef defense fill:#c8e6c9,color:#000
    classDef neutral fill:#e3f2fd,color:#000

    LLM[LLM Agent]:::neutral
    BROKER[Secrets Broker]:::defense
    BW[Bitwarden]:::neutral
    GH[GitHub Secrets]:::neutral
    SOPS[SOPS / age]:::neutral
    K8S[Kubernetes Secrets]:::neutral
    CF[Cloudflare Worker Secrets]:::neutral

    T1[Prompt injection exfil]:::threat
    T2[Transcript storage leak]:::threat
    T3[Log line leak]:::threat
    T4[Over-privileged rotation]:::threat
    T5[Unaudited secret access]:::threat
    T6[Hardware key bypass]:::threat

    D1[Reference-only API — no plaintext crosses LLM boundary]:::defense
    D2[Transcript-safe tool surface — values never in args or output]:::defense
    D3[Structured log sanitisation — broker logs ref not value]:::defense
    D4[Policy model — per-secret rotation scope + rate limits]:::defense
    D5[Append-only audit log — SQLite + R2 mirror]:::defense
    D6[hardware_key_required flag — policy blocks software-only rotation]:::defense

    T1 -.->|blocked by| D1
    T2 -.->|blocked by| D2
    T3 -.->|blocked by| D3
    T4 -.->|blocked by| D4
    T5 -.->|blocked by| D5
    T6 -.->|blocked by| D6

    LLM -->|ref only| BROKER
    BROKER -->|plaintext| BW
    BROKER -->|plaintext| GH
    BROKER -->|plaintext| SOPS
    BROKER -->|plaintext| K8S
    BROKER -->|plaintext| CF

The LLM → Broker boundary is the invariant: the arrow carries only references and operation names. Plaintext flows only within the broker process and onward to backing stores over authenticated, encrypted channels.

Secret-kind taxonomy

Not all secrets are equal. The broker distinguishes automatable secrets from human-bootstrap secrets:

| Kind | Examples | Automatable? | Notes |
| --- | --- | --- | --- |
| api-key | Anthropic, GitHub PAT, Cloudflare token | Yes (generate + deploy) | Provider must support programmatic issuance |
| symmetric-key | SOPS age recipient, AES-256 data key | Yes | generate_and_deploy flow |
| db-password | Postgres service account | Yes | Must rotate with zero-downtime (dual-write period) |
| jwt-signing-key | RS256/ES256 private key | Yes | Key rotation requires public-key republish |
| tls-cert | Cluster internal CA | Delegated to cert-manager | Broker tracks ref; cert-manager issues |
| oauth-client-secret | GitHub App, Google OAuth | Human-bootstrap | Provider issues interactively; agent stores result |
| hardware-backed | YubiKey PIV, HSM-resident | Human-bootstrap always | hardware_key_required: true policy flag; broker refuses software rotation |
| biometric | Touch ID, passkeys | Human-bootstrap always | Never enters the broker at any stage |

Automatable secrets complete the full mint → store → deploy → rotate → retire cycle without human intervention.

Human-bootstrap secrets require a human to perform initial issuance; the broker takes over for storage, deployment, and lifecycle tracking once the human has deposited the value via an authenticated, out-of-band channel (never via agent prompt).

Destination reference grammar

The broker uses a URI-like reference grammar for all secret locations. References are the only values that cross the LLM boundary.

<scheme>:<path>[?<params>][#<key>]

| Scheme | Backing store | Example |
| --- | --- | --- |
| bw: | Bitwarden (personal or org vault) | bw:item/prod-db-password |
| gh: | GitHub repository secret | gh:dashecorp/rig-conductor/PROD_DB_PASSWORD |
| gh-env: | GitHub environment secret | gh-env:dashecorp/rig-conductor/production/PROD_DB_PASSWORD |
| sops: | SOPS-encrypted file at path, key name | sops:apps/rig-conductor/secrets.sops.yaml#DB_PASSWORD |
| k8s: | Kubernetes Secret, namespace/name/key | k8s:rig-conductor/prod-secrets#db-password |
| cf-worker: | Cloudflare Worker secret | cf-worker:rig-conductor-api/DB_PASSWORD |

Refs are stable across rotations — the broker updates the backing store value; callers holding the ref do not need to change.

Resolution rules:

  • Refs are validated against the policy registry on every operation.
  • Unknown or malformed refs are rejected before any backing store call.
  • Cross-destination copy (e.g., mint to bw:, deploy to gh: and k8s:) is a single atomic broker operation, not two agent calls.
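The "reject before any backing store call" rule can be illustrated with a small parser. This is a minimal sketch, not the broker's actual implementation: the regular expression, the Ref dataclass, and parse_ref are hypothetical names, the scheme list is copied from the table above, and the optional ?<params> suffix is left out for brevity. The #<key> suffix follows the sops: and k8s: examples.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Schemes copied from the reference table; the regex itself is an assumption.
KNOWN_SCHEMES = {"bw", "gh", "gh-env", "sops", "k8s", "cf-worker"}
_REF_RE = re.compile(
    r"^(?P<scheme>[a-z][a-z0-9-]*):(?P<path>[^#?\s]+)(?:#(?P<key>[^#?\s]+))?$"
)


@dataclass(frozen=True)
class Ref:
    scheme: str
    path: str
    key: Optional[str] = None


def parse_ref(raw: str) -> Ref:
    """Parse a reference string; raise ValueError on unknown or malformed input."""
    m = _REF_RE.match(raw)
    if m is None:
        raise ValueError(f"malformed ref: {raw!r}")
    if m["scheme"] not in KNOWN_SCHEMES:
        raise ValueError(f"unknown scheme: {m['scheme']!r}")
    return Ref(m["scheme"], m["path"], m["key"])
```

Because parsing raises before any adapter is invoked, a hallucinated or mistyped ref never reaches a backing store. A policy-registry lookup would follow this step in a fuller design.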

Tool surface

The broker exposes eight tools to the LLM. None accept or return plaintext. All operations are synchronous unless noted.

| Tool | Args | Returns | Effect |
| --- | --- | --- | --- |
| mint | kind, ref, policy_ref | {ref, created_at} | Generate new secret value, store at ref, register policy |
| store | ref, policy_ref | {ref, stored_at} | Deposit a value the human has provided out-of-band; agent provides only the ref |
| deploy | src_ref, dst_refs[] | {deployed_to[], skipped[]} | Copy from source ref to one or more destinations |
| rotate | ref, strategy | {ref, rotated_at, old_version} | Generate new value, dual-write if strategy=zero-downtime, retire old |
| retire | ref | {ref, retired_at} | Revoke and delete from all destinations; purge backing store |
| verify | ref | {valid: bool, destinations[]} | Check that the ref exists, is not expired, and all declared destinations are in sync |
| list | filter | {refs[]} | List refs matching filter; returns refs only, never values |
| generate_and_deploy | kind, dst_refs[], policy_ref | {ref, deployed_to[]} | Mint + deploy in one call; common shorthand for new-secret flows |

Tool call examples (what the LLM sees)

// Agent calls generate_and_deploy for a new Cloudflare token
{
  "tool": "generate_and_deploy",
  "args": {
    "kind": "api-key",
    "dst_refs": ["cf-worker:rig-conductor-api/CF_API_TOKEN", "bw:item/rig-cf-api-token"],
    "policy_ref": "policy:cloudflare-api-key-standard"
  }
}

// Broker returns — no plaintext
{
  "ref": "bw:item/rig-cf-api-token",
  "deployed_to": ["cf-worker:rig-conductor-api/CF_API_TOKEN", "bw:item/rig-cf-api-token"],
  "deployed_at": "2026-04-22T10:15:00Z"
}

The broker's logs record ref and operation, never value.
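To make the boundary concrete, here is a toy in-memory broker showing where plaintext lives and where it does not. Everything in it is an assumption made for illustration: InMemoryBroker, its dict-based store, and the choice of the last destination as the canonical ref are hypothetical, and a real broker would call provider APIs rather than generate a random token locally.

```python
import secrets
from datetime import datetime, timezone


class InMemoryBroker:
    """Toy broker: plaintext lives only in self._stores, never in results or logs."""

    def __init__(self):
        self._stores = {}  # ref -> plaintext (stand-in for real backing stores)
        self.log = []      # audit-style entries: refs and operation names only

    def generate_and_deploy(self, kind, dst_refs, policy_ref):
        # `kind` would select a provider-specific minting path; unused in this toy.
        value = secrets.token_urlsafe(32)  # plaintext is created inside the broker
        for dst in dst_refs:
            self._stores[dst] = value
        self.log.append({
            "operation": "generate_and_deploy",
            "dst_refs": list(dst_refs),
            "policy_ref": policy_ref,
            "outcome": "ok",
        })
        return {
            "ref": dst_refs[-1],  # assumption: last destination is the canonical ref
            "deployed_to": list(dst_refs),
            "deployed_at": datetime.now(timezone.utc).isoformat(),
        }
```

The return value and the log entry are built only from arguments the caller already supplied plus a timestamp, so there is no code path by which the generated value could reach the transcript.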

What the tool surface deliberately omits

  • read — no tool to retrieve a plaintext value. Backing stores expose their native fetch path directly to the consuming process (e.g., Kubernetes mounts the Secret as a volume; the agent pod reads the file). The broker is not in the read path.
  • patch — no partial update. Rotation replaces atomically.
  • impersonate — no tool to operate as a different principal. The broker's identity is fixed per deployment.

Policy model

Each ref is bound to a policy entry at creation time. Policies are stored in policy/secrets/<name>.yaml in the rig-gitops repo, version-controlled, and loaded into the broker at startup.

# policy/secrets/cloudflare-api-key-standard.yaml
apiVersion: secrets.rig.dashecorp.com/v1
kind: SecretPolicy
metadata:
  name: cloudflare-api-key-standard
spec:
  kind: api-key
  max_age_days: 90          # broker emits rotation alert after 90 days
  auto_rotate: true         # broker schedules rotation without agent prompt
  rotation_strategy: immediate  # no dual-write needed; Cloudflare invalidates old instantly
  rate_limit:
    max_rotations_per_day: 3    # prevent runaway rotation loops
  allowed_destinations:
    - cf-worker:*               # wildcards allowed within scheme
    - bw:item/*
  hardware_key_required: false  # software rotation is permitted

---
# policy/secrets/prod-tls-ca.yaml — hardware-backed example
apiVersion: secrets.rig.dashecorp.com/v1
kind: SecretPolicy
metadata:
  name: prod-tls-ca
spec:
  kind: tls-cert
  max_age_days: 365
  auto_rotate: false
  hardware_key_required: true   # broker REFUSES software rotation; emits HumanRequired event
  human_escalation_channel: "#admin"
  allowed_destinations:
    - k8s:cert-manager/*

Hardware-key override: when hardware_key_required: true, the broker:

  1. Refuses any rotate or mint call for that ref.
  2. Emits a HumanRequired event to rig-conductor, which routes to #admin.
  3. Accepts a store call once the human has performed the hardware-backed issuance out-of-band.
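A policy check along these lines could gate every call before dispatch. This is a sketch under stated assumptions: SecretPolicy and check are hypothetical names, wildcard matching uses Python's fnmatchcase as a stand-in for whatever matcher the broker adopts, and the destination_not_allowed reason string is invented for illustration.

```python
from dataclasses import dataclass, field
from fnmatch import fnmatchcase


@dataclass
class SecretPolicy:
    name: str
    allowed_destinations: list = field(default_factory=list)
    hardware_key_required: bool = False


def check(policy: SecretPolicy, operation: str, dst_refs=()) -> dict:
    """Return a dict shaped like the audit log's outcome / reject_reason columns."""
    # Hardware-backed refs: rotate and mint are escalated to a human, never executed.
    if policy.hardware_key_required and operation in ("rotate", "mint"):
        return {"outcome": "escalated", "reject_reason": "hardware_key_required"}
    # Every destination must match at least one allowed pattern.
    for dst in dst_refs:
        if not any(fnmatchcase(dst, pat) for pat in policy.allowed_destinations):
            return {"outcome": "rejected",
                    "reject_reason": f"destination_not_allowed:{dst}"}
    return {"outcome": "ok", "reject_reason": None}
```

Note that fnmatchcase also treats `?` and `[...]` as pattern syntax, so a production matcher would likely restrict patterns to a trailing `*` only.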

Policy changes go through a PR; changes to policies covering T3 secrets require a human co-sign per trust-model.md.

Audit schema

Every broker operation is appended to an immutable audit log. No update or delete is possible on the log itself.

SQLite schema (primary, local to broker pod)

CREATE TABLE audit_log (
    id          INTEGER PRIMARY KEY AUTOINCREMENT,
    ts          TEXT    NOT NULL,          -- ISO 8601, microseconds
    agent_id    TEXT    NOT NULL,          -- e.g. "dev-e"
    operation   TEXT    NOT NULL,          -- mint|store|deploy|rotate|retire|verify|list|generate_and_deploy
    ref         TEXT    NOT NULL,          -- the target ref (never the value)
    dst_refs    TEXT,                      -- JSON array for deploy/generate_and_deploy
    policy_ref  TEXT,
    outcome     TEXT    NOT NULL,          -- ok|rejected|escalated
    reject_reason TEXT,                    -- populated on rejected/escalated
    duration_ms INTEGER NOT NULL,
    CONSTRAINT no_update CHECK (TRUE)      -- no-op marker: a CHECK cannot block UPDATE or DELETE; append-only is enforced at the application layer
) STRICT;

CREATE INDEX idx_audit_ref_ts ON audit_log (ref, ts);
CREATE INDEX idx_audit_agent_ts ON audit_log (agent_id, ts);

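Append-only enforcement: the broker process opens the database with PRAGMA journal_mode=WAL and exposes no SQL connection to external callers. WAL mode is a journaling and concurrency setting, not an access control; the append-only property comes from the broker itself, which contains no DELETE or UPDATE code paths against audit_log.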
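One way to back the application-layer rule with a database-level guard is a pair of triggers that abort any UPDATE or DELETE. The sketch below is not part of the schema above: the trigger names and helper functions are hypothetical, the table is trimmed to a few columns, and STRICT is omitted so the example runs on older SQLite builds.

```python
import sqlite3

SCHEMA = """
CREATE TABLE audit_log (
    id        INTEGER PRIMARY KEY AUTOINCREMENT,
    ts        TEXT NOT NULL,
    agent_id  TEXT NOT NULL,
    operation TEXT NOT NULL,
    ref       TEXT NOT NULL,
    outcome   TEXT NOT NULL
);
-- Hypothetical triggers: abort any UPDATE or DELETE on the log.
CREATE TRIGGER audit_no_update BEFORE UPDATE ON audit_log
BEGIN SELECT RAISE(ABORT, 'audit_log is append-only'); END;
CREATE TRIGGER audit_no_delete BEFORE DELETE ON audit_log
BEGIN SELECT RAISE(ABORT, 'audit_log is append-only'); END;
"""


def open_audit_db(path=":memory:"):
    """Create the trimmed audit table and its guard triggers."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def append(conn, ts, agent_id, operation, ref, outcome):
    """Insert one audit row; the only write path this sketch offers."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO audit_log (ts, agent_id, operation, ref, outcome) "
            "VALUES (?, ?, ?, ?, ?)",
            (ts, agent_id, operation, ref, outcome),
        )
```

Triggers only guard against accidental mutation through SQL. Anyone with write access to the database file can drop them, so the WORM-locked R2 mirror remains the tamper-resistant copy.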

R2 mirror (durable, cross-region)

Every row is streamed to Cloudflare R2 in NDJSON format within 30 seconds of append. The R2 bucket has:

  • Object Lock (WORM) — objects are immutable for 7 years (configurable per compliance requirement).
  • Public access: disabled — audit reads require a signed URL issued by the broker's read-only audit endpoint.
  • Replication: standard R2 cross-region replication.

Mirror lag alert: the broker exports its R2 flush lag as the audit_mirror_lag_seconds Prometheus metric. If the lag exceeds 60 seconds, an alert fires; the broker logs a warning and keeps serving requests. The SQLite log remains authoritative until the mirror catches up.
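The mirror path reduces to two small pieces: serializing rows as NDJSON and measuring how stale the oldest unflushed row is. The following is a minimal sketch with hypothetical helper names; timestamps are plain epoch seconds, and the actual R2 upload and Prometheus export are out of scope.

```python
import json

LAG_ALERT_THRESHOLD_S = 60  # alert threshold stated in the text


def to_ndjson(rows):
    """Serialize audit rows (dicts) as newline-delimited JSON, one object per line."""
    return "".join(
        json.dumps(row, sort_keys=True, separators=(",", ":")) + "\n" for row in rows
    )


def mirror_lag_seconds(oldest_unflushed_ts, now):
    """Seconds since the oldest row not yet mirrored; 0.0 when nothing is pending."""
    if oldest_unflushed_ts is None:
        return 0.0
    return max(0.0, now - oldest_unflushed_ts)
```

Measuring lag from the oldest pending row, rather than from the last successful flush, keeps the metric at zero while the broker is idle.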

Querying the audit log

-- All rotations for a ref in the last 30 days
SELECT ts, agent_id, outcome, reject_reason
FROM audit_log
WHERE ref = 'bw:item/prod-db-password'
  AND operation = 'rotate'
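  AND ts > strftime('%Y-%m-%dT%H:%M:%S', 'now', '-30 days')  -- 'T' separator matches the ISO 8601 text in ts; datetime() would emit a space and mis-compare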
ORDER BY ts DESC;

-- Rejected operations (policy violations)
SELECT ts, agent_id, ref, operation, reject_reason
FROM audit_log
WHERE outcome = 'rejected'
ORDER BY ts DESC
LIMIT 100;

User stories

US-1 — New deployment, new API key

As Dev-E provisioning a new Cloudflare Worker, I want to mint a new Cloudflare API token and deploy it to both the Worker and Bitwarden in one call, so that I never handle the token value and can hand the ref to the next deployment step.

Acceptance criteria:

  • generate_and_deploy(kind="api-key", dst_refs=["cf-worker:worker-name/CF_API_TOKEN", "bw:item/worker-name-cf-token"], policy_ref="policy:cloudflare-api-key-standard") succeeds.
  • Broker creates a token via Cloudflare API, stores it, and returns {ref, deployed_to} with no plaintext.
  • Audit log records the operation with outcome=ok.
  • The Worker can call its bound API using the new token within 5 seconds of the call returning.

US-2 — Scheduled rotation

As the rig's rotation scheduler, I want to rotate all secrets that exceed their max_age_days policy threshold, so that secret age never exceeds policy limits without human involvement.

Acceptance criteria:

  • A cron job calls list(filter={overdue_rotation: true}), then rotate(ref, strategy) for each returned ref.
  • Rotations complete for all auto_rotate: true secrets without agent prompting.
  • Secrets with hardware_key_required: true emit HumanRequired events instead of rotating.
  • Rate limit (max_rotations_per_day) is enforced: excess calls return {outcome: rejected, reject_reason: "rate_limit"}.
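The rate-limit criterion can be sketched as a per-ref sliding window. The class below is hypothetical, and treating "per day" as a rolling 24-hour window (rather than a calendar day) is an assumption the policy text does not settle.

```python
from collections import defaultdict, deque

DAY_SECONDS = 86_400


class RotationRateLimiter:
    """Allow at most max_rotations_per_day accepted rotations per ref in any 24 h."""

    def __init__(self, max_rotations_per_day):
        self.max = max_rotations_per_day
        self._history = defaultdict(deque)  # ref -> timestamps of accepted rotations

    def try_acquire(self, ref, now):
        window = self._history[ref]
        # Drop rotations that have aged out of the 24-hour window.
        while window and now - window[0] >= DAY_SECONDS:
            window.popleft()
        if len(window) >= self.max:
            return {"outcome": "rejected", "reject_reason": "rate_limit"}
        window.append(now)
        return {"outcome": "ok", "reject_reason": None}
```

Rejected attempts are not recorded in the window, so a runaway loop cannot extend its own lockout; it simply keeps receiving rate_limit until earlier rotations age out.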

US-3 — Zero-downtime database password rotation

As the rig rotating the Postgres service-account password, I want the broker to use a dual-write strategy so no live connection is dropped, so that rig-conductor's connection pool continues without disruption.

Acceptance criteria:

  • rotate(ref="k8s:rig-conductor/prod-secrets#db-password", strategy="zero-downtime") executes the sequence: generate new value → deploy new value alongside old → wait for connection drain (configurable, default 30s) → retire old value.
  • No 5xx errors from rig-conductor during the rotation window.
  • Audit log records old_version ref alongside the new rotation event.
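The four-step sequence can be sketched against a store that accepts two values at once. VersionedStore and rotate_zero_downtime are hypothetical names, and the drain step is passed in as a callable so the sketch stays independent of any real connection pool.

```python
import secrets


class VersionedStore:
    """Stand-in for a backing store that can hold an old and a new value at once."""

    def __init__(self):
        self.versions = {}  # ref -> list of currently accepted values, oldest first

    def accepted(self, ref):
        return list(self.versions.get(ref, []))


def rotate_zero_downtime(store, ref, drain, new_value=None):
    """Run generate -> dual-write -> drain -> retire old; return the step names."""
    steps = []
    new_value = new_value or secrets.token_urlsafe(32)
    steps.append("generate")
    store.versions.setdefault(ref, []).append(new_value)  # old and new both valid
    steps.append("dual_write")
    drain()                                               # wait for connections to move over
    steps.append("drain")
    store.versions[ref] = [new_value]                     # old value no longer accepted
    steps.append("retire_old")
    return steps
```

In a deployment the drain callable might be as simple as `lambda: time.sleep(30)`, matching the default window, or a poll on connection-pool metrics. If drain raises, the old value is never retired, which fails safe for availability.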

US-4 — Hardware-backed secret, human bootstrap

As a human operator provisioning the cluster's internal CA, I want to use my YubiKey to sign the CA key and deposit the result via the broker's out-of-band store endpoint, so that the broker tracks the cert lifecycle but the key material never passes through software-only paths.

Acceptance criteria:

  • store(ref="k8s:cert-manager/internal-ca#tls.key", policy_ref="policy:prod-tls-ca") accepts the human's deposit.
  • The broker verifies the calling principal is human (via OIDC, not agent identity) before accepting the store call.
  • Any subsequent agent call to rotate(ref=...) is rejected with outcome=escalated and reason hardware_key_required.
  • #admin receives a Discord notification with the ref and instructions.

US-5 — Secret retirement after service decommission

As Dev-E decommissioning a deprecated microservice, I want to retire all secrets associated with the service ref pattern, so that orphaned credentials cannot be abused after the service is removed.

Acceptance criteria:

  • list(filter={prefix: "k8s:legacy-service/"}) returns all refs for the service.
  • retire(ref) for each ref revokes the credential at the backing store (Cloudflare API, GitHub API, SOPS file update) and records retired_at in the audit log.
  • After retirement, verify(ref) returns {valid: false} for each retired ref.
  • Retired refs remain in the audit log permanently (they are never deleted).

US-6 — Cross-destination sync verification

As the rig's weekly integrity check job, I want to verify that all refs are present and in sync across all declared destinations, so that drift (e.g., a k8s secret manually overwritten) is surfaced before it causes an incident.

Acceptance criteria:

  • verify(ref) checks existence and hash-match across all destinations in the policy's allowed_destinations.
  • Mismatched destinations return {valid: false, destinations: [{ref, status: "drift"}]}.
  • Drift events are recorded in the audit log and fire a Prometheus alert secrets_destination_drift_total.
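Hash-match verification can be sketched as a digest comparison per destination. This is an illustration only: the function names are hypothetical, the "missing" and "in_sync" status strings are invented alongside the "drift" status from the criteria, and the destination values are passed in directly instead of being fetched by adapters.

```python
import hashlib
import hmac


def digest(value: bytes) -> str:
    """SHA-256 hex digest; only digests, never values, leave this module."""
    return hashlib.sha256(value).hexdigest()


def verify(source_value: bytes, destinations: dict) -> dict:
    """destinations maps ref -> current bytes at that destination (None if absent)."""
    expected = digest(source_value)
    report = []
    for ref, current in destinations.items():
        if current is None:
            status = "missing"
        elif hmac.compare_digest(digest(current), expected):
            status = "in_sync"
        else:
            status = "drift"
        report.append({"ref": ref, "status": status})
    return {
        "valid": all(entry["status"] == "in_sync" for entry in report),
        "destinations": report,
    }
```

Stores that do not return secret values through their API (GitHub Actions secrets, for example) cannot be compared this way; for those, a digest or version marker recorded at deploy time would have to stand in for the live value.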

Implementation plan — 4 weeks

| Week | Deliverable | Key tasks |
| --- | --- | --- |
| W1 | Core broker service | Go/Rust binary; ref parser; policy loader; in-memory operation dispatch; SQLite audit schema; unit tests for all 8 tools |
| W2 | Backing store adapters | Bitwarden SDK adapter; GitHub Secrets API adapter; SOPS file adapter; Kubernetes Secret adapter; Cloudflare API adapter; integration tests per adapter |
| W3 | Policy engine + R2 audit mirror | Policy YAML loader + validator; hardware_key_required escalation path; rate-limit enforcement; R2 NDJSON flush; WORM bucket config; secrets_destination_drift_total metric |
| W4 | Rotation scheduler + hardening | Cron-triggered rotation loop; zero-downtime dual-write strategy; Cilium egress policy for broker pod; OIDC-based store human-principal verification; end-to-end smoke test; docs |

Not in scope for v1: multi-cluster federation, secret sharing across Dashecorp org boundaries, dynamic Vault/OpenBao integration (see security.md trigger list for when that changes), and agent-to-agent secret delegation.

Deployment topology

graph LR
    AGENT[LLM Agent Pod\ndev-e namespace]
    BROKER[Secrets Broker Pod\nsecrets-broker namespace]
    SQLITE[(SQLite WAL\nPVC - local)]
    R2[(Cloudflare R2\nWORM bucket)]
    BW[Bitwarden]
    GH[GitHub API]
    SOPSREPO[SOPS git repo]
    K8SAPI[Kubernetes API]
    CF[Cloudflare API]

    AGENT -->|mTLS, ref only| BROKER
    BROKER --> SQLITE
    BROKER -->|async flush| R2
    BROKER -->|HTTPS| BW
    BROKER -->|HTTPS| GH
    BROKER -->|HTTPS| SOPSREPO
    BROKER -->|in-cluster| K8SAPI
    BROKER -->|HTTPS| CF

The broker pod runs in a dedicated secrets-broker namespace with its own Cilium egress policy covering only the five backing-store endpoints and the R2 audit bucket. Agent pods reach the broker via mTLS (cert-manager Certificate). No agent pod has direct egress to any backing store.

Residual risks (honest assessment)

The broker pattern significantly raises the bar but does not eliminate all risk.

| Risk | Likelihood | Severity | Mitigation | Residual |
| --- | --- | --- | --- | --- |
| Broker process memory scrape | Low | Critical | Run broker as non-root, no ptrace, pod security baseline; encrypt in-memory value buffers | Low (requires node compromise) |
| Broker pod compromise via supply chain | Low | Critical | Broker image signed + SLSA L3; Kyverno admission required; see security.md | Low (defense-in-depth chain) |
| OIDC token replay for store endpoint | Medium | High | Short-lived Fulcio certs (10 min); Rekor log checked for replay | Low after mitigation |
| Policy misconfiguration (too-permissive allowed_destinations) | Medium | High | Policy PRs require human review; CI schema-validates policy YAML | Medium (human error in policy authoring) |
| R2 mirror outage | Low | Medium | SQLite remains authoritative; broker continues; mirror lag alert fires; manual re-sync on recovery | Low (audit continuity preserved) |
| Backing store API rate limits blocking rotation | Medium | Medium | Rate-limit policy per secret; rotation scheduler backs off exponentially | Low after mitigation |
| LLM hallucinates a valid-looking ref for a secret it shouldn't access | Medium | High | Ref registry validates every ref against policy on every call; hallucinated refs fail at validation | Low (gated by registry) |
| Dual-write window for zero-downtime rotation | Low | Medium | Window is configurable; minimum 30s; connection draining is monitored | Low (narrow window) |

Structural gap: the broker cannot protect against a compromised backing store (e.g., a Bitwarden breach). This is out-of-scope for the broker layer and addressed by backing-store selection, credential isolation, and key hierarchies. See limitations.md for the rig's general stance on out-of-scope mitigations.

Relationship to existing security controls

The broker integrates with existing controls rather than replacing them:

| Existing control | How the broker uses it |
| --- | --- |
| Cilium L7 egress | Broker namespace has its own per-backing-store allowlist; agent pods have no direct backing-store egress |
| SOPS + age | Broker is the authorized mutator for SOPS files; no other process writes to encrypted manifests |
| Kyverno admission | Broker image must pass standard signed-image policy before admission |
| Gitsign on agent commits | Policy YAML changes committed by agents are signed; broker reloads policy only from verified git refs |
| Trust model tiers | Rotation of T3 secrets (auth, payment credentials) is hardware_key_required: true by policy |
| Audit log (observability.md) | Broker audit events feed the same observability stack; secrets_rotation_total and secrets_destination_drift_total are added to the cost dashboard |

See also

  • index.md — whitepaper master
  • security.md — supply-chain layer (Sigstore, SLSA, Kyverno); complementary, not duplicated here
  • trust-model.md — tier classification for T3 secret operations; hardware-key secrets are always T3
  • safety.md — prompt-injection defenses that motivate the reference-only LLM boundary
  • observability.md — how broker metrics surface in the cost dashboard
  • limitations.md — what the broker does not cover (backing-store breaches, biometric secrets)
  • docs/sops.md — SOPS operational mechanics; the broker delegates git-encrypted secret mutations here