# Implementation Status — What's Deployed vs. Planned

## Why this doc exists

The whitepaper describes a target architecture with 16 companion docs and hundreds of capabilities across safety, security, observability, self-healing, memory, and more. Without a single source of truth for "what's real vs. planned", readers (including future-you) have to piece it together from retraction callouts, "TL;DR — deployed but largely unexercised" admonitions, and context. This doc is the flat answer: every named capability, its current status, a link to its whitepaper section, and a link to the tracking ticket (if any).

## Status categories

| Status | Meaning |
|---|---|
| Deployed | Working in production today. Verified by repo inspection, cluster state, or smoke test. |
| Partial | Deployed but with a named gap (e.g., memory pipeline is live but SAVE doesn't work). Honestly acknowledged in the linked whitepaper section. |
| Planned | Has a concrete ticket or pair-mode session scheduled. Known scope. |
| Deferred | Named in the whitepaper as wanted-eventually but not scheduled. Trigger-based (e.g., "adopt when X happens"). |
| Rejected | Considered, documented rationale, explicitly not doing. Lives in the tool-choices ADR rejection list. |

## Summary (as of 2026-04-17)

```mermaid
pie title Capabilities by status
    "Deployed" : 14
    "Partial" : 7
    "Planned" : 35
    "Deferred" : 9
    "Rejected" : 13
```

Total tracked: 78 capabilities across 11 domains. Deployed + partial: 21 (27%). Planned + deferred: 44 (56%). The whitepaper is a multi-month-to-year vision, not a today snapshot. Partial-status rows are the honest-gap flags — they exist because reality isn't as tidy as the design.

## By domain

Each table: capability → current status → whitepaper section → ticket / PR / repo evidence → notes.

### Coordination (Conductor-E)

| Capability | Status | Whitepaper section | Evidence | Notes |
|---|---|---|---|---|
| Conductor-E event store (Marten/Postgres) | Deployed | architecture-current.md | conductor-e namespace live | 28 event types defined, projections live |
| POST /api/events endpoint | Deployed | architecture-current.md | dashecorp/conductor-e | Production-active |
| Assignment dispatch (GET /api/assignments/next) | Deployed | trust-model.md | Source in MartenEventStore | Priority + FIFO only, no capacity check |
| Review claim endpoint (GET /api/reviews/next) | Deployed | | Verified exists at Program.cs:804 | README was stale about this |
| Per-consumer cursor projection | Planned | architecture-proposed-v2.md | Phase 3 | Replaces the earlier "per-pod capacity" framing |
| Agent subscription registry (YAML in rig-gitops) | Planned | architecture-proposed-v2.md | Phase 3 | Topology validation at deploy time |
| Bounded-loop sentinel (ReviewLoopExceeded) | Planned | architecture-proposed-v2.md | Phase 4 | Caps Dev-E/Review-E ping-pong |
| Escalation severity routing + StaleHeartbeatService | Planned | architecture-proposed-v2.md + self-healing.md | Phase 4 | |
| Error budget projection | Planned | observability.md, self-healing.md | Phase 5 | |
| Attestation projection | Planned | security.md | Phase 5 | Per-change cryptographic chain materialized |

### Agent execution

| Capability | Status | Whitepaper section | Evidence | Notes |
|---|---|---|---|---|
| Dev-E (Node variant, active) | Deployed | architecture-current.md | apps/dev-e/rig-agent-helmrelease.yaml | Primary runtime, cron-dispatched every 5 min |
| Dev-E (dotnet variant) | Partial | architecture-current.md | HelmRelease exists, cron.enabled: false | Functionally dormant. See cleanup recommendation. |
| Dev-E (python variant) | Partial | architecture-current.md | HelmRelease exists, likely dormant like dotnet | Same shape as dotnet variant |
| Review-E | Deployed | architecture-current.md | apps/review-e/rig-agent-helmrelease.yaml | |
| Spec-E (intake refiner) | Planned | trust-model.md, development-process.md | Phase 7 | Clarifier gate at issue intake |
| Architect-E (interface shaper) | Planned | trust-model.md | Phase 7 | High-bar role for T2 interface design |
| Dev-E repair-dispatch mode | Planned | self-healing.md | Phase 7 | Not a separate agent — a Dev-E dispatch mode |

### Safety

| Capability | Status | Whitepaper section | Evidence | Notes |
|---|---|---|---|---|
| Dangerous-command guard | Planned | safety.md, example-first-story.md | Phase 0 item 1 | First user story; fully specified in example-first-story.md |
| Git worktrees per agent task | Planned | architecture-proposed-v2.md | Phase 0 | Cursor 2026 pattern |
| Default-deny egress NetworkPolicy | Planned | safety.md, security.md | Phase 0 (but needs Cilium L7) | |
| Hook reliability spool | Planned | architecture-proposed-v2.md | Phase 1 | At-least-once event delivery |
| StuckGuard middleware (5 patterns) | Planned | safety.md, architecture-proposed-v2.md | Phase 2 | OpenHands + Goose + Sweep convergence |
| Human Prime SessionStart hook | Planned | architecture-proposed-v2.md | Phase 1 | For humans using Claude Code locally |
| CaMeL trust separation (privileged + quarantined) | Planned | safety.md | Phase 6 | Only prompt-injection defense with a formal guarantee |
| Schema-validated tool use (Pydantic / Instructor) | Planned | safety.md | | Not yet tied to a phase; continuous as tools are added |

### Security (supply chain + runtime)

| Capability | Status | Whitepaper section | Evidence | Notes |
|---|---|---|---|---|
| SOPS + age + Flux inline decryption | Deployed | security.md, docs/sops.md | .sops.yaml at repo root + every Kustomization uses decryption.provider: sops | The right answer all along (verified via three rounds of retraction) |
| GitHub App installation tokens (1h TTL) | Planned | security.md | Phase 0 adjacent | |
| Sigstore image signing (cosign, keyless) | Planned | security.md | Phase 4 | |
| SLSA v1.0 L3 build provenance | Planned | security.md | Phase 4 | Via slsa-framework/slsa-github-generator |
| Gitsign commit signing (agent commits) | Planned | security.md | Phase 4 | Out-of-band CI verification; GitHub "Verified" gotcha documented |
| Kyverno admission policies | Planned | security.md, trust-model.md | Phase 4 | Native Sigstore verification |
| Two-attestor T3 Kyverno policy | Planned | security.md, limitations.md | Phase 4 | Structural limit on 1-person rigs acknowledged |
| Cilium L7 egress allowlist | Planned | security.md, safety.md | Phase 0 (prerequisite to phase 2) | Biggest-ROI prompt-injection defense |
| cert-manager + trust-manager | Planned | security.md, tool-choices.md | | Table stakes, non-controversial |
| Bitwarden human vault | Deployed | tool-choices.md | In team workflow | |
| Mandatory 2FA on GitHub | Deployed | | Organization policy | Not in whitepaper but worth tracking |

### Observability

| Capability | Status | Whitepaper section | Evidence | Notes |
|---|---|---|---|---|
| OpenTelemetry Collector | Partial | observability.md | Deployed for Conductor-E | Agents not yet emitting OTel GenAI spans |
| Claude Code native OTel emission | Planned | observability.md, provider-portability.md | Phase 2 | Set CLAUDE_CODE_ENABLE_TELEMETRY=1 in agent pods |
| Langfuse self-hosted (or Phoenix on 8GB VM) | Planned | observability.md | Phase 2 | Conditional on VM size — Phoenix if we stay on 8GB |
| Local Prometheus | Partial | observability.md | kube-prometheus-stack deployed | Not yet source of truth for Flagger gates (Flagger not deployed yet) |
| Grafana Cloud Free ingest | Planned | observability.md | Phase 2 | OTel Collector → managed |
| SLO burn-rate alerts (Honeycomb pattern) | Planned | observability.md, self-healing.md | Phase 5 | |
| Cost dashboard (per-agent, per-task) | Partial | observability.md, cost-framework.md | Basic cost tracking exists (TokenUsageProjection) | No LiteLLM proxy yet, so no hard enforcement |

### Cost framework

| Capability | Status | Whitepaper section | Evidence | Notes |
|---|---|---|---|---|
| TokenUsage event + projection | Deployed | cost-framework.md | src/ConductorE.Api/Adapters/MartenProjections.cs | Aggregates per agent × repo |
| LiteLLM proxy | Planned | cost-framework.md, tool-choices.md | Phase 2 | Hard ceiling for per-key budgets |
| Per-agent virtual keys + budget caps | Planned | cost-framework.md | Phase 2 | Depends on LiteLLM proxy |
| Pre-flight cost prediction (cheap model) | Planned | cost-framework.md | Phase 2 | Haiku or local Ollama for estimation |
| Circuit breaker on 529 storms | Planned | cost-framework.md | Phase 2 | |
| Prompt caching (stable system prompts) | Planned | cost-framework.md | Phase 2 | Claude Code does this automatically |
| Cross-provider fallback routing (LiteLLM fallback_models) | Deferred | provider-portability.md | | Adopt when we have multiple providers configured |

### Self-healing

| Capability | Status | Whitepaper section | Evidence | Notes |
|---|---|---|---|---|
| Flagger canary deploys | Planned | self-healing.md | Phase 5 | Flux-native progressive delivery |
| flagd + OpenFeature kill switches | Deferred | self-healing.md, tool-choices.md | | YAGNI — env vars + Kustomize cover today |
| pgroll expand/contract migrations | Planned | self-healing.md, tool-choices.md | Phase 5 | With inspectable-SQL-trail hedge |
| Reproduction harness (ephemeral namespace) | Planned | self-healing.md | Phase 5 (Stage 2) | Frontier work, honest |
| Repair-dispatch Dev-E mode | Planned | self-healing.md | Phase 5 (Stage 2) | Confidence thresholds are calibration-gated |
| Kill-switch → rollback → forward-fix priority order | Planned | self-healing.md, principles.md (principle 3) | Phase 5 | Principle says reversible before irreversible |
| Post-incident learning loop | Planned | self-healing.md | Phase 5 (Stage 4, aspirational) | |

### Quality and evaluation

| Capability | Status | Whitepaper section | Evidence | Notes |
|---|---|---|---|---|
| Nightly golden suite + regression cases | Planned | quality-and-evaluation.md | Phase 2 | ~$3-8/night; the regression gate |
| Weekly SWE-bench Pro subset | Planned | quality-and-evaluation.md | Phase 2 | ~$20-40/week; the trend line |
| Quarterly LiveCodeBench | Deferred | quality-and-evaluation.md | | Cut first if budget tightens |
| Property-based test generation (Hypothesis) | Planned | quality-and-evaluation.md, safety.md | Phase 2 | Label-gated, not every PR |
| LLM-as-judge sampling (10% T0, 100% T2) | Planned | quality-and-evaluation.md | Phase 2 | |
| DORA metrics adapted to agents | Planned | quality-and-evaluation.md | Phase 2 | Lead time, CFR, rework rate, rollback rate |
| Inspect AI (UK AISI) | Deferred | tool-choices.md | | Emerging pick; validate in Era 2 |

### Drift detection

| Capability | Status | Whitepaper section | Evidence | Notes |
|---|---|---|---|---|
| Model drift: 20-prompt canary suite | Planned | drift-detection.md | Phase 6 | Per-provider |
| Prompt drift: golden-suite regression on prompt changes | Planned | drift-detection.md, quality-and-evaluation.md | Phase 2 | Blocks merge on regression |
| Code drift: Flux reconciliation events | Partial | drift-detection.md | Flux detects | Not yet alerted on |
| Config drift: Flux + kube-diff | Partial | drift-detection.md | Flux detects | |
| Kyverno policy drift detector | Planned | drift-detection.md | Phase 4 | P0/P1 alerts for T3 policies |
| Memory drift: repeat-query canary | Deferred | memory.md | | Fifth channel; not yet in drift-detection.md |

### Memory

| Capability | Status | Whitepaper section | Evidence | Notes |
|---|---|---|---|---|
| Postgres + pgvector storage | Deployed | memory.md | conductor-e/postgres-0 | Co-located with Marten |
| HNSW + GIN indexes | Deployed | memory.md | Schema in rig-memory-mcp/db.js | |
| OpenAI text-embedding-3-small embeddings (optional) | Deployed | memory.md | OPENAI_API_KEY injected | Silent fallback to BM25-only if missing |
| search_memories MCP tool | Deployed | memory.md | rig-memory-mcp | Hybrid vector + BM25 |
| write_memory MCP tool | Partial | memory.md | Works when called | Agents rarely call it |
| save_pattern (auto via `### Learnings` scrape) | Partial | memory.md | Pipeline exists | Broken — agents don't emit the section |
| mark_used (hit counter) | Partial | memory.md | Tool exists | Agents don't call it — metric is 0% |
| compact_repo | Partial | memory.md | Tool exists | No cron triggers it |
| Session-start memory LOAD | Deployed | memory.md | `[Stream] Loaded memory for <repo>` in logs | |
| 4-tier scope enforcement (session/task/repo/global) | Planned | memory.md | | Aspirational today; soft-tagging in practice |
| hit_used real metric (citation-enforced or LLM-judge) | Planned | memory.md | | Current metric is fiction |
| Advisor handoff protocol | Deployed | memory.md | PR #71 | Prompt-level only, zero enforcement |
| Memory-write gate (validated writes + attestation) | Planned | memory.md (security section) | | Memory-poisoning defense |
| Memory TTL pruning cron | Planned | memory.md | expires_at column exists, no job | |
| Memory compaction cron | Planned | memory.md | compact_repo exists, no trigger | |

### Cluster and runtime

| Capability | Status | Whitepaper section | Evidence | Notes |
|---|---|---|---|---|
| k3s on single GCP VM (8 GB) | Deployed | architecture-current.md, tool-choices.md | invotek-k3s | |
| KEDA event-driven autoscaling | Deployed | architecture-current.md | ScaledObject per agent | |
| FluxCD GitOps | Deployed | architecture-current.md | rig-gitops → cluster | |
| GitHub Actions + GHCR | Deployed | | Per-repo CI, images published | |
| Cloudflare Tunnel (conductor-e.dashecorp.com) | Deployed | architecture-current.md | apps/cloudflared/ | |
| Discord agent channels + webhooks | Deployed | architecture-current.md | Conductor-E event listener posts | |
| Tablez/rig-agent-runtime/Stig-Johnny residue cleanup | Planned | | Cleanup — see tier-1 recommendation | Mentioned in earlier session feedback |

### Development process

| Capability | Status | Whitepaper section | Evidence | Notes |
|---|---|---|---|---|
| AGENTS.md cross-tool standard | Deployed | provider-portability.md, architecture-current.md | All repos import from rig-gitops/AGENTS.md | |
| TaskSpec format (YAML) | Planned | trust-model.md, development-process.md | Era 2 | |
| Spec Kit .specify/ for multi-PR work | Deferred | development-process.md, provider-portability.md | | Adopt for changes bigger than a single PR |
| Tier-classifier (T0/T1/T2/T3 policy engine) | Planned | trust-model.md | policy/blast-radius.yaml | |
| Autonomy tier promotion projection | Planned | trust-model.md, quality-and-evaluation.md | | "20 successful runs, zero rollbacks" pattern |
| Weekly 30-min quality review ritual | Planned | development-process.md | | Most load-bearing ritual |
| Process SLOs (PR lead time, rework rate, etc.) | Planned | development-process.md | | Including T2 approval turnaround |
| Mermaid CI check on every PR touching .md | Deployed | | .github/workflows/mermaid-check.yml | Shipped in rig-gitops#54 |

## Rejected (explicitly considered and not pursuing)

| Capability | Source of rejection | Why |
|---|---|---|
| HashiCorp Vault | tool-choices.md | BSL license, IBM ownership, 3-node HA operational cost; OpenBao is the fork we'd adopt if we ever needed Vault-class |
| SealedSecrets | tool-choices.md retraction log | Never deployed; SOPS was always the pick |
| Argo Rollouts (with Flux) | tool-choices.md, self-healing.md | Fights Flux field reconciliation; Flagger is Flux-native |
| Unleash (OSS) | tool-choices.md | Reached OSS EOL 2025-12-31 |
| LaunchDarkly | tool-choices.md | SaaS-only, overkill for 1-2 person team |
| Doppler | tool-choices.md | SaaS-only; a Doppler outage = our deploy outage |
| Keptn | tool-choices.md | CNCF-archived 2025-09-03 |
| Reshape (Postgres migrations) | tool-choices.md | Single-author project, bus factor 1 |
| microVMs (Firecracker/e2b/Daytona) | tool-choices.md | Wrong threat model for internal rig |
| Full self-hosted LGTM stack on 8GB | tool-choices.md, observability.md | Memory-starves the rig |
| HSM-backed PGP signing | tool-choices.md | Keyless Sigstore is better for our threat model |
| OPA Gatekeeper | tool-choices.md | Rego operational cost; Kyverno YAML wins at small team |
| Pr-workflow-guard (Gastown pattern) | architecture-proposed-v2.md | Blocks gh pr create — opposite of our PR-based model |

## How to keep this doc current

Today (v1 — manual):

  1. On new ticket: add a row to the relevant domain table with Status: Planned and a link to the ticket
  2. On PR merge: update the row — Planned → Deployed (or Partial if there's a known gap). Update the Evidence column with the merge commit SHA or live resource path
  3. On retraction: update status. Add an honest note if the capability was reduced in scope
  4. Weekly review ritual (development-process.md): include a 5-minute status-doc scan. Is anything stale? Any ticket that's moved past its status? Any capability without a row that should have one?
  5. Monthly: validate Evidence column for 5 random deployed rows — kubectl get, grep the repo, or similar. Catches doc-vs-reality drift.
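The monthly spot-check lends itself to a tiny script. A minimal sketch, assuming the doc's tables have been parsed into row dicts; the three rows below are illustrative stand-ins, not the real 78:

```python
import random

def sample_deployed(rows, k=5, seed=None):
    """Pick up to k Deployed rows whose Evidence column should be re-verified."""
    deployed = [r for r in rows if r["status"] == "Deployed"]
    rng = random.Random(seed)
    return rng.sample(deployed, min(k, len(deployed)))

# Illustrative rows only — the real doc tracks 78 capabilities.
rows = [
    {"capability": "FluxCD GitOps", "status": "Deployed", "evidence": "rig-gitops -> cluster"},
    {"capability": "LiteLLM proxy", "status": "Planned", "evidence": "Phase 2"},
    {"capability": "KEDA autoscaling", "status": "Deployed", "evidence": "ScaledObject per agent"},
]

for row in sample_deployed(rows, k=5, seed=7):
    print(f"verify: {row['capability']} -> {row['evidence']}")
```

Each printed line is a manual to-do (kubectl get, grep the repo), not an automated check; automation is the v2/v3 territory below.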

### Proposed v2 automation (follow-up)

A GitHub Action that queries issues with whitepaper:* labels, aggregates by domain, regenerates the tables in this doc, commits back. Requirements:

  - Label convention: every issue implementing a whitepaper capability gets a label like whitepaper:safety/stuck-guard (domain / capability slug)
  - Front-matter source: a capabilities.yaml that lists all capabilities with their canonical whitepaper-section link, so rows persist even when no issue exists
  - Action: cron-weekly or on-issue-update; merges the live GitHub Issues state into the YAML, then regenerates the markdown tables
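The merge step might look like this. A minimal sketch under the assumptions above: the function names, the closed-issue-means-Deployed rule, and the sample data are illustrative choices, not a spec of the eventual Action:

```python
def merge_status(capabilities, issues):
    """Overlay live issue state onto the canonical capability list.

    capabilities: {"safety/stuck-guard": {"name": ..., "section": ..., "status": ...}, ...}
    issues: [{"labels": ["whitepaper:safety/stuck-guard"], "state": "closed"}, ...]
    A closed issue flips the row to Deployed; rows with no matching issue
    keep their capabilities.yaml status, so they never disappear.
    """
    state_by_slug = {}
    for issue in issues:
        for label in issue["labels"]:
            if label.startswith("whitepaper:"):
                state_by_slug[label.removeprefix("whitepaper:")] = issue["state"]
    merged = {}
    for slug, cap in capabilities.items():
        cap = dict(cap)  # don't mutate the YAML-sourced input
        if state_by_slug.get(slug) == "closed":
            cap["status"] = "Deployed"
        merged[slug] = cap
    return merged

def render_table(merged):
    """Regenerate one markdown status table from the merged rows."""
    lines = ["| Capability | Status | Whitepaper section |", "|---|---|---|"]
    for cap in merged.values():
        lines.append(f"| {cap['name']} | {cap['status']} | {cap['section']} |")
    return "\n".join(lines)
```

The real Action would fetch issues via the GitHub API and write the result back; this only shows the pure merge-and-render core, which is the testable part.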

Effort estimate: ~1-2 days. Worth it when this doc has accumulated enough content that manual maintenance starts slipping — probably after 40+ tracked capabilities reach Deployed/Partial status.

### Proposed v3 (longer-term)

Live status from cluster state + Conductor-E event store, not from this doc. "Capability X is Deployed" verified by presence of the resource (HelmRelease exists, Kyverno policy applied, Flagger Canary running). Drift between claimed and actual state surfaces as an alert. This is the full-fidelity version; worth pursuing once the rig can reliably inspect itself.
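The claimed-vs-actual comparison is ultimately a set difference. A minimal sketch with hypothetical capability-to-resource mappings; in practice the observed set would come from querying the cluster (e.g., listing HelmReleases), and a non-empty report would raise the drift alert:

```python
def drift_report(claimed, observed):
    """claimed: {capability: expected resource name}; observed: set of live resources.
    Returns capabilities this doc calls Deployed whose resource is absent."""
    return sorted(cap for cap, resource in claimed.items() if resource not in observed)

# Hypothetical mapping — real entries would come from capabilities.yaml.
claimed = {
    "Review-E": "review-e",               # HelmRelease expected to exist
    "Flagger canary deploys": "flagger",  # hypothetical: claimed but not live
}
observed = {"review-e", "dev-e", "conductor-e"}
print(drift_report(claimed, observed))  # → ['Flagger canary deploys']
```

The inverse check (resources running that no row claims) falls out of the same data for free: `observed - set(claimed.values())`.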

## What this doc is not

  - Not a roadmap. The roadmap lives in index.md's Phase 0-7 Gantt. This doc tracks status per capability; the roadmap sequences them.
  - Not a substitute for the whitepaper. Each row links to the authoritative whitepaper section. The status tells you whether it's real; the whitepaper tells you what it is and why.
  - Not a change log. Retractions and evolutions live in each doc's own retraction log (especially tool-choices.md). This doc is a snapshot of current reality.

## See also