
Worked Example — Planning the First User Story

TL;DR

Everything in the whitepaper describes how work should be planned. This document is a concrete application of that process to the first user story we would implement: the dangerous-command guard. You'll see the TaskSpec, a five-issue split across three repos, the dependency graph, each issue's acceptance criteria, and the merge sequence. It's a template: every T1/T2 story should look roughly like this.

The chain up to this point: trust-model.md defines the tiers; development-process.md defines the process; tool-choices.md defines what to use. This document is the first application — what it actually looks like when the process is applied to one small story.

Working through the actual first story (rather than a hypothetical example) is deliberate: when this whitepaper is read six months from now, the reader can check the plan against what actually shipped.

The story

As a human or agent developer using Claude Code, I want a PreToolUse hook that rejects destructive shell commands before execution, so that confused or compromised sessions cannot cause irreversible damage to the filesystem, cluster, database, or source control.

Runtime portability

The story is written for Claude Code because that's the default runtime (Era 1 pair mode). The same PreToolUse hook pattern applies to Codex CLI and Gemini CLI — each has an equivalent hook surface (see provider-portability.md). The shell-level block logic itself is runtime-independent.

Why this one first

The rationale, evaluated against phase-0 alternatives:

| Phase-0 item | Bounded? | Low blast radius? | Immediate value? | Pair-mode OK? | Teaches reusable pattern? |
|---|---|---|---|---|---|
| Dangerous-command guard | ✅ ~80 lines | ✅ T1 | ✅ from install onward | ✅ | ✅ hook scaffolding for StuckGuard, Prime, conductor-e-hook |
| Agent identity in git | ✅ trivial | ✅ T1 | ⚠️ only for Era 2 agent commits | | |
| Default-deny egress NetworkPolicy | ⚠️ needs Cilium L7 | ⚠️ cluster-wide | ✅ high | ⚠️ | ⚠️ |
| Git worktrees | ❌ deferred till scale >1 | | | | |
| Hook reliability spool | ⚠️ deferred till hooks widespread | ⚠️ | | | |

The guard is the only item that scores well on every axis. It also provides value in the session where it's written — the same pair-mode session implementing it can install it and immediately benefit.

The TaskSpec

This is what Spec-E would commit to Conductor-E in Era 2. In Era 1 (pair mode), the human writes it by hand. Full shape:

id: task-2026-04-guard-01
title: "Dangerous-command guard (phase 0 item 1)"
repo_primary: dashecorp/rig-tools
repos_touched:
  - dashecorp/rig-tools
  - dashecorp/conductor-e
  - dashecorp/rig-gitops
tier: T1
blast_radius:
  reason: "Single feature shipped across 3 repos, each repo's change is T1-bounded. Canary-gated at deploy."
  surfaces:
    - "rig-tools: hooks/dangerous-command-guard.sh, install.sh, tests/"
    - "conductor-e: src/ConductorE.Core/Domain/Events.cs (+1 event type)"
    - "rig-gitops: apps/dev-e/*, apps/review-e/* (HelmRelease values)"
  evaluated_by: architect (human, pair mode)
  evaluated_at: "2026-04-16"
acceptance_criteria:
  - "Guard script blocks every pattern in the Safety blocklist"
  - "Guard script allows the explicit exceptions (rm -rf ./local, --force-with-lease)"
  - "Block emits GuardBlocked event to Conductor-E (best-effort)"
  - "install.sh registers guard idempotently"
  - "CI test verifies block/allow coverage stays green"
  - "Dev-E and Review-E HelmReleases ship the guard by default"
  - "rig-tools README and safety.md cross-reference the implementation"
test_strategy:
  unit: "20+ block cases, 15+ allow cases in Bats or plain shell"
  integration: "Synthetic PreToolUse JSON on stdin, verify exit codes"
  deploy: "Canary promotion gated on synthetic block-test against a running pod"
non_goals:
  - "MCP tool scoping (separate story)"
  - "Prompt-injection defense (L7 egress + CaMeL, separate stories)"
  - "Override / bypass flag (explicit design rejection  see safety.md)"
expected_effort_tokens: 60000
assigned_agent: null   # pair mode, not dispatched

Notice: the TaskSpec names three touched repos but fits in one story because the parts are small and tightly coupled. That's normal for phase 0 — foundational pieces span repos.

Issue decomposition

Monolithic vs. split PRs:

Small PRs over one big one

One 500-line PR touching three repos is a review nightmare and blocks fast feedback. Five 50–150-line PRs, each confined to one repo with a clear boundary and explicit dependencies, merge faster and roll back more cleanly. The cost is a tracking issue plus dependency management; at our scale that cost is negligible compared to the benefit.

Five sub-issues + one tracking issue

graph TB
    classDef done fill:#c8e6c9,stroke:#2e7d32,color:#000
    classDef ready fill:#bbdefb,stroke:#1565c0,color:#000
    classDef blocked fill:#ffe0b2,stroke:#e65100,color:#000
    classDef tracker fill:#f3e5f5,stroke:#6a1b9a,color:#000

    T[Tracking issue<br/>rig-gitops]:::tracker
    I1[#1 rig-tools<br/>guard script + unit tests + CI]:::ready
    I2[#2 rig-tools<br/>install.sh integration + README]:::blocked
    I3[#3 conductor-e<br/>GuardBlocked event type + projection]:::ready
    I4[#4 rig-tools<br/>guard emits GuardBlocked]:::blocked
    I5[#5 rig-gitops<br/>deploy guard to Dev-E + Review-E HelmRelease]:::blocked

    T -.->|parent of| I1
    T -.->|parent of| I2
    T -.->|parent of| I3
    T -.->|parent of| I4
    T -.->|parent of| I5

    I1 -->|merge before| I2
    I1 -->|merge before| I4
    I3 -->|merge before| I4
    I2 -->|merge before| I5
    I4 -->|merge before| I5

Dependency summary:
- #1 and #3 are independent; they can ship in parallel
- #2 depends on #1 (install wiring needs the script)
- #4 depends on #1 and #3 (emission needs both the script and the event type)
- #5 depends on #2 and #4 (deploy needs installable package + working event pipeline)
- Tracking (#T) is created first, updated as each sub-issue closes

Per-issue specs

Issue #1 — Guard script + unit tests + CI workflow

Repo: dashecorp/rig-tools
Title: feat: add dangerous-command guard script with unit tests
Labels: enhancement, agent-ready, area:safety, phase-0, tier:t1
Effort: ~1-2 pair-mode hours
Depends on: none (independent)
Unblocks: #2, #4

Body sketch:

## Problem

Agents (and humans) running Claude Code can execute destructive shell
commands — `rm -rf /`, `git push --force`, `sudo`, DROP TABLE, kubectl
namespace deletion. No guard exists today. Phase 0 item 1 from the
trusted rig whitepaper.

## Solution

Add `hooks/dangerous-command-guard.sh` that:
- reads Claude Code PreToolUse JSON from stdin
- matches `tool_input.command` against the blocklist documented in
  whitepaper/safety.md
- exits 0 (allow) or exits 2 (block) with a one-line reason

Plus unit tests in `tests/guard.test.sh` covering 20+ block cases and
15+ allow cases. Plus a CI workflow that runs the tests on every PR.

## Acceptance criteria

- [ ] Script file at `hooks/dangerous-command-guard.sh`
- [ ] All blocklist patterns from safety.md are covered
- [ ] All explicit exceptions (rm -rf ./local, --force-with-lease) pass
- [ ] Unit tests in `tests/guard.test.sh` with 20+ block, 15+ allow cases
- [ ] CI workflow `.github/workflows/guard-test.yml` runs tests on PR
- [ ] README.md adds a "Guards" section linking to the script + tests

## Non-goals

- Install / registration wiring (#2)
- Conductor-E event emission (#4)
- Agent runtime deployment (#5)

Test plan:
- bash tests/guard.test.sh passes locally
- CI green on PR
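To make the block/allow contract concrete, here is a minimal sketch of the guard's matching core. It is illustrative only: the authoritative blocklist lives in whitepaper/safety.md, the pattern list here is abridged, and the command arrives as a function argument rather than via the PreToolUse JSON stdin parsing the real script does.

```shell
#!/usr/bin/env bash
# Illustrative sketch of the guard's matching core (abridged blocklist;
# the real script reads PreToolUse JSON on stdin and extracts
# .tool_input.command with jq before matching).

BLOCK_PATTERNS=(
  '^rm -rf /($| )'                 # root wipe, but not rm -rf /tmp/fake
  'git push --force($| )'          # plain --force; --force-with-lease passes
  '^sudo '
  'kubectl delete (namespace|ns) '
  'DROP TABLE'
)

check_cmd() {
  local cmd="$1" pat
  # Explicit allow exceptions are checked before the blocklist
  case "$cmd" in
    *--force-with-lease*) echo ALLOW; return 0 ;;
  esac
  for pat in "${BLOCK_PATTERNS[@]}"; do
    if [[ "$cmd" =~ $pat ]]; then
      echo BLOCK
      return 2          # exit 2 = block, per the hook contract
    fi
  done
  echo ALLOW
  return 0              # exit 0 = allow
}

check_cmd 'rm -rf /tmp/fake'                          # ALLOW (local path)
check_cmd 'git push --force origin main' || true      # BLOCK
check_cmd 'git push --force-with-lease origin main'   # ALLOW
```

The exit codes (0 allow, 2 block) mirror the acceptance criteria above, so the same function shape drops into tests/guard.test.sh for the 20+ block / 15+ allow cases.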

Issue #2 — install.sh integration + README

Repo: dashecorp/rig-tools
Title: feat: wire dangerous-command guard into install.sh
Labels: enhancement, agent-ready, area:safety, phase-0, tier:t1
Effort: ~30-60 minutes
Depends on: #1 merged
Unblocks: #5

Body sketch:

## Problem

Script exists (#1) but users have to register it manually. Automate.

## Solution

Update `install.sh` to:
- copy the guard script to a stable install path (`~/.rig/hooks/`
  by default, override via env var)
- register the guard in `~/.claude/settings.json` as a PreToolUse
  hook, merging with existing settings
- be idempotent — running install twice produces no duplication
- print a clear "installed / already installed" message

Plus README documentation of install flow + uninstall note.

## Acceptance criteria

- [ ] Running `./install.sh` registers the guard in `~/.claude/settings.json`
- [ ] Running it twice is a no-op (idempotent)
- [ ] Uninstall procedure documented (one-line settings.json edit)
- [ ] README "Getting Started" section updated
- [ ] Integration test: fresh install against a temp HOME succeeds

## Non-goals

- Deploying to agent pods (#5)
- Conductor-E wiring (#4)
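The idempotent merge step can be sketched with jq. Everything here is an assumption for illustration: the `{"hooks": {"PreToolUse": [...]}}` settings shape should be verified against the Claude Code docs before shipping, and `CLAUDE_SETTINGS` / `RIG_HOOKS_DIR` are hypothetical override variables, not established names.

```shell
# Sketch of the idempotent registration step in install.sh.
# Settings shape and env-var names are illustrative assumptions.
install_guard_hook() {
  local settings="${CLAUDE_SETTINGS:-$HOME/.claude/settings.json}"
  local hook="${RIG_HOOKS_DIR:-$HOME/.rig/hooks}/dangerous-command-guard.sh"

  mkdir -p "$(dirname "$settings")"
  [ -f "$settings" ] || echo '{}' > "$settings"
  cp "$settings" "$settings.bak"     # keep a backup before merging

  # Already registered? Search every string value for the hook path.
  if jq -e --arg h "$hook" '[.. | strings] | index($h)' "$settings" >/dev/null; then
    echo "already installed"
    return 0
  fi

  # Merge, preserving whatever else lives in settings.json.
  jq --arg h "$hook" \
     '.hooks.PreToolUse = ((.hooks.PreToolUse // []) + [{hooks: [{type: "command", command: $h}]}])' \
     "$settings" > "$settings.tmp" && mv "$settings.tmp" "$settings"
  echo "installed"
}

# Demo against a throwaway path: running twice is a no-op.
CLAUDE_SETTINGS="$(mktemp -d)/settings.json"
export CLAUDE_SETTINGS
install_guard_hook   # installed
install_guard_hook   # already installed
```

Backing up before merging and scanning for the hook path before appending is what makes the double-run a no-op, which is exactly the integration test against a temp HOME.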

Issue #3 — GuardBlocked event type + projection

Repo: dashecorp/conductor-e
Title: feat: add GuardBlocked event type + projection
Labels: enhancement, agent-ready, area:conductor-e, phase-0, tier:t1
Effort: ~1 hour
Depends on: none (independent of #1)
Unblocks: #4

Body sketch:

## Problem

The guard (in rig-tools) needs a place to report block events so the
rig has visibility into how often destructive commands are being
attempted. Spikes indicate prompt injection, agent bugs, or misconfig.

## Solution

Add to `src/ConductorE.Core/Domain/Events.cs`:

    public record GuardBlocked(
        string AgentId,
        string Repo,
        string Pattern,       // which blocklist pattern triggered
        string Command,       // the command text (truncated to 200 chars)
        DateTimeOffset At
    ) : IRigEvent;

Add a lightweight `GuardBlockedProjection` that tallies counts per
agent × pattern over a rolling 24h window — exposed as
`GET /api/guards/blocks-summary`. This surfaces as a dashboard tile.

Wire the SubmitEvent endpoint to accept the new type.

## Acceptance criteria

- [ ] Event record in Events.cs
- [ ] Projection in Adapters/MartenProjections.cs
- [ ] New endpoint GET /api/guards/blocks-summary returns the projection
- [ ] Unit test submitting a GuardBlocked and verifying the projection
- [ ] Integration test via ConductorEApiFactory

## Non-goals

- Emitting the event from guard.sh (#4)
- Dashboard UI work (follow-up story once we have eyes on the metric)

Issue #4 — Guard emits GuardBlocked

Repo: dashecorp/rig-tools
Title: feat: guard emits GuardBlocked to Conductor-E on match
Labels: enhancement, agent-ready, area:safety, phase-0, tier:t1
Effort: ~30-60 minutes
Depends on: #1 merged AND #3 merged
Unblocks: #5

Body sketch:

## Problem

Guard blocks commands (#1) and Conductor-E accepts GuardBlocked
events (#3), but they're not connected.

## Solution

In `hooks/dangerous-command-guard.sh`, on a block match, fire a
GuardBlocked event via the existing `conductor-e-hook` utility
(best-effort, async, with 1s timeout — failure must NOT unblock
the guard).

Update tests/guard.test.sh to assert emission when a block fires
(mock the conductor-e-hook for deterministic testing).

## Acceptance criteria

- [ ] On block, guard POSTs a GuardBlocked event (best-effort)
- [ ] curl failure or Conductor-E unreachable → guard still exits 2
- [ ] Timeout ≤ 1s on the curl
- [ ] Unit test asserts emission using a mocked curl
- [ ] Integration test (optional) against a live Conductor-E in
      devcontainer CI

## Non-goals

- Dashboard UI (follow-up)
- Retry / spool (that's phase 1's hook reliability spool story)
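The best-effort emission might look like the sketch below. The endpoint URL, path, and payload fields are illustrative assumptions; the real script goes through the existing conductor-e-hook utility rather than raw curl.

```shell
# Sketch of best-effort GuardBlocked emission. Endpoint and payload
# shape are illustrative; the real call uses conductor-e-hook.
emit_guard_blocked() {
  local pattern="$1" cmd="$2"
  local payload
  payload=$(jq -n --arg p "$pattern" --arg c "${cmd:0:200}" \
    '{type: "GuardBlocked", pattern: $p, command: $c}')
  # Fire-and-forget: 1s ceiling, backgrounded, all failures swallowed.
  # The guard's exit 2 never depends on this call succeeding.
  curl -sS --max-time 1 -X POST \
       -H 'Content-Type: application/json' \
       -d "$payload" \
       "${CONDUCTOR_E_URL:-http://localhost:5000}/api/events" \
       >/dev/null 2>&1 &
  return 0
}

# Even with Conductor-E unreachable, the caller is unaffected:
CONDUCTOR_E_URL="http://127.0.0.1:9" emit_guard_blocked 'rm -rf /' 'rm -rf /'
echo "guard continues to exit 2 regardless"
```

Backgrounding the curl and discarding its status is what satisfies the "curl failure → guard still exits 2" criterion; the mocked-curl unit test only has to assert the function was invoked with the right payload.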

Issue #5 — Deploy guard to Dev-E + Review-E HelmReleases

Repo: dashecorp/rig-gitops
Title: feat: mount dangerous-command guard into Dev-E + Review-E pods
Labels: enhancement, agent-ready, area:gitops, phase-0, tier:t1
Effort: ~1 hour
Depends on: #2 merged AND #4 merged
Unblocks: closes the story

Body sketch:

## Problem

Script exists (#1), installs for humans (#2), emits events (#3,#4),
but does not protect agents in the cluster until shipped in the
HelmRelease.

## Solution

Update `apps/dev-e/*-helmrelease.yaml` and `apps/review-e/*-helmrelease.yaml`
to:
- pull the guard via rig-tools devcontainer init container
- mount it at the expected hooks path
- register in the agent's claude settings (already done via install.sh
  in the devcontainer post-create, verify it runs)

Flagger canary stays in front: first deploy to dev-e-node at 10% for
1 hour, promote if SLI holds, then dev-e-dotnet, python, review-e.

## Acceptance criteria

- [ ] Dev-E (node) pod has guard at expected path
- [ ] Dev-E (dotnet) and Dev-E (python) same
- [ ] Review-E same
- [ ] Flagger canary promoted cleanly
- [ ] Synthetic test: trigger a block inside the running pod,
      verify GuardBlocked event arrives in Conductor-E
- [ ] Follow-up T2 story filed for a Kyverno admission policy requiring
      the guard annotation on agent pods (the policy itself is a non-goal here)

## Non-goals

- Kyverno admission policy requiring the guard (out of scope, separate
  T2 story — we add the guard first, then enforce it)

Tracking issue — rig-gitops

Repo: dashecorp/rig-gitops
Title: tracking: dangerous-command guard (phase 0 item 1)
Labels: tracking, phase-0, epic

Body sketch:

Parent tracking issue for the dangerous-command guard story. Links
into the plan at docs/whitepaper/example-first-story.md.

## Sub-issues

- [ ] dashecorp/rig-tools#NN — guard script + unit tests + CI
- [ ] dashecorp/conductor-e#NN — GuardBlocked event type + projection
- [ ] dashecorp/rig-tools#NN — install.sh integration + README
- [ ] dashecorp/rig-tools#NN — guard emits GuardBlocked
- [ ] dashecorp/rig-gitops#NN — deploy guard to Dev-E + Review-E

## Definition of done for the story

- All 5 sub-issues closed
- Synthetic abuse test passes (see test-plan below)
- Dashboard shows GuardBlocked count (even if zero)
- docs/whitepaper/safety.md cross-references the live implementation

## Abuse test

After everything is deployed, run each of these from inside an agent
shell and verify the guard blocks:

- [ ] `rm -rf /tmp/fake`  → should allow (local path)
- [ ] `rm -rf /` → should block
- [ ] `git push --force-with-lease origin main` → should allow
- [ ] `git push --force origin main` → should block
- [ ] `sudo apt install foo` → should block
- [ ] `kubectl delete namespace foo` → should block
- [ ] `DROP TABLE issues;` via psql → should block
- [ ] Attempting a block while Conductor-E is unreachable → still blocks

Each block emits a GuardBlocked event; query
`GET /api/guards/blocks-summary` to verify the counter incremented.

Rollout sequence

Per the development-process.md release cadence rules for T1:

sequenceDiagram
    participant H as Human+AI (pair)
    participant R1 as dashecorp/rig-tools
    participant R3 as dashecorp/conductor-e
    participant R5 as dashecorp/rig-gitops
    participant P as Flagger canary

    H->>R5: Create tracking issue
    par independent issues
        H->>R1: PR for #1 (guard + tests + CI)
        H->>R3: PR for #3 (event type + projection)
    end
    R1->>R1: CI + human review + merge
    R3->>R3: CI + human review + merge
    par after both merged
        H->>R1: PR for #2 (install wiring)
        H->>R1: PR for #4 (event emission)
    end
    R1->>R1: merge both
    H->>R5: PR for #5 (HelmRelease deploy)
    R5->>R5: merge
    R5->>P: Flux reconciles
    P->>P: 10% canary on dev-e-node
    P->>P: SLI analysis
    P->>P: promote to 50%, then 100%
    P->>H: Promoted
    H->>R5: Close tracking issue

Expected total wall-time: 2-3 pair-mode sessions over 1-2 weeks for a 1-2 person team. Most of the time is the canary promotion wait plus the human review cycle on each PR, not the actual coding.

Definition of done for the whole story

  • [ ] All 5 sub-issues closed
  • [ ] Tracking issue closed
  • [ ] Abuse test suite passes end-to-end in the live cluster
  • [ ] GuardBlocked event count visible on a dashboard (even if zero)
  • [ ] docs/whitepaper/safety.md has a "Status: shipped" note pointing at commit SHAs
  • [ ] Post-mortem note added to docs/ if anything surprised us during the rollout (lesson captured for next story)

What could go wrong (anticipated risks)

Enumerate now, not post-incident:

| Risk | Mitigation |
|---|---|
| Guard over-blocks a legitimate command | Unit test coverage on allow cases; explicit design doc in safety.md names allowed variants |
| Guard's curl to Conductor-E hangs and slows every Bash call | 1-second timeout on curl; async emission; guard exits 2 regardless of curl result |
| Install script breaks someone's existing ~/.claude/settings.json | Idempotent merge, backup the original, integration test with a real settings.json |
| Agent pod comes up without the guard after a Flux reconcile | Flagger SLI includes a synthetic block-test; alert if event count is zero for >1h after deploy (means guard may have been bypassed) |
| Event flood from a looping agent that keeps hitting the guard | Conductor-E GuardBlocked projection has a rate-limit window; if >N/min, alert once then suppress |
| Humans already running Claude Code see unexpected block on a command they use | Docs include the uninstall path (one-line settings.json edit); acknowledge in rollout announcement |

What this example is NOT

This is one T1 story of ~250 the trusted rig needs to ship end-to-end. It is not:

  • A template for T2 stories (those need interface-approval gate)
  • A template for T3 stories (those need two-attestor + human driver)
  • A template for larger epics (use Spec Kit .specify/ layout for those)
  • A prescription of exactly how many sub-issues every story should have (some stories are 1 PR, some are 8; let the dependency graph decide)

What the template is

The pattern here is reusable:

  1. Start with a story in plain-English user-story shape
  2. Write a TaskSpec — tier, blast radius, acceptance criteria, non-goals, expected effort, surfaces touched
  3. Draw the dependency graph — what must ship before what
  4. Split into issues sized for a small PR each (roughly 50-150 lines of change)
  5. Per issue: title, labels, effort, depends-on, unblocks, body with acceptance criteria
  6. Tracking issue on the primary repo with a checklist
  7. Rollout sequence that respects the dependency graph
  8. Definition of done for the whole story, not just each issue
  9. Anticipated risks with mitigations named before any PR opens

Every future user story can follow this nine-step shape. It's the discipline that keeps "architect writes doc" and "implementers write code" aligned.

What happens after this story

Once the dangerous-command guard ships, phase 0 is ~20% done. The remaining phase-0 stories in recommended order:

  1. Agent identity in git — trivial HelmRelease env vars. Unblocks attribution.
  2. Egress NetworkPolicy (requires Cilium L7 check first) — biggest prompt-injection defense.
  3. Git worktrees — smaller per-pod footprint when KEDA scales to >1.
  4. Hook reliability spool — the last piece before Era 2 can begin safely.

Each of those gets its own planning doc like this one — same nine-step shape, different content.

See also