Orphan Issue Detection

Problem

Conductor's dispatch model is pull-based: agents poll GET /api/assignments/next and receive a capability-matched issue. This works well at scale but has a blind spot: if no polling agent matches an issue (e.g. the target repo isn't in any agent's scope), the issue sits in the queue indefinitely with no alert.

Example: dashecorp/rig-memory-mcp#3 was labeled agent-ready on 2026-04-17 and sat four days unclaimed without anyone being paged.

Solution

OrphanScanService runs as a background service inside Conductor, scanning the issue queue every hour and applying a dead-letter pattern.

Thresholds

| Age | OrphanStatus | Action |
| --- | --- | --- |
| > 24h | null | Alert Discord #admin, apply orphan label, emit ISSUE_ORPHANED |
| > 48h | "orphan" | Escalate: alert #admin at high urgency, apply needs-human, remove agent-ready, emit ISSUE_NEEDS_HUMAN |
| Any | "needs_human" | No further action; a human must decide |

Labels

| Label | Color | Meaning |
| --- | --- | --- |
| orphan | #cc317c (magenta) | No agent has claimed this in >24h |
| needs-human | #e11d48 (red) | No agent has claimed in >48h; auto-dispatch stopped |

Both labels are created automatically in the target repo on first use.
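
As a rough illustration of "created on first use", the sketch below builds the JSON bodies that GitHub's "create a label" REST endpoint (POST /repos/{owner}/{repo}/labels) accepts. Whether OrphanScanService calls this endpoint directly is an assumption; the names LABELS and label_payload are hypothetical, and Python is used only for illustration (the service itself is C#). One detail worth noting: the API expects the color as a hex string without the leading "#".

```python
# Label definitions copied from the table above: name -> (color, meaning).
LABELS = {
    "orphan": ("#cc317c", "No agent has claimed this in >24h"),
    "needs-human": ("#e11d48", "No agent has claimed in >48h; auto-dispatch stopped"),
}


def label_payload(name: str) -> dict:
    """Build a request body for creating one of the two labels (hypothetical helper)."""
    color, description = LABELS[name]
    return {
        "name": name,
        "color": color.lstrip("#"),  # GitHub wants "cc317c", not "#cc317c"
        "description": description,
    }
```

A caller would create the label only if it does not already exist in the target repo, then apply it to the issue.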

Events

Two new domain events are emitted into the issue's stream:

ISSUE_ORPHANED   — first orphan detection (>24h)
ISSUE_NEEDS_HUMAN — escalation (>48h, still unclaimed)

These events update the IssueStatus.OrphanStatus read model field:

  • "orphan" after ISSUE_ORPHANED
  • "needs_human" after ISSUE_NEEDS_HUMAN
  • null when the issue is assigned or re-approved (automatically reset)

Read Model Fields Added to IssueStatus

| Field | Type | Description |
| --- | --- | --- |
| QueuedAt | DateTimeOffset | When the issue last entered "queued" state. Reset on IssueApproved and IssueUnassigned. |
| OrphanStatus | string? | null, "orphan", or "needs_human". Reset to null on IssueAssigned. |
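
The field transitions described above can be pictured as a small reducer over the event stream. This is an illustrative Python sketch, not the Marten projection in MartenProjections.cs; the apply_event function and the snake_case field names are hypothetical. It assumes that IssueApproved also clears OrphanStatus, following the "re-approved" reset described in the Events section.

```python
from dataclasses import dataclass, replace
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class IssueStatus:
    queued_at: Optional[datetime] = None   # stands in for QueuedAt
    orphan_status: Optional[str] = None    # None, "orphan", or "needs_human"


def apply_event(status: IssueStatus, event_type: str, at: datetime) -> IssueStatus:
    """Return the read model after applying one event (hypothetical reducer)."""
    if event_type == "IssueApproved":
        # (Re-)approval restarts the queue clock and clears any orphan marker.
        return replace(status, queued_at=at, orphan_status=None)
    if event_type == "IssueUnassigned":
        # Back in the queue: restart the clock.
        return replace(status, queued_at=at)
    if event_type == "IssueAssigned":
        return replace(status, orphan_status=None)
    if event_type == "IssueOrphaned":
        return replace(status, orphan_status="orphan")
    if event_type == "IssueNeedsHuman":
        return replace(status, orphan_status="needs_human")
    return status  # unrelated events leave these two fields untouched
```

Because QueuedAt is only touched by approval and unassignment, the orphan events never reset the age that the scan measures.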

Discord Alerts

24h alert (🔶)

🔶 Orphan issue detected — `rig-memory-mcp#3`
**"feat: add memory store"**
Queue age: 4.1d — no agent has claimed this issue.
https://github.com/dashecorp/rig-memory-mcp/issues/3
Label `orphan` applied. If no agent picks this up within 24h it will escalate to `needs-human`.

48h escalation (🚨)

🚨 ESCALATION — needs human — `rig-memory-mcp#3`
**"feat: add memory store"**
Queue age: 5.0d — no agent has claimed this issue after 48h.
https://github.com/dashecorp/rig-memory-mcp/issues/3
Labels: `agent-ready` removed, `needs-human` applied. Auto-dispatch stopped — human must decide.
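
For illustration, the 24h alert above could be assembled roughly as follows. This is a Python sketch under assumptions, not the service's code: the helper names are hypothetical, and the queue age is assumed to be rendered as days with one decimal, as in the samples. The returned dict uses the `content` field that Discord webhooks accept for plain message text.

```python
from datetime import timedelta

DASH = "\u2014"  # the dash character used in the sample alerts


def format_queue_age(age: timedelta) -> str:
    """Render an age as days with one decimal, e.g. '4.1d' (assumed format)."""
    return f"{age.total_seconds() / 86400:.1f}d"


def orphan_alert(repo: str, number: int, title: str, url: str, age: timedelta) -> dict:
    """Build a webhook JSON body for the 24h alert (hypothetical helper)."""
    lines = [
        f"🔶 Orphan issue detected {DASH} `{repo}#{number}`",
        f'**"{title}"**',
        f"Queue age: {format_queue_age(age)} {DASH} no agent has claimed this issue.",
        url,
        "Label `orphan` applied. If no agent picks this up within 24h "
        "it will escalate to `needs-human`.",
    ]
    return {"content": "\n".join(lines)}
```

The escalation message would follow the same shape with the 🚨 header and the label-change line swapped in.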

Configuration

| Env Var | Required | Description |
| --- | --- | --- |
| DISCORD_ADMIN_WEBHOOK_URL | Optional | Webhook for #admin alerts. If unset, alerts are logged only. |
| GITHUB_TOKEN | Optional | Token for label operations. If unset, labels are skipped. |

Both degrade gracefully: when a variable is unset, the corresponding action is skipped and the service keeps running.
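
A minimal Python sketch of this degradation, for illustration only: the OrphanScanConfig, load_config, and planned_actions names are hypothetical, and treating an empty string the same as an unset variable is an assumption.

```python
import os
from dataclasses import dataclass
from typing import List, Mapping, Optional


@dataclass
class OrphanScanConfig:
    discord_webhook_url: Optional[str]
    github_token: Optional[str]


def load_config(env: Mapping[str, str] = os.environ) -> OrphanScanConfig:
    """Read both optional variables; empty strings count as unset (assumption)."""
    return OrphanScanConfig(
        discord_webhook_url=env.get("DISCORD_ADMIN_WEBHOOK_URL") or None,
        github_token=env.get("GITHUB_TOKEN") or None,
    )


def planned_actions(cfg: OrphanScanConfig) -> List[str]:
    """List the side effects that would run for an orphaned issue."""
    actions = []
    # Without a webhook the alert is still written to the log.
    actions.append("post_discord_alert" if cfg.discord_webhook_url else "log_alert_only")
    # Without a token the label step is skipped rather than failing.
    if cfg.github_token:
        actions.append("apply_labels")
    return actions
```

Passing the environment in as a mapping keeps the behaviour easy to exercise in tests without touching the real process environment.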

Implementation

| File | Role |
| --- | --- |
| src/ConductorE.Api/Services/OrphanScanService.cs | Background service: scans queue, applies labels, posts Discord alerts |
| src/ConductorE.Core/Domain/Events.cs | IssueOrphaned, IssueNeedsHuman event records |
| src/ConductorE.Core/Domain/ReadModels.cs | QueuedAt, OrphanStatus fields on IssueStatus |
| src/ConductorE.Api/Adapters/MartenProjections.cs | Projection handlers for new events and fields |
| src/ConductorE.Core/UseCases/SubmitEvent.cs | ISSUE_ORPHANED / ISSUE_NEEDS_HUMAN event mapping |

Multi-tenancy (PR-4)

OrphanScanService runs its scan once per active tenant via ITenantWorkRunner.ForEachTenantAsync. Each tenant gets its own DI scope with ITenantContext.TenantId set, so the IIssueQuery read and the ISSUE_ORPHANED / ISSUE_NEEDS_HUMAN SubmitEvent writes bind to that tenant's database, and a tenant's orphan detection only ever sees and writes that tenant's queue. In single-tenant operation today the scan runs exactly once (for invotek), so behaviour is unchanged. The service no longer opens its own DI scope; the runner supplies the tenant-scoped provider.
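
The per-tenant pattern can be pictured with a small Python sketch. It is not the ITenantWorkRunner implementation: TenantScope, tenant_scope, and for_each_tenant are hypothetical stand-ins, and the sketch says nothing about how the real runner handles errors or concurrency.

```python
from contextlib import contextmanager
from dataclasses import dataclass
from typing import Callable, Iterable, Iterator


@dataclass
class TenantScope:
    tenant_id: str  # stands in for ITenantContext.TenantId


@contextmanager
def tenant_scope(tenant_id: str) -> Iterator[TenantScope]:
    """Open a scope bound to one tenant and release it afterwards."""
    scope = TenantScope(tenant_id)
    try:
        yield scope
    finally:
        pass  # a real scope would dispose its tenant-bound services here


def for_each_tenant(tenant_ids: Iterable[str], work: Callable[[TenantScope], None]) -> None:
    """Run the same unit of work once per tenant, each in a fresh scope."""
    for tenant_id in tenant_ids:
        with tenant_scope(tenant_id) as scope:
            work(scope)
```

The scan itself receives the scope rather than creating one, which mirrors the point that the service no longer opens its own DI scope.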

Scan Cycle

OrphanScanService (hourly)
  └─ for each active tenant (ITenantWorkRunner):
       └─ GetQueueAsync()          ← all state=="queued" issues
       └─ for each issue:
            QueuedAt == default → skip  (pre-feature issues)
            age < 24h           → skip
            OrphanStatus==null  → HandleOrphan()
            OrphanStatus==orphan AND age≥48h → HandleEscalation()
            OrphanStatus==needs_human → skip (human handles it)
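
The per-issue branch of the cycle above can be written as a pure decision function. The Python sketch below is illustrative only (the service is C#); QueuedIssue and next_action are hypothetical names, and None stands in for an unset QueuedAt.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional

ORPHAN_AFTER = timedelta(hours=24)
ESCALATE_AFTER = timedelta(hours=48)


@dataclass
class QueuedIssue:
    queued_at: Optional[datetime]  # None stands in for QueuedAt == default
    orphan_status: Optional[str]   # None, "orphan", or "needs_human"


def next_action(issue: QueuedIssue, now: datetime) -> str:
    """Return "skip", "orphan", or "escalate" for one queued issue."""
    if issue.queued_at is None:
        return "skip"  # pre-feature issue with no queue timestamp
    age = now - issue.queued_at
    if age < ORPHAN_AFTER:
        return "skip"
    if issue.orphan_status is None:
        return "orphan"  # corresponds to HandleOrphan()
    if issue.orphan_status == "orphan" and age >= ESCALATE_AFTER:
        return "escalate"  # corresponds to HandleEscalation()
    return "skip"  # needs_human, or orphan that has not yet reached 48h
```

Following the order of checks in the cycle, an issue that is already older than 48h but has no OrphanStatus is marked orphan first and only escalates on a later scan.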

Non-Goals

  • Does not switch to push-based assignment (pull model is correct).
  • Does not implement a capability registry — this is an observability layer, not routing.
  • Does not auto-reassign orphaned issues to random agents; relaxed-match dispatch is a separate concern.