Skip to content

exterminate

exterminate

standalonewriteshands-off

Use this when: you want the bug backlog cleared autonomously

Exterminate

Clears the bug backlog while the maintainer sleeps, surfacing only what genuinely needs a human. Thin by design: this skill owns the batch — sweep, lock, dispatch, merge-gate, close-loop. Everything about fixing one bug (reproduce-first, rigour routing, proof gates, escalation) lives in debug's autonomous mode; never duplicate it here.

Model tiering: the orchestrator runs capable; each per-bug worker runs cheap (worktree-builder with a model: override). Safety comes from deterministic proof gates — red-genuinely-failed, mutation-kills-mutant, browser/UI-contract proof, path checks, the 80/90 CI gate — not model IQ: a weak fix fails a gate and never reaches the merge lane.

Config lives in config/harness.phpexterminate (the toggle + every ceiling); dialing autonomy back is one flag, no redesign.

The loop

  1. Reconcile first (self-heal drift): close any merged-but-still-open bug (link its PR), release locks whose branch/PR is gone, skip anything with an open PR, and clear stale claims — a claim older than stale_lock_hours with no PR is reclaimable (a crashed run never wedges a bug).
  2. Sweep the intake — open issues typed Bug, plus bug-shaped untriaged issues (an error/broken-behaviour report not yet typed). For the bug-shaped-but-unlabelled, do a light "is this actually a bug?" read: clearly yes → type it Bug and proceed; ambiguous → escalate (Ready for Human), never guess. Skip Needs Info / Ready for Human / claimed issues.
  3. Order by severity: sev:high first, unlabelled next, sev:low last. Severity is a hand-set hint — it orders the queue and sets the merge bar; it never alone authorises anything.
  4. Claim (the in-flight lock): move Status → In Progress + self-assign, atomically before any work. A second run skips claimed issues.
  5. Plan, thinly — which bug, its severity, the repro stub from the issue template, the lock. Nothing about how to fix it — that's debug's job.
  6. Dispatch one cheap-model worktree-builder worker per bug, brief = the thin plan + "run the debug skill in autonomous mode". Disjoint bugs may run in parallel worktrees; keep frontend-overlapping bugs sequential. Worker escalations (any rung of debug's ladder) bubble up: flip the issue to Ready for Human / Needs Info, release the lock, push a notification.
  7. Per-bug PR — branch bug/<#>-slug (never sprint/), body = debug's RCA report-card + Closes #NN.
  8. Merge gate — see below. The relocated surface-then-confirm lives here.
  9. Close the loop — native close-on-merge fires (Closes #NN on a PR to main); the close-bug-issues Action backstops the phantom-open case and releases the lock. Never hand-close.
  10. Stop conditions (the batch circuit-breaker): pause the run when open auto-PRs reach max_open_prs, surfaced decisions reach max_surfaced_decisions, or max_consecutive_escalations fire in a row (that smells like one systemic cause — recommend a single investigation, not N patches). On pause: push a notification with the queue position.

The merge gate (the safety floor)

Evaluated per PR, on the actual diff — deterministic checks, never the sev label alone:

  • Protected surfaces — never auto-merged, toggle irrelevant. If the diff touches any protected_paths entry (outbound clients Todoist/Google/push, resources/views/prompts/**, model $hidden/$fillable/casts(), CredentialStore, app/Ai/Tools/**, auth/routes, migrations — kept in sync with docs/security/threat-model.md B1–B7): hold for the human, always.
  • Auto-merge (only while auto_merge is true) requires ALL of: CI green (the 80/90 gate) · path-clean (no protected surface) · narrow codegraph_impact · not sev:high · for presentation-lane fixes, a passing browser/UI-contract assertion in the PR. Fail closed on any ambiguity → hold.
  • Ceilings: stop auto-merging after max_consecutive_auto_merges in one run, and auto-pause auto-merge entirely (flip the config flag off + notify) if any auto-merged bug is reverted or reopened.
  • Held PRs are the human's review queue — that is the relocated confirm.

Auto-merge ON is the maintainer's recorded, eyes-open waiver of the CLAUDE.md "open ≠ merge" rule (2026-06-11, "try it, then dial back") — scoped to this hardened lane only. Dial back = auto_merge => false → every PR is held (auto-PR-you-merge).

Where it sits

  • debug — the per-bug engine this loops; all fix logic lives there.
  • build-sprint — executes planned issues; this clears the unplanned.
  • triage — classifies intake + sets sev:; the sweep leans on its work and falls back to a light read for the unlabelled.
  • Trigger — scheduled + unwatched (see the trigger spike, docs/brainstorms/2026-06-10-bug-handling-model/spikes/trigger-feasibility.md); also runnable on demand ("run the exterminator").