Guardrails
The harness lets an agent act autonomously without letting it act dangerously. Three mechanisms do the work.
Tiered surface-then-confirm
The core rule: reads are free, writes are gated. A pure-read skill (find-untracked-work, qa-code, check-reasoning, vet) just reports — no confirm tax. A skill that writes — edits a canonical doc, files an issue, saves to memory, pushes — proposes and lets you pick; it never acts unprompted. The tier is set by consequence, not name: the more irreversible the action, the firmer the gate.
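As a rough illustration of the rule above, the sketch below routes a skill by its consequence tier. The names Tier, Skill, invoke, and the confirm callback are hypothetical and are not taken from the harness; the three-level tier split is likewise an assumption made for the example.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import Callable


class Tier(IntEnum):
    """Hypothetical consequence tiers; a higher value is harder to undo."""
    READ = 0          # reports only, changes nothing
    REVERSIBLE = 1    # e.g. saving to memory or editing a doc
    IRREVERSIBLE = 2  # e.g. pushing


@dataclass
class Skill:
    name: str
    tier: Tier
    run: Callable[[], str]


def invoke(skill: Skill, confirm: Callable[[str], bool]) -> str:
    """Run reads directly; surface writes as a proposal and wait for approval."""
    if skill.tier == Tier.READ:
        return skill.run()  # reads are free: no confirmation requested

    prompt = f"{skill.name} wants to make a change"
    if skill.tier == Tier.IRREVERSIBLE:
        prompt += " that cannot be undone"  # firmer wording for a firmer gate

    if not confirm(prompt):
        return f"{skill.name}: proposed, not applied"
    return skill.run()
```

The gate keys off the tier field rather than the skill's name, which mirrors the point that the tier is set by consequence.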
Onion-ordered guardrails
When a skill stacks several checks, they are layered like an onion: the cheapest, most general check sits outermost, and approval for an irreversible action sits innermost. A cheap, general check rejects bad input before an expensive or consequential one ever runs. The same ordering governs MCP tools: a read-only tool runs unprompted; a state-changing one confirms first.
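One way to picture the onion ordering is a small runner that sorts checks before executing them. This is a sketch under stated assumptions: Check, run_onion, and the numeric cost field are hypothetical, and the harness may order its checks by other means.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Check:
    name: str
    cost: int            # hypothetical relative cost; lower runs earlier
    irreversible: bool   # True only for the approval of an irreversible action
    test: Callable[[str], Optional[str]]  # returns a rejection reason, or None


def run_onion(checks: list[Check], request: str) -> tuple[bool, list[str]]:
    """Run checks outermost-first and stop at the first rejection."""
    # Cheap, general checks sort first; irreversible approval always sorts last.
    ordered = sorted(checks, key=lambda c: (c.irreversible, c.cost))
    ran: list[str] = []
    for check in ordered:
        ran.append(check.name)
        if check.test(request) is not None:
            return False, ran  # inner, costlier layers never run
    return True, ran
```

With this ordering, an empty request is rejected by the outermost check, so the approval prompt for the irreversible step is never shown.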
The gate and the Definition of Done
Code isn't done because it works — it's done when both gates are green. The first is composer ci:check: PHPStan level 7, coverage at least 80%, the deterministic security checks, lint / format / types all clean. The second is the mutation gate, which runs separately on CI (not in ci:check, because running it locally would exhaust memory): the diff-scoped PR check and the nightly full run both hold a mutation score of at least 95%. "Done with known issues" is not a mergeable state. Thresholds are never lowered to go green — a surviving mutant means add a test, not relax the gate. Out-of-scope defects become tracked issues, not shipped caveats.
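The two-gate rule can be written down as a predicate. The sketch below is illustrative only: GateResults and is_done are hypothetical names, and it assumes the results of both gates have already been collected elsewhere; it does not invoke composer or any mutation-testing tool.

```python
from dataclasses import dataclass

# Thresholds as stated in the text; they are constants, not tunables.
MIN_COVERAGE = 80.0
MIN_MUTATION_SCORE = 95.0


@dataclass(frozen=True)
class GateResults:
    ci_check_passed: bool          # static analysis, security checks, lint, format, types
    coverage_percent: float
    pr_mutation_score: float       # diff-scoped PR check
    nightly_mutation_score: float  # nightly full run


def is_done(results: GateResults) -> bool:
    """Code is done only when both gates are green."""
    first_gate = results.ci_check_passed and results.coverage_percent >= MIN_COVERAGE
    second_gate = (
        results.pr_mutation_score >= MIN_MUTATION_SCORE
        and results.nightly_mutation_score >= MIN_MUTATION_SCORE
    )
    return first_gate and second_gate
```

There is deliberately no parameter for overriding a threshold: a result that falls short stays not done until a test is added.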
Circuit-breakers
Where a loop can run away, something stops it. A scheduled task that fails three times in a row disables itself and pushes a notification. A debug session that fails three times stops piling on patches and questions the architecture instead. This is the spine clause "rollback before autonomy" in practice: you don't automate what you can't halt.
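The scheduled-task breaker could look roughly like the following. ScheduledTask, tick, and the notify callback are hypothetical names for this sketch, and it assumes a success resets the failure count, which is what "in a row" suggests.

```python
from typing import Callable


class ScheduledTask:
    """A task that disables itself after three consecutive failures."""

    MAX_CONSECUTIVE_FAILURES = 3

    def __init__(
        self,
        name: str,
        action: Callable[[], None],
        notify: Callable[[str], None],
    ) -> None:
        self.name = name
        self.action = action
        self.notify = notify
        self.enabled = True
        self.failures = 0

    def tick(self) -> bool:
        """Run once if enabled; return True only when the action succeeded."""
        if not self.enabled:
            return False  # breaker is open: the action is not attempted
        try:
            self.action()
        except Exception:
            self.failures += 1
            if self.failures >= self.MAX_CONSECUTIVE_FAILURES:
                self.enabled = False
                self.notify(
                    f"{self.name} disabled after {self.failures} consecutive failures"
                )
            return False
        self.failures = 0  # a success breaks the streak
        return True
```

Once the breaker opens, later ticks return immediately, so the failing action is not retried until someone re-enables the task.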