qa-prompts

qa · read · hands-off

Use this when: a prompt needs auditing for gaps and failure modes

Problem it solves — A prompt can be subtly wrong in ways that only bite at runtime. This audits a prompt for the gaps and failure modes before it ships.

QA prompts (the AI prompt layer)

The prompts are where the model's behaviour actually lives — a vague role or a schema that drifts from the parser is a runtime bug that no PHP test catches. This audits the prompts as the contracts they are.

The corpus

resources/views/prompts/*.blade.php — system + user templates for each AI service (classification, the task helper, threads, the morning nudge, …). They inject $persona / PersonalContext, loop skills, and declare JSON output shapes; untrusted data rides behind @untrustedPreamble.

The lenses — this is the skill

For each prompt the diff touches (or all of them on a full audit):

  • Injection safety — does every prompt that interpolates external / untrusted data (task text, email bodies, calendar entries) carry the @untrustedPreamble guard, and is the untrusted block clearly fenced from the instructions? A prompt that drops user-controlled text straight into the instruction body is the vulnerability.
  • Output contract — the JSON shape the prompt asks for must match what the AI service then parses (keys, types, enums, required-vs-optional). A field renamed in one and not the other is a silent break.
  • Context injection: $persona / PersonalContext actually present where the call needs it (the nudge and the helper are personalised; that's the product); no stale or duplicated injection.
  • Prompt-engineering hygiene — a clear role/goal up front, instructions before the data, constraints explicit, examples only where they earn their place, no contradictory directives. Lead the model, don't bury the ask.
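
The first two lenses lend themselves to a mechanical first pass before the human read. The sketch below is a heuristic under stated assumptions, not the skill's actual implementation: it assumes the @untrustedPreamble guard should appear somewhere above the first untrusted echo, and the variable names (taskText, emailBody) and JSON keys are hypothetical placeholders. The function names unguarded_untrusted and contract_drift are likewise made up for illustration.

```python
from __future__ import annotations

import re

GUARD = "@untrustedPreamble"
# Blade echo forms: {{ ... }} (escaped) and {!! ... !!} (raw).
ECHO_BLOCK = re.compile(r"\{\{(.*?)\}\}|\{!!(.*?)!!\}", re.S)
VARIABLE = re.compile(r"\$(\w+)")


def unguarded_untrusted(template: str, untrusted: set[str]) -> list[tuple[int, str]]:
    """Return (line, variable) for untrusted variables echoed with no guard above them.

    Assumption: the guard is expected somewhere before the first untrusted echo.
    """
    guard_pos = template.find(GUARD)
    findings = []
    for match in ECHO_BLOCK.finditer(template):
        body = match.group(1) or match.group(2) or ""
        for name in VARIABLE.findall(body):
            if name in untrusted and (guard_pos == -1 or match.start() < guard_pos):
                line = template.count("\n", 0, match.start()) + 1
                findings.append((line, name))
    return findings


def contract_drift(prompt_keys: set[str], parser_keys: set[str]) -> dict[str, list[str]]:
    """Report keys that appear on only one side of the prompt/parser contract."""
    return {
        "only_in_prompt": sorted(prompt_keys - parser_keys),
        "only_in_parser": sorted(parser_keys - prompt_keys),
    }


if __name__ == "__main__":
    # Hypothetical template text: taskText is echoed before the guard appears.
    template = "You triage tasks.\n{{ $taskText }}\n@untrustedPreamble\n{{ $emailBody }}\n"
    print(unguarded_untrusted(template, {"taskText", "emailBody"}))
    # Hypothetical key sets: the prompt asks for "confidence", the parser reads "score".
    print(contract_drift({"label", "confidence"}, {"label", "score"}))
```

A scan like this only narrows the search. It also matches variables inside Blade comments and cannot tell whether the untrusted block is clearly fenced from the instructions, so each hit still needs a read of the surrounding prompt.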

Output — surface, then fix

A table: prompt file:line · issue · which lens · suggested fix, with injection-safety findings flagged first (they're the critical class). Use AskUserQuestion to ask which fixes to apply; apply only the confirmed ones. A clean set is a valid result.
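
One way the findings table could be assembled is sketched below. The Finding class, the lens labels, and the ordering of the non-injection lenses are assumptions made for illustration; the only ordering the text above requires is that injection-safety findings come first.

```python
from __future__ import annotations

from dataclasses import dataclass

# Injection safety ranks first; the order of the remaining lenses is an assumption.
LENS_RANK = {
    "injection safety": 0,
    "output contract": 1,
    "context injection": 2,
    "hygiene": 3,
}


@dataclass
class Finding:
    file: str
    line: int
    issue: str
    lens: str
    fix: str


def render_findings(findings: list[Finding]) -> str:
    """Render findings as a plain-text table, most critical lens first."""
    if not findings:
        # A clean audit is a valid result, so say so explicitly.
        return "No findings."
    ordered = sorted(
        findings,
        key=lambda f: (LENS_RANK.get(f.lens, len(LENS_RANK)), f.file, f.line),
    )
    rows = ["prompt file:line | issue | lens | suggested fix"]
    for f in ordered:
        rows.append(f"{f.file}:{f.line} | {f.issue} | {f.lens} | {f.fix}")
    return "\n".join(rows)


if __name__ == "__main__":
    # Hypothetical file names and issues, for illustration only.
    print(render_findings([
        Finding("helper.blade.php", 12, "key renamed in prompt only", "output contract", "align key with parser"),
        Finding("threads.blade.php", 4, "email body echoed without guard", "injection safety", "add @untrustedPreamble"),
    ]))
```

Sorting by lens rank before file and line keeps the critical class at the top regardless of the order in which the prompts were read.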

Where it sits

  • Not optimise-prompts — that trims token/clarity/cache cost; this checks correctness and safety. Run both before shipping a prompt change.
  • Not qa-code — that audits the PHP (the AI service class, DTOs, mocking); this audits the Blade prompt the service renders.
  • Pairs with qa-code whenever an AI service changes: qa-code for the wiring, qa-prompts for the prompt.