optimise-prompts

optimise · writes · hands-on

Use this when: a prompt is bloated and needs tightening

Problem it solves — Prompts bloat and drift over time. This tightens a prompt for clarity and token cost while preserving its behaviour.

Optimise prompts (per-call cost)

A skill description is paid once per session; a prompt is paid on every single AI call. At Tempo's volume (classification per task, the helper, threads, the nudge) the prompt body is the recurring spend. This skill therefore applies the same "bytes that buy nothing" lens as optimise-context, aimed at the runtime instead of the harness.

The lenses — this is the skill

Measure first (rough token count per prompt; flag the heaviest and the hottest-path ones), then:

  • Verbosity — instructions a tighter line carries, restated constraints, ceremony the model doesn't need. Cut what doesn't change the output.
  • Cache-stability — keep the static instruction prefix stable and put the volatile interpolation (the task text, today's data, per-call values) at the end. A prompt whose first tokens change every call can never hit the prompt cache; reordering so the cacheable prefix is constant is often the biggest win, bytes aside.
  • Shared-block dedupe — the same persona / rules / schema preamble copied across several prompts → extract a Blade partial and @include it, so one edit fixes all and the cache prefix is identical across calls.
  • Right-sized examples — few-shot examples that no longer pull their weight (the model handles it zero-shot now) are pure per-call cost.
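
The cache-stability lens can be sketched as follows. This is a minimal illustration, not Tempo's actual code: `STATIC_PREFIX`, `build_prompt` and `shared_prefix_len` are hypothetical names, and the prompt text is made up. It assumes the prompt cache matches on a leading prefix, as the lens describes.

```python
# Hypothetical static instructions: identical on every call, so cacheable.
STATIC_PREFIX = (
    "You classify tasks into one of: bug, feature, chore.\n"
    "Reply with the label only.\n"
)


def build_prompt(task_text: str, today: str) -> str:
    """Static instructions first, per-call values last."""
    volatile_suffix = f"Today: {today}\nTask: {task_text}\n"
    return STATIC_PREFIX + volatile_suffix


def shared_prefix_len(a: str, b: str) -> int:
    """Count the leading characters two prompts have in common."""
    n = 0
    for ch_a, ch_b in zip(a, b):
        if ch_a != ch_b:
            break
        n += 1
    return n


first = build_prompt("Fix login redirect", "2024-05-01")
second = build_prompt("Add CSV export", "2024-05-02")

# With the volatile values at the end, two different calls still share
# at least the whole static block as a common prefix.
print(shared_prefix_len(first, second) >= len(STATIC_PREFIX))
```

If the date or task text were interpolated at the top instead, the shared prefix would end at the first differing character and the static block behind it could not be reused.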

Never cut an instruction that changes behaviour to save tokens — that's a regression. Pair with qa-prompts so a trim doesn't break the output contract.
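
One cheap pre-check before handing a trim to qa-prompts might look like the sketch below. It assumes the behaviour-bearing instructions can be listed as literal phrases, which will not always hold; `missing_instructions` and the sample prompts are hypothetical, and a phrase match is no substitute for checking the actual output contract.

```python
def missing_instructions(trimmed_prompt: str, must_keep: list[str]) -> list[str]:
    """Return the must-keep phrases that no longer appear in the trimmed prompt."""
    lowered = trimmed_prompt.lower()
    return [phrase for phrase in must_keep if phrase.lower() not in lowered]


# Made-up trimmed prompt: the ceremony is gone, but so is one real constraint.
trimmed = "Classify the task. Reply with the label only."

# Assumption: these are the phrases known to change behaviour.
must_keep = ["reply with the label only", "never invent labels"]

print(missing_instructions(trimmed, must_keep))  # ['never invent labels']
```

A non-empty result means the trim removed something it should not have, so it is a regression rather than a saving.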

Output — surface, then trim

A table: prompt file:line · current ≈tokens · the trim / reorder · what's preserved · cache impact. Use AskUserQuestion to ask which trims to apply; apply only the confirmed ones, then have qa-prompts confirm the contract still holds.
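
The measuring step that feeds this table could be sketched roughly as below. The four-characters-per-token ratio is an assumed heuristic for English text, not a real tokenizer, and the file paths, prompt bodies and call counts are invented for illustration.

```python
from dataclasses import dataclass

CHARS_PER_TOKEN = 4  # assumption: rough English-text heuristic, not a real tokenizer


@dataclass
class PromptRow:
    location: str       # "file:line" where the prompt lives
    approx_tokens: int  # estimated size of the prompt body
    calls_per_day: int  # hypothetical traffic figure, used to spot hot paths


def approx_tokens(text: str) -> int:
    """Estimate tokens from character count; good enough to rank prompts."""
    return max(1, len(text) // CHARS_PER_TOKEN)


def rank_prompts(prompts: dict[str, tuple[str, int]]) -> list[PromptRow]:
    """Sort prompts by estimated daily token spend, heaviest first."""
    rows = [
        PromptRow(location, approx_tokens(body), calls)
        for location, (body, calls) in prompts.items()
    ]
    return sorted(rows, key=lambda r: r.approx_tokens * r.calls_per_day, reverse=True)


# Made-up prompt bodies and call counts, for illustration only.
sample = {
    "prompts/classify.txt:1": ("x" * 400, 10),
    "prompts/nudge.txt:1": ("y" * 2000, 1),
}
for row in rank_prompts(sample):
    print(f"{row.location} · ≈{row.approx_tokens} tokens · {row.calls_per_day} calls/day")
```

Ranking by tokens times call volume puts a small prompt on a hot path above a large one that rarely runs, which is the "heaviest and hottest-path" flagging the measure step asks for.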

Where it sits

  • The runtime sibling of optimise-context (always-on harness footprint) and optimise-code (app performance) — same verb, different surface.
  • Not qa-prompts (correctness + injection safety). Trim here, then qa there.