fix(engine): deliver the final-step hint as a tail message, not a system append #408

Merged
mgoldsborough merged 2 commits into main from fix/final-step-cache-stability
Jun 10, 2026
@mgoldsborough

Contributor

Problem

On a run's last allowed iteration the engine appended a "this is your final step, wrap up" instruction to the system prompt:

if (iteration === maxIter - 1) callPrompt += "\n\n[IMPORTANT: This is your final step. ...]";

That mutates the cached system block. Since system is position 1 in Anthropic's tools → system → messages cache order, changing it invalidates the system breakpoint and the entire message prefix after it — so the final call of every run does a full-prefix re-write instead of reading the cached prefix. On a long conversation that's one expensive call per run.
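
The invalidation described above can be illustrated with a toy model. This is only a sketch: it assumes a cache read can cover just the leading segments that match the previous request exactly, and `reusablePrefixLength` and the segment strings are hypothetical names, not engine code or provider API.

```typescript
// Toy model: a request is an ordered list of segments (tools, system, messages).
// A cache read can only cover the leading segments identical to the previous call.
function reusablePrefixLength(prev: readonly string[], next: readonly string[]): number {
  let i = 0;
  while (i < prev.length && i < next.length && prev[i] === next[i]) i++;
  return i;
}

const previousCall = ["<tools>", "<system prompt>", "user: task", "assistant: step 1", "tool: result"];

// Old behaviour: the hint is appended to the system segment (index 1).
const systemMutated = previousCall.map((segment, index) =>
  index === 1 ? segment + "\n\n[IMPORTANT: This is your final step. ...]" : segment,
);

// New behaviour: the prefix is untouched and the hint rides in a new tail segment.
const tailAppended = [...previousCall, "user: <system-reminder>final step</system-reminder>"];

// Only the tools segment survives a system mutation.
const reusedAfterSystemMutation = reusablePrefixLength(previousCall, systemMutated);
// The whole previous request survives a tail append.
const reusedAfterTailAppend = reusablePrefixLength(previousCall, tailAppended);
```

In this model the system mutation leaves one reusable segment, while the tail append leaves all five segments of the previous call reusable.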

Fix

Deliver the same hint as a tail message (the volatile 5-minute cache region from the TTL-tiering change), wrapped in <system-reminder>, leaving the stable system prefix byte-identical. The final call now reads the prefix from cache; only the small tail is new.

  • Merges into a trailing user turn when the last message is already a user message (avoids consecutive-user role errors); otherwise appends a fresh user message.
  • callPrompt is no longer mutated, so it's now const.
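
A minimal sketch of the merge-or-append behaviour described in the bullets above. The `Message` and `TextBlock` shapes, the `withFinalStepHint` name, and the hint wording are assumptions for illustration; the engine's real types and helper names are not shown in this PR description.

```typescript
// Hypothetical message shapes; the engine's real types may differ.
type TextBlock = { type: "text"; text: string };
type Message = { role: "user" | "assistant" | "tool"; content: string | TextBlock[] };

// Normalise string content into a block array so a hint block can be appended.
function toBlocks(content: string | TextBlock[]): TextBlock[] {
  return typeof content === "string" ? [{ type: "text", text: content }] : content;
}

// Returns a new message list; neither the input list nor the system prompt is touched.
function withFinalStepHint(messages: readonly Message[], hint: string): Message[] {
  const hintBlock: TextBlock = { type: "text", text: `<system-reminder>${hint}</system-reminder>` };
  const last = messages[messages.length - 1];
  if (last !== undefined && last.role === "user") {
    // Merge branch: extend the trailing user turn to avoid consecutive user messages.
    return [...messages.slice(0, -1), { role: "user", content: [...toBlocks(last.content), hintBlock] }];
  }
  // Append branch: the tail is not a user turn, so add a fresh user message.
  return [...messages, { role: "user", content: [hintBlock] }];
}

const HINT_TEXT = "This is your final step. ..."; // placeholder wording
const mergedExample = withFinalStepHint([{ role: "user", content: "Summarise the report." }], HINT_TEXT);
const appendedExample = withFinalStepHint(
  [
    { role: "user", content: "Summarise the report." },
    { role: "tool", content: "report contents" },
  ],
  HINT_TEXT,
);
```

Here `mergedExample` stays a single user message with two content blocks, and `appendedExample` gains a third message with role `user`.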

Tests

Updates the wrap-up test to assert the system prompt is byte-identical across all iterations (the cache-stability property) and the hint appears on the final turn's tail message, not the system block. Full unit suite (3,391) green; verify:static green.
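
The cache-stability property the test asserts can be sketched as follows. `recordCalls` is a hypothetical stand-in for the engine loop, not the actual test harness; it only records what each model call would receive so the two assertions can be shown.

```typescript
type RecordedCall = { system: string; tail: string };

// Hypothetical stand-in for the engine loop: one recorded call per iteration.
function recordCalls(systemPrompt: string, maxIter: number): RecordedCall[] {
  const recorded: RecordedCall[] = [];
  for (let iteration = 0; iteration < maxIter; iteration++) {
    const isFinal = iteration === maxIter - 1;
    recorded.push({
      system: systemPrompt, // never mutated, so every call sees the same bytes
      tail: isFinal
        ? "<system-reminder>This is your final step. ...</system-reminder>"
        : `tool result ${iteration}`,
    });
  }
  return recorded;
}

const recordedCalls = recordCalls("You are a helpful agent.", 3);

// Property 1: the system prompt is identical on every iteration.
const systemIsStable = recordedCalls.every((call) => call.system === recordedCalls[0]?.system);

// Property 2: the hint appears only on the final turn's tail, never in the system block.
const hintedTurns = recordedCalls
  .map((call, index) => (call.tail.indexOf("<system-reminder>") !== -1 ? index : -1))
  .filter((index) => index >= 0);
const systemIsClean = recordedCalls.every((call) => call.system.indexOf("final step") === -1);
```

With three iterations, `hintedTurns` contains only the last index, and both `systemIsStable` and `systemIsClean` hold.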

Context

This is the last of the cheap cache-stability cleanups from the cost audit. The other candidate — pinning the compose.ts date — was dropped: it renders a date (toLocaleDateString), so it only busts the system block on conversations that span midnight (effectively never), and moving it properly belongs to the larger stable/volatile prompt partition (deferred until telemetry justifies it). The supervisor-tripped-tool mutation was also dropped — the forensics showed it wasn't a cost driver and gating it would weaken a safety mechanism.
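
The midnight argument can be checked with a small sketch. It assumes the date is rendered with a bare `toLocaleDateString()` call; the actual arguments used in compose.ts are not shown here.

```typescript
// Local-time constructors; months are zero-based, so 5 is June.
const morning = new Date(2026, 5, 10, 9, 0);
const evening = new Date(2026, 5, 10, 17, 0);
const nextDay = new Date(2026, 5, 11, 9, 0);

// Same calendar day: the rendered string is identical, so the system block is unchanged.
const sameDayStable = morning.toLocaleDateString() === evening.toLocaleDateString();

// Across midnight: the rendered string changes, which is the only case that busts the block.
const changesAcrossMidnight = morning.toLocaleDateString() !== nextDay.toLocaleDateString();
```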

…tem append

On a run's last allowed iteration the engine appended "This is your final
step..." to the system prompt. That mutates the cached system block, which busts
its 1-hour breakpoint AND the whole message prefix after it (system is position
1 in Anthropic's tools→system→messages order) — a full-prefix re-write on the
final call of every run.

Deliver the hint as a tail message (the volatile 5-minute cache region) instead,
wrapped in <system-reminder>. The stable system prefix stays byte-identical, so
the final call reads it from cache. Merge into a trailing user turn when present
to avoid consecutive user messages; otherwise append a fresh one.

Updates the wrap-up test to assert the system prompt is byte-stable across all
iterations and the hint appears on the final turn's tail message.

Address QA on #408:
- Add a maxIterations:1 test for the merge branch. The existing wrap-up test's
  tool-calling model always leaves a tool message as the tail, so it only
  exercised the append-fresh-user branch. With maxIterations:1 (delegated
  children / automations) iteration 0 is final and the tail is the initial user
  prompt, so the merge-into-trailing-user branch (the ...last.content spread)
  runs. Asserts the hint merges into the user turn's content array (two blocks,
  one user message) and the system prompt stays clean.
- Kept the merge branch: relying on each provider's undocumented consecutive-
  user merging for correctness is a fragile cross-provider dependency; the
  explicit merge is provider-agnostic and self-documenting.
- Fix the CHANGELOG: the earlier edit replaced the #401 step-anchor bullet's
  header with the final-step heading and left its body, conflating two changes.
  Split into separate bullets and restore the step-anchor heading.
@mgoldsborough added the qa-reviewed label (QA review completed with no critical issues) Jun 10, 2026
@mgoldsborough merged commit 611fad7 into main Jun 10, 2026
5 checks passed
@mgoldsborough deleted the fix/final-step-cache-stability branch June 10, 2026 17:17