fix(bedrock): cache trailing message for stable prefix across agent turns #8916
Open
carl-auctane wants to merge 1 commit into aaif-goose:main from
…urns

The BEDROCK_ENABLE_CACHING=true path currently places a cache point on the first three visible messages. This does not match how Anthropic and Bedrock look up cached prefixes. Cache entries are keyed by the hash of the prefix ending at the breakpoint, and reads walk backward up to 20 blocks looking for prior writes. With the first-3 strategy, each new turn's breakpoint sits at a fixed position early in the conversation, so everything appended after it is reprocessed fresh on every turn. In an agentic tool-use loop this grows linearly with turn count.

Place the cache point on the trailing message instead. On each new turn the lookback finds the breakpoint the previous turn wrote, so fresh processing is bounded to the content added since the last request. This matches the pattern Anthropic's prompt caching documentation recommends for growing conversations.

See https://platform.claude.com/docs/en/build-with-claude/prompt-caching and https://docs.aws.amazon.com/bedrock/latest/userguide/prompt-caching.html

Signed-off-by: Carl Youngblood <carl.youngblood@auctane.com>
Bojun-Vvibe added a commit to Bojun-Vvibe/oss-contributions that referenced this pull request (Apr 29, 2026):
- aaif-goose/goose#8916 fix(bedrock): cache trailing message for stable prefix across agent turns (merge-as-is)
- aaif-goose/goose#8904 fix(oidc-proxy): validate exp independently of MAX_TOKEN_AGE_SECONDS (merge-as-is; security fix with test inversion in same commit)
Bojun-Vvibe added a commit to Bojun-Vvibe/oss-contributions that referenced this pull request (Apr 30, 2026):
Gemini-cli MessageBus.request() fail-fast on publish failure (fixes #22588 60s silent hang) and goose Bedrock prompt-cache placement fix (move cache_control from first-three to trailing message to align with prefix-keyed 20-block lookback). INDEX appended with drip-196 verdict-mix and PR table.
Summary
When BEDROCK_ENABLE_CACHING=true, BedrockProvider::converse() currently places cache points on the first three visible messages. This doesn't match how Anthropic and Bedrock actually look up cached prefixes: cache entries are keyed by the hash of the prefix ending at the breakpoint, and reads walk backward up to 20 blocks looking for prior writes. With a breakpoint fixed to early messages, every turn reprocesses everything appended after position 3, which in an agent loop grows linearly with turn count.

This change places the cache point on the trailing message instead. On each new turn the lookback finds the breakpoint the previous turn wrote, so fresh processing is bounded to the content added since the last request. This matches the pattern Anthropic's prompt caching documentation recommends for growing conversations.
The system-prompt cache point is unchanged. The misleading comment about "caching recent messages would shift positions each turn" is replaced with an accurate description of the lookup model.
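To make the lookup model described above concrete, here is a minimal toy simulation. It is a sketch under stated assumptions, not goose's code or the Bedrock API: every name in it (prefix_hash, send, first_three, trailing, simulate, LOOKBACK) is hypothetical, each message is treated as a single block, and an in-memory set of prefix hashes stands in for the provider-side cache.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashSet;
use std::hash::{Hash, Hasher};

/// Lookback window from the description above, in blocks (toy model).
const LOOKBACK: usize = 20;

/// Hash of the prefix `blocks[0..=end]`; stands in for the provider's cache key.
fn prefix_hash(blocks: &[String], end: usize) -> u64 {
    let mut hasher = DefaultHasher::new();
    blocks[..=end].hash(&mut hasher);
    hasher.finish()
}

/// Toy request: returns how many blocks had to be processed fresh,
/// then records a cache write at every breakpoint.
fn send(cache: &mut HashSet<u64>, blocks: &[String], breakpoints: &[usize]) -> usize {
    let mut cached = 0;
    if let Some(&last) = breakpoints.iter().max() {
        // Walk backward from the last breakpoint, checking at most LOOKBACK positions.
        let floor = (last + 1).saturating_sub(LOOKBACK);
        for end in (floor..=last).rev() {
            if cache.contains(&prefix_hash(blocks, end)) {
                cached = end + 1;
                break;
            }
        }
    }
    for &bp in breakpoints {
        cache.insert(prefix_hash(blocks, bp));
    }
    blocks.len() - cached
}

/// Old placement: breakpoints on the first three messages.
fn first_three(message_count: usize) -> Vec<usize> {
    (0..message_count.min(3)).collect()
}

/// New placement: a single breakpoint on the trailing message.
fn trailing(message_count: usize) -> Vec<usize> {
    message_count.checked_sub(1).into_iter().collect()
}

/// Runs `turns` requests, each appending `per_turn` blocks, and returns
/// the number of freshly processed blocks for every turn.
fn simulate(turns: usize, per_turn: usize, place: fn(usize) -> Vec<usize>) -> Vec<usize> {
    let mut cache = HashSet::new();
    let mut blocks: Vec<String> = Vec::new();
    let mut fresh = Vec::new();
    for t in 0..turns {
        for k in 0..per_turn {
            blocks.push(format!("turn {} block {}", t, k));
        }
        let breakpoints = place(blocks.len());
        fresh.push(send(&mut cache, &blocks, &breakpoints));
    }
    fresh
}

fn main() {
    println!("first three: {:?}", simulate(5, 3, first_three));
    println!("trailing:    {:?}", simulate(5, 3, trailing));
}
```

In this toy model, with three blocks appended per turn over five turns, the first-three placement processes 3, 3, 6, 9, 12 blocks fresh, while the trailing placement stays at 3 per turn. The model also shows the limit of the window: if a single turn appends more than LOOKBACK blocks, the backward walk in this sketch no longer reaches the previous turn's write.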
For a worked cost comparison over a 10-turn agent loop and links to the relevant Anthropic and Bedrock docs, see the linked issue.
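Independent of the numbers in the linked issue, the growth rates can be sketched with a back-of-the-envelope estimate. The symbols are assumptions introduced here for illustration: each turn appends roughly c tokens after the early breakpoint, T is the number of turns, and F_t is the number of tokens processed fresh on turn t. The estimate ignores the first-turn cache write, the price difference between cache writes and reads, and cache expiry.

```latex
\begin{align*}
F_t^{\text{first-3}} &\approx t\,c, &
\sum_{t=1}^{T} F_t^{\text{first-3}} &\approx c\,\frac{T(T+1)}{2}, \\
F_t^{\text{trailing}} &\approx c, &
\sum_{t=1}^{T} F_t^{\text{trailing}} &\approx c\,T .
\end{align*}
```

Under these assumptions the per-turn fresh work grows linearly with the first-three placement and stays constant with the trailing placement, so the cumulative ratio is about (T+1)/2, or roughly 5.5 for T = 10.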
Testing
- cargo fmt --all
- cargo check -p goose
- cargo clippy -p goose --all-targets -- -D warnings (no warnings)
- cargo test -p goose --lib providers::bedrock (4 passed)
- cargo test -p goose --lib providers::formats::bedrock (11 passed)

No test changes were needed. The existing per-message helper tests in providers::formats::bedrock exercise to_bedrock_message_with_caching directly with enable_caching=true and are unaffected. The test_caching_* tests in providers::bedrock assert on should_enable_caching() returning the right boolean and are also unaffected.

Related Issues
Relates to #8915