feat: add prompt caching schema support for system/developer messages #230

Open
dcox761 wants to merge 1 commit into aws-samples:main from bluecrystalsolutions:feature/prompt-caching-schema
Conversation


dcox761 commented Mar 3, 2026

Issue #229

Description of changes:

Add CacheControl model and cache_control field on TextContent to support OpenAI-compatible prompt caching hints (e.g., cache_control: {type: 'ephemeral'}).

Changes:

  • schema.py: Add CacheControl model; add cache_control field to TextContent; change SystemMessage.content and DeveloperMessage.content from str to str | list[TextContent] to accept structured content with cache hints
  • bedrock.py: Update _parse_system_prompts to handle list-format content with cache_control, emitting Bedrock cachePoint blocks; update _parse_content_parts to emit cachePoint for user message content parts

This allows clients to mark specific content blocks for caching using the standard OpenAI cache_control field, which is translated to Bedrock's cachePoint tagged union format.

Ref: https://docs.aws.amazon.com/bedrock/latest/userguide/prompt-caching.html

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

Member

zxkane commented Mar 6, 2026

Thanks for the PR! The SystemMessage.content: str | list[TextContent] change is a welcome fix — it aligns our schema with the official OpenAI API spec, which does support string | array of ChatCompletionContentPartText for system/developer messages.

I had a question about the cache_control part of this PR though. After checking the OpenAI API reference, it appears that cache_control is not part of the official ChatCompletionContentPartText schema — OpenAI's prompt caching is automatic and transparent (prefix-based, 1024+ tokens), with no client-side hints needed.

The cache_control: {type: "ephemeral"} pattern originates from the Anthropic API. In frameworks like LangChain, it's only used with Anthropic-specific integrations (ChatAnthropic, ChatOpenRouter for Anthropic models), not with ChatOpenAI. Since this gateway is designed to receive OpenAI SDK-formatted requests, standard OpenAI SDK clients wouldn't send this field.
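Concretely, such a request would mix the OpenAI message shape with the Anthropic-origin hint (illustrative payload; the model ID and values are examples, not from the PR):

```python
# An OpenAI-shaped chat request carrying the Anthropic-style cache_control
# hint on a system content part. This field is NOT part of the official
# OpenAI ChatCompletionContentPartText schema.
request_body = {
    "model": "anthropic.claude-3-5-sonnet-20241022-v2:0",
    "messages": [
        {
            "role": "system",
            "content": [
                {
                    "type": "text",
                    "text": "Long, reusable system prompt...",
                    "cache_control": {"type": "ephemeral"},
                }
            ],
        },
        {"role": "user", "content": "Hello"},
    ],
}
```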

Could you share more about the use case you have in mind? For example:

  • Are there specific clients or frameworks that send cache_control in OpenAI-compatible format to this gateway?
  • Is this intended for users who craft raw HTTP requests with mixed OpenAI/Anthropic conventions?

Understanding the real-world scenario would help us evaluate the best approach. If there is a valid use case, we'd want to make sure the cachePoint injection respects the existing ENABLE_PROMPT_CACHING flag and _supports_prompt_caching() checks to avoid bypassing the current caching governance.
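If the feature does land, the gating could be sketched along these lines (`ENABLE_PROMPT_CACHING` and `_supports_prompt_caching` are named above; their exact signatures and the model check here are assumptions for illustration):

```python
import os


def maybe_cache_point(model_id: str) -> list[dict]:
    """Emit a cachePoint block only when caching governance allows it.

    Respects the ENABLE_PROMPT_CACHING environment flag and a per-model
    capability check, so client hints cannot bypass existing controls.
    """
    enabled = os.environ.get("ENABLE_PROMPT_CACHING", "false").lower() == "true"
    if enabled and _supports_prompt_caching(model_id):
        return [{"cachePoint": {"type": "default"}}]
    return []


def _supports_prompt_caching(model_id: str) -> bool:
    # Placeholder for the real check in bedrock.py; shown here only so the
    # sketch is self-contained.
    return "claude" in model_id or "nova" in model_id
```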

Again, the content: str | list[TextContent] fix for system/developer messages is great regardless — happy to see that part land!
