Summary
Amazon Bedrock added support for a 1-hour TTL option for prompt caching on January 26, 2026 (AWS announcement). Currently, the gateway uses the default 5-minute TTL with no way to configure it. This feature request proposes adding a PROMPT_CACHE_TTL environment variable to allow users to opt into the extended 1-hour cache.
Motivation
For long-running agentic sessions or applications with large, stable system prompts, the 5-minute TTL expires frequently during idle gaps — triggering unnecessary cacheWrite charges. A 1-hour TTL would significantly reduce costs for these workloads.
Proposed Change
- Add PROMPT_CACHE_TTL to src/api/setting.py with default "5m" and valid values "5m" or "1h"
- Pass the TTL field through to the cache_control block in Bedrock requests
- Restrict 1-hour TTL to supported models only (Claude Haiku 4.5, Sonnet 4.5, Opus 4.5)
- Document the new variable in README alongside ENABLE_PROMPT_CACHING
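As a rough illustration of the first three items above, here is a minimal sketch in Python. It is not the gateway's actual code: the helper names (read_prompt_cache_ttl, effective_ttl, build_cache_control), the substring match on model IDs, the silent fallback to the default TTL for unsupported models, and the {"type": "ephemeral", "ttl": ...} shape of the cache_control block are all assumptions made for this example and should be checked against the AWS docs and the existing request-building code.

```python
import os

# Values proposed in this request.
VALID_TTLS = ("5m", "1h")
DEFAULT_TTL = "5m"

# Model name fragments from the "Supported Models (1h TTL)" list in this request.
ONE_HOUR_TTL_MODELS = ("claude-haiku-4-5", "claude-sonnet-4-5", "claude-opus-4-5")


def read_prompt_cache_ttl(env=None):
    """Read PROMPT_CACHE_TTL (hypothetical helper), defaulting to "5m"."""
    env = os.environ if env is None else env
    ttl = env.get("PROMPT_CACHE_TTL", DEFAULT_TTL).strip().lower()
    if ttl not in VALID_TTLS:
        raise ValueError(
            f"PROMPT_CACHE_TTL must be one of {VALID_TTLS}, got {ttl!r}"
        )
    return ttl


def effective_ttl(model_id, requested_ttl):
    """Fall back to the default TTL when the model is not on the 1h list.

    Assumption: supported models are detected by substring match on the ID.
    """
    supports_1h = any(name in model_id for name in ONE_HOUR_TTL_MODELS)
    if requested_ttl == "1h" and not supports_1h:
        return DEFAULT_TTL
    return requested_ttl


def build_cache_control(model_id, requested_ttl):
    """Build a cache_control block (assumed shape) for one request."""
    block = {"type": "ephemeral"}
    # Only send the ttl field when opting into 1h, so requests made with the
    # default setting stay identical to what is sent today.
    if effective_ttl(model_id, requested_ttl) == "1h":
        block["ttl"] = "1h"
    return block
```

With PROMPT_CACHE_TTL=1h, a model whose ID contains claude-sonnet-4-5 would get a block carrying "ttl": "1h", while any other model would get the same block that is sent today. Whether an unsupported model should fall back silently or log a warning is a design choice left open here.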
Supported Models (1h TTL)
Per AWS docs:
- claude-haiku-4-5
- claude-sonnet-4-5
- claude-opus-4-5
No Breaking Changes
Default remains "5m" — existing deployments are unaffected.