Insert prompt cache tokens in ChatDatabricks response metadata #388
Open
viadark wants to merge 2 commits into databricks:main from
Conversation
added 2 commits on March 23, 2026 at 15:12
extracts cache tokens into llm_output for both the invoke and stream paths.
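A minimal sketch of the idea behind this commit, assuming the raw usage payload is a dict; the helper name and sample values are hypothetical, not the PR's actual code:

```python
from typing import Any, Dict


def extract_cache_tokens(usage: Dict[str, Any]) -> Dict[str, int]:
    """Collect prompt-cache token counts from a raw usage dict, if present."""
    cache: Dict[str, int] = {}
    # Anthropic-style fields (Claude models).
    for key in ("cache_creation_input_tokens", "cache_read_input_tokens"):
        if usage.get(key) is not None:
            cache[key] = usage[key]
    # OpenAI-style nested field.
    details = usage.get("prompt_tokens_details") or {}
    if details.get("cached_tokens") is not None:
        cache["cached_tokens"] = details["cached_tokens"]
    return cache


# Example: merge the cache counts into llm_output so callers can see cache hits.
usage = {"prompt_tokens": 1200, "completion_tokens": 80, "cache_read_input_tokens": 1024}
llm_output = {"usage": dict(usage), **extract_cache_tokens(usage)}
print(llm_output["cache_read_input_tokens"])  # 1024
```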
Changed to use the data below: id=_MOCK_CHAT_RESPONSE["id"], created=_MOCK_CHAT_RESPONSE["created"].
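Roughly, the test now reuses the shared mock fixture's fields instead of hard-coding fresh literals; the values below are invented for illustration only:

```python
# Illustrative mock (values made up); the test reads id/created from the fixture.
_MOCK_CHAT_RESPONSE = {"id": "chatcmpl-mock", "created": 1742742720, "usage": {}}

expected_id = _MOCK_CHAT_RESPONSE["id"]
expected_created = _MOCK_CHAT_RESPONSE["created"]
```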
Summary
Databricks model serving responses can include prompt cache token counts (Anthropic: `cache_creation_input_tokens`, `cache_read_input_tokens` / OpenAI: `prompt_tokens_details.cached_tokens`) in `CompletionUsage`, but ChatDatabricks was discarding them from `llm_output` and the streaming `UsageMetadata` in both the invoke (`_convert_response_to_chat_result`) and stream (`_extract_completion_usage_from_chunk`, `_build_usage_chunk_from_completions`) paths.

Changes

- `chat_models.py`
  - `_convert_response_to_chat_result`: copies cache token fields into `llm_output["usage"]` and the top-level `llm_output`
  - `_extract_completion_usage_from_chunk` / `_build_usage_chunk_from_completions`: populate `InputTokenDetails` with the cache info in the dict fallback path (see the sketch after this list)
- `test_chat_models.py`
  - `test_convert_response_to_chat_result_anthropic_cache_tokens`: Anthropic cache fields surface in `llm_output`
  - `test_convert_response_to_chat_result_openai_cache_tokens`: `cached_tokens` surfaces in `llm_output`
  - `test_convert_response_to_chat_result_no_cache_tokens`: no cache fields are added when the response has none
  - `test_chat_databricks_stream_with_claude_cache_tokens`: streamed `UsageMetadata` carries the cache tokens
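For reviewers, a hedged sketch of how the dict fallback path could map the raw usage fields onto langchain-core's `UsageMetadata` / `InputTokenDetails`; the helper name is made up and this is not the PR's exact implementation:

```python
from langchain_core.messages.ai import InputTokenDetails, UsageMetadata


def usage_metadata_from_dict(usage: dict) -> UsageMetadata:
    """Map a raw usage dict onto LangChain's usage metadata, keeping cache counts."""
    details = usage.get("prompt_tokens_details") or {}
    cache_read = (
        usage.get("cache_read_input_tokens")  # Anthropic-style field
        or details.get("cached_tokens")       # OpenAI-style nested field
        or 0
    )
    return UsageMetadata(
        input_tokens=usage.get("prompt_tokens", 0),
        output_tokens=usage.get("completion_tokens", 0),
        total_tokens=usage.get("total_tokens", 0),
        input_token_details=InputTokenDetails(
            cache_creation=usage.get("cache_creation_input_tokens", 0),
            cache_read=cache_read,
        ),
    )
```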