
Insert prompt cache tokens in ChatDatabricks response metadata#388

Open
viadark wants to merge 2 commits into databricks:main from viadark:extract-prompt-caching-data

Conversation


@viadark viadark commented Mar 23, 2026

Summary

  • Databricks Model Serving returns prompt-caching token info (Anthropic: cache_creation_input_tokens and cache_read_input_tokens; OpenAI: prompt_tokens_details.cached_tokens) in CompletionUsage, but ChatDatabricks was discarding it from llm_output and the streaming UsageMetadata.
  • Extract the cache tokens and insert them in both the invoke path (_convert_response_to_chat_result) and the stream path (_extract_completion_usage_from_chunk, _build_usage_chunk_from_completions).
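The extraction described above can be sketched roughly as follows. This is an illustrative standalone helper, not the PR's actual code: the function name `extract_cache_tokens` is hypothetical, and it operates on a plain usage dict shaped like the Anthropic/OpenAI payloads named in the summary.

```python
def extract_cache_tokens(usage: dict) -> dict:
    """Pull prompt-cache token counts out of a raw CompletionUsage-like dict.

    Covers both provider shapes mentioned above:
      - Anthropic: top-level cache_creation_input_tokens / cache_read_input_tokens
      - OpenAI:    nested prompt_tokens_details.cached_tokens
    """
    cache: dict = {}
    # Anthropic-style top-level fields
    for key in ("cache_creation_input_tokens", "cache_read_input_tokens"):
        if usage.get(key) is not None:
            cache[key] = usage[key]
    # OpenAI-style nested field
    details = usage.get("prompt_tokens_details") or {}
    if details.get("cached_tokens") is not None:
        cache["cached_tokens"] = details["cached_tokens"]
    return cache


anthropic_usage = {
    "prompt_tokens": 100,
    "cache_creation_input_tokens": 80,
    "cache_read_input_tokens": 20,
}
print(extract_cache_tokens(anthropic_usage))
```

A dict like the one returned here could then be merged into `llm_output["usage"]`, so that absent provider fields never introduce cache keys.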

Changes

chat_models.py

| Method | Change |
| --- | --- |
| `_convert_response_to_chat_result` | Add Anthropic/OpenAI cache tokens to `llm_output["usage"]` and top-level `llm_output` |
| `_extract_completion_usage_from_chunk` | Include cache tokens in the dict fallback path |
| `_build_usage_chunk_from_completions` | Build `InputTokenDetails` with cache info in the dict fallback path |
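The dict fallback path in `_build_usage_chunk_from_completions` can be sketched like this. LangChain's `UsageMetadata` and `InputTokenDetails` are TypedDicts, so plain dicts of the same shape (`cache_read` / `cache_creation` keys) are used here to keep the sketch runnable without LangChain installed; the helper name `build_usage_metadata` is hypothetical.

```python
def build_usage_metadata(usage: dict) -> dict:
    """Build a UsageMetadata-shaped dict whose input_token_details
    carries cache counts from either provider shape."""
    input_details: dict = {}
    # Anthropic exposes cache reads directly; OpenAI nests them.
    cache_read = usage.get("cache_read_input_tokens")
    if cache_read is None:
        cache_read = (usage.get("prompt_tokens_details") or {}).get("cached_tokens")
    if cache_read is not None:
        input_details["cache_read"] = cache_read
    if usage.get("cache_creation_input_tokens") is not None:
        input_details["cache_creation"] = usage["cache_creation_input_tokens"]

    metadata = {
        "input_tokens": usage.get("prompt_tokens", 0),
        "output_tokens": usage.get("completion_tokens", 0),
        "total_tokens": usage.get("total_tokens", 0),
    }
    # Only attach details when a cache field was actually present.
    if input_details:
        metadata["input_token_details"] = input_details
    return metadata
```

When no cache fields are present, `input_token_details` is omitted entirely, which matches the "no cache keys when absent" behavior the tests below check.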

test_chat_models.py

| Test | Description |
| --- | --- |
| `test_convert_response_to_chat_result_anthropic_cache_tokens` | Verify Anthropic cache tokens in `llm_output` |
| `test_convert_response_to_chat_result_openai_cache_tokens` | Verify OpenAI `cached_tokens` in `llm_output` |
| `test_convert_response_to_chat_result_no_cache_tokens` | Verify no cache keys when absent |
| `test_chat_databricks_stream_with_claude_cache_tokens` | Verify Claude cache tokens in streaming `UsageMetadata` |

minseok2.kang added 2 commits March 23, 2026 15:12
  • extracts cache tokens into llm_output for both invoke and stream paths
  • Changed to use the data below:
    id=_MOCK_CHAT_RESPONSE["id"]
    created=_MOCK_CHAT_RESPONSE["created"]

