docs: claude prompt caching note (#6429)

dianed-square · web-flow · commit fa1d87426981 · 2026-01-09T17:00:53.000-08:00
diff --git a/documentation/docs/getting-started/providers.md b/documentation/docs/getting-started/providers.md
@@ -42,6 +42,10 @@ goose is compatible with a wide range of LLM providers, allowing you to choose a
 | [Venice AI](https://venice.ai/home)                                         | Provides access to open source models like Llama, Mistral, and Qwen while prioritizing user privacy. **Requires an account and an [API key](https://docs.venice.ai/overview/guides/generating-api-key)**.                 | `VENICE_API_KEY`, `VENICE_HOST` (optional), `VENICE_BASE_PATH` (optional), `VENICE_MODELS_PATH` (optional)                                                                          |
 | [xAI](https://x.ai/)                                                        | Access to xAI's Grok models including grok-3, grok-3-mini, and grok-3-fast with 131,072 token context window.                                                                                                            | `XAI_API_KEY`, `XAI_HOST` (optional)                                                                                                                                                |
 
+:::tip Prompt Caching for Claude Models
+goose automatically enables Anthropic's [prompt caching](https://platform.claude.com/docs/en/build-with-claude/prompt-caching) when using Claude models via Anthropic, Databricks, OpenRouter, and LiteLLM providers. This adds `cache_control` markers to requests, which can reduce costs for longer conversations by caching frequently-used context. See the [provider implementations](https://github.com/block/goose/tree/main/crates/goose/src/providers) for technical details.
+:::
+
 ### CLI Providers
 
 goose also supports special "pass-through" providers that work with existing CLI tools, allowing you to use your subscriptions instead of paying per token: