pydantic · dsfaccini · Mar 24, 2026 · Mar 25, 2026 · Mar 25, 2026 · Mar 25, 2026
diff --git a/DOCS-QUESTIONS.md b/DOCS-QUESTIONS.md
@@ -0,0 +1,63 @@
+# Capabilities Docs — Uncovered User Questions
+
+Questions users are likely to have that the current `docs/capabilities.md` doesn't cover (or barely touches).
+
+## 1. What's the generic type parameter on `AbstractCapability[T]`?
+
+Every example uses `AbstractCapability[Any]`. Users will wonder: what is `T`? How does it relate to the agent's dependency type? Can a capability declare that it needs specific dependencies, and what happens if the agent's dep type doesn't match?
+
+## 2. Can capabilities be added per-run?
+
+The docs only show capabilities at `Agent(...)` construction time. Users will want to know: can I pass extra capabilities to `agent.run()` or `agent.iter()`? Or override/disable specific ones for a single run?
+
+## 3. `BuiltinTool`, `Toolset`, and `HistoryProcessor` — how to use them?
+
+These three are listed in the built-in table but have zero documentation beyond the one-line row. A user reading top-to-bottom will see that `PrepareTools` and `PrefixTools` get dedicated sections while these three get none.
+
+## 4. `ImageGeneration` local fallback
+
+`WebSearch` auto-detects DuckDuckGo, `MCP` auto-detects transport — but what does `ImageGeneration(local=...)` expect? There's no example of providing a local image generation fallback.
+
+## 5. When to use a capability vs direct agent parameters
+
+The intro paragraph hints at this ('instructions and model settings are configured directly...'), but there's no decision framework. When should I use `Agent(instructions=...)` vs a capability that provides instructions? When should I register tools with `@agent.tool` vs bundling them in a capability?
+
+## 6. Ordering guidance for composition
+
+The composition section explains *mechanics* (`before_*` hooks fire in order, `after_*` hooks fire in reverse, `wrap_*` hooks nest) but gives no *practical guidance*. If I have a guardrail and a logger, which should come first? What's the mental model for deciding order?
+
+## 7. How does `for_run` interact with `get_toolset`?
+
+`get_toolset` is called at agent construction, while `for_run` is called per run. If my capability returns a toolset *and* overrides `for_run`, does the toolset come from the original instance or the fresh one? This matters a lot for stateful toolsets.
+
+## 8. Concurrency / thread safety
+
+If the same agent is used for concurrent runs (common in web servers), what happens to shared capability instances? `for_run` helps, but the docs don't mention concurrency at all. Users need to know whether `for_run` is required or just recommended for concurrent usage.
+
+## 9. Can capabilities interact with each other?
+
+No mention of inter-capability communication. If capability A needs to read state from capability B (e.g., a cost tracker that a budget limiter reads), how do they coordinate? Through dependencies? Through some other mechanism?
+
+## 10. Skip exceptions and hook chain behavior
+
+`SkipModelRequest`, `SkipToolValidation`, and `SkipToolExecution` are each mentioned in a single sentence. Users will want to know: when a skip is raised from `before_*`, do other capabilities' `before_*` hooks still fire? Do `after_*` hooks fire with the skip result? What about `wrap_*`?
+
+## 11. Error hook ordering with multiple capabilities
+
+When multiple capabilities define `on_*_error`, which fires first? If one recovers (returns a result), do the others still see the error? Or does recovery short-circuit?
+
+## 12. Message history interaction
+
+If `message_history` is passed to `agent.run()`, do `before_model_request` hooks see the historical messages in `request_context.messages`? Can they modify them?
+
+## 13. Testing custom capabilities
+
+No guidance on testing. Users building non-trivial capabilities (guardrails, approval workflows) will want patterns for: testing that hooks fire in the right order, testing error recovery, testing with `TestModel`, etc.
+
+## 14. Complete lifecycle flow diagram
+
+The error hook section has a small `before -> wrap -> after/on_error` ASCII diagram, but there's no full picture showing how all the hooks interact across a complete run (run -> node -> model request -> tool validate -> tool execute), especially with multiple capabilities composed.
+
+## Assessment
+
+Most of these are 'what happens when I combine X with Y' questions — the docs do a good job explaining each feature in isolation but leave composition, edge cases, and the full lifecycle under-documented. The biggest gaps are **#3** (undocumented builtins), **#1** (the type parameter), and **#5** (decision framework for when to use capabilities vs direct params).
diff --git a/NICHE-DOCS.md b/NICHE-DOCS.md
@@ -0,0 +1,49 @@
+# Capabilities — Niche Caveats Worth Documenting
+
+Brief notes on edge-case behaviors discovered during bug hunting (PR #4640).
+These don't warrant dedicated doc sections but should appear as caveats/notes
+near the relevant feature documentation once the bugs are fixed.
+
+## 1. DynamicToolset factory errors leave ambiguous state
+
+**Where to document:** `docs/toolsets.md`, near the dynamic toolset section.
+
+**Caveat:** If a `per_run_step=True` factory raises mid-run, the DynamicToolset
+doesn't roll back — the previous toolset stays referenced but the overall state
+is ambiguous. Worse: if the factory succeeds but the new toolset's `__aenter__`
+fails, the old toolset has already been exited and the new one never entered.
+`self._toolset` then points to a toolset that was never entered.
+
+**User guidance:** Factory functions used with `per_run_step=True` should be
+defensive — catch and handle their own errors rather than letting exceptions
+propagate into the toolset lifecycle. If the factory can fail, consider
+returning the previous toolset as a fallback instead of raising.
+
+## 2. Tool retry counts persist across DynamicToolset tool swaps
+
+**Where to document:** `docs/toolsets.md`, near dynamic toolsets or tool retries.
+
+**Caveat:** `ToolManager` tracks tool retries by **name** across run steps. If a
+`DynamicToolset` with `per_run_step=True` replaces a tool's implementation
+between steps but keeps the same name, the new implementation inherits the
+accumulated retry count from the old one. A fresh tool can be born with N
+retries already counted against it, hitting `max_retries` sooner than expected.
+
+**User guidance:** If tool implementations change between steps, use distinct
+tool names to avoid inheriting retry state. Alternatively, design tools with
+retry budgets that account for the possibility of inherited counts.
+
+## 3. History processor composition doesn't validate message consistency
+
+**Where to document:** `docs/message-history.md`, near the history processors section.
+
+**Caveat:** When multiple history processors are registered, they compose in
+sequence — processor 2 sees processor 1's output. There's no validation that
+the resulting messages are semantically consistent. A processor that removes
+`ModelResponse` messages containing `ToolCallPart` but leaves the subsequent
+`ModelRequest` with `ToolReturnPart` creates orphaned tool returns — the model
+sees a tool result with no preceding tool call.
+
+**User guidance:** When writing history processors that remove messages, ensure
+tool calls and their corresponding tool returns are removed together. A safe
+pattern is to track tool call IDs and remove both the call and return as a pair.
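
A minimal sketch of the pairing pattern from caveat 3 above, for illustration only. The dataclasses are simplified stand-ins that borrow the names of the pydantic_ai message types and assume only a `parts` list and a `tool_call_id` field; the real classes carry more. `drop_tool_exchanges` is a hypothetical helper, not part of the library.

```python
from dataclasses import dataclass, field


# Simplified stand-ins (assumption): just enough structure to show the pattern.
@dataclass
class UserPromptPart:
    content: str


@dataclass
class TextPart:
    content: str


@dataclass
class ToolCallPart:
    tool_name: str
    tool_call_id: str


@dataclass
class ToolReturnPart:
    tool_name: str
    content: str
    tool_call_id: str


@dataclass
class ModelRequest:
    parts: list = field(default_factory=list)


@dataclass
class ModelResponse:
    parts: list = field(default_factory=list)


def drop_tool_exchanges(messages, tool_name):
    """Remove every call to `tool_name` together with its matching return."""
    # Pass 1: collect the IDs of the tool calls that will be removed.
    dropped_ids = {
        part.tool_call_id
        for message in messages
        if isinstance(message, ModelResponse)
        for part in message.parts
        if isinstance(part, ToolCallPart) and part.tool_name == tool_name
    }

    # Pass 2: filter out both halves of each pair, then skip emptied messages.
    result = []
    for message in messages:
        kept = [
            part
            for part in message.parts
            if not (
                isinstance(part, (ToolCallPart, ToolReturnPart))
                and part.tool_call_id in dropped_ids
            )
        ]
        if kept:
            result.append(type(message)(parts=kept))
    return result


history = [
    ModelRequest(parts=[UserPromptPart("What is the weather?")]),
    ModelResponse(parts=[ToolCallPart("get_weather", "call_1")]),
    ModelRequest(parts=[ToolReturnPart("get_weather", "sunny", "call_1")]),
    ModelResponse(parts=[TextPart("It is sunny.")]),
]
cleaned = drop_tool_exchanges(history, "get_weather")
```

Matching on `tool_call_id` rather than on message position keeps the call and its return tied together even when other processors have already reordered or trimmed the history.
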
diff --git a/docs/capabilities.md b/docs/capabilities.md
@@ -45,7 +45,7 @@ agent = Agent(
 )
 ```
 
-[Instructions](agent.md#instructions) and [model settings](agent.md#model-run-settings) are configured directly via the `instructions` and `model_settings` parameters on `Agent` (or [`AgentSpec`][pydantic_ai.agent.spec.AgentSpec]). Capabilities are for behavior that goes beyond simple configuration — tools, lifecycle hooks, and custom extensions. They compose well, especially when you want to reuse the same configuration across multiple agents or load it from a [spec file](agent-spec.md).
+[Instructions](agent.md#instructions) and [model settings](agent.md#model-run-settings) are configured directly via the `instructions` and `model_settings` parameters on `Agent` (or [`AgentSpec`](agent-spec.md)). Capabilities are for behavior that goes beyond simple configuration — tools, lifecycle hooks, and custom extensions. They compose well, especially when you want to reuse the same configuration across multiple agents or load it from a [spec file](agent-spec.md).
 
 ### Thinking
 
@@ -353,11 +353,11 @@ The callable receives a [`RunContext`][pydantic_ai.tools.RunContext] where `ctx.
 
 | Method | Return type | Purpose |
 |---|---|---|
-| [`get_toolset()`][pydantic_ai.capabilities.AbstractCapability.get_toolset] | [`AgentToolset`][pydantic_ai.toolsets.AgentToolset] ` \| None` | A [toolset](toolsets.md) to register (or a callable for [dynamic toolsets](toolsets.md#dynamically-building-a-toolset)) |
+| [`get_toolset()`][pydantic_ai.capabilities.AbstractCapability.get_toolset] | `AgentToolset \| None` | A [toolset](toolsets.md) to register (or a callable for [dynamic toolsets](toolsets.md#dynamically-building-a-toolset)) |
 | [`get_builtin_tools()`][pydantic_ai.capabilities.AbstractCapability.get_builtin_tools] | `Sequence[`[`AgentBuiltinTool`][pydantic_ai.tools.AgentBuiltinTool]`]` | [Builtin tools](builtin-tools.md) to register (including callables) |
-| [`get_wrapper_toolset()`][pydantic_ai.capabilities.AbstractCapability.get_wrapper_toolset] | [`AbstractToolset`][pydantic_ai.toolsets.AbstractToolset] ` \| None` | [Wrap the agent's assembled toolset](#toolset-wrapping) |
-| [`get_instructions()`][pydantic_ai.capabilities.AbstractCapability.get_instructions] | [`AgentInstructions`][pydantic_ai._instructions.AgentInstructions] ` \| None` | [Instructions](agent.md#instructions) (static strings, [template strings](agent-spec.md#template-strings), or callables) |
-| [`get_model_settings()`][pydantic_ai.capabilities.AbstractCapability.get_model_settings] | [`AgentModelSettings`][pydantic_ai.agent.abstract.AgentModelSettings] ` \| None` | [Model settings](agent.md#model-run-settings) dict, or a callable for per-step settings |
+| [`get_wrapper_toolset()`][pydantic_ai.capabilities.AbstractCapability.get_wrapper_toolset] | [`AbstractToolset`][pydantic_ai.toolsets.AbstractToolset] `\| None` | [Wrap the agent's assembled toolset](#toolset-wrapping) |
+| [`get_instructions()`][pydantic_ai.capabilities.AbstractCapability.get_instructions] | `AgentInstructions \| None` | [Instructions](agent.md#instructions) (static strings, [template strings](agent-spec.md#template-strings), or callables) |
+| [`get_model_settings()`][pydantic_ai.capabilities.AbstractCapability.get_model_settings] | `AgentModelSettings \| None` | [Model settings](agent.md#model-run-settings) dict, or a callable for per-step settings |
 
 ### Hooking into the lifecycle
 
@@ -391,7 +391,7 @@ Capabilities can hook into five lifecycle points, each with up to four variants:
 | [`wrap_node_run`][pydantic_ai.capabilities.AbstractCapability.wrap_node_run] | `(ctx: RunContext, *, node: AgentNode, handler: WrapNodeRunHandler) -> NodeResult` | Wrap each graph node execution |
 | [`on_node_run_error`][pydantic_ai.capabilities.AbstractCapability.on_node_run_error] | `(ctx: RunContext, *, node: AgentNode, error: BaseException) -> NodeResult` | Handle node errors (see [error hooks](#error-hooks)) |
 
-[`wrap_node_run`][pydantic_ai.capabilities.AbstractCapability.wrap_node_run] fires for every node in the [agent graph](agent.md#iterating-over-an-agents-graph) ([`UserPromptNode`][pydantic_ai.UserPromptNode], [`ModelRequestNode`][pydantic_ai.ModelRequestNode], [`CallToolsNode`][pydantic_ai.CallToolsNode]). Override this to observe node transitions, add per-step logging, or modify graph progression:
+[`wrap_node_run`][pydantic_ai.capabilities.AbstractCapability.wrap_node_run] fires for every node in the [agent graph](agent.md#iterating-over-an-agents-graph) (`UserPromptNode`, `ModelRequestNode`, `CallToolsNode`). Override this to observe node transitions, add per-step logging, or modify graph progression:
 
 !!! note
     `wrap_node_run` hooks are called automatically by [`agent.run()`][pydantic_ai.agent.AbstractAgent.run], [`agent.run_stream()`][pydantic_ai.agent.AbstractAgent.run_stream], and [`agent_run.next()`][pydantic_ai.run.AgentRun.next]. However, they are **not** called when iterating with bare `async for node in agent_run:` over [`agent.iter()`][pydantic_ai.agent.Agent.iter], since that uses the graph run's internal iteration. Always use `agent_run.next(node)` to advance the run if you need `wrap_node_run` hooks to fire.
@@ -475,7 +475,7 @@ See [Iterating Over an Agent's Graph](agent.md#iterating-over-an-agents-graph) f
 | [`wrap_model_request`][pydantic_ai.capabilities.AbstractCapability.wrap_model_request] | `(ctx: RunContext, *, request_context: ModelRequestContext, handler: WrapModelRequestHandler) -> ModelResponse` | Wrap the model call |
 | [`on_model_request_error`][pydantic_ai.capabilities.AbstractCapability.on_model_request_error] | `(ctx: RunContext, *, request_context: ModelRequestContext, error: Exception) -> ModelResponse` | Handle model request errors (see [error hooks](#error-hooks)) |
 
-[`ModelRequestContext`][pydantic_ai.models.ModelRequestContext] bundles `messages`, `model_settings`, and `model_request_parameters` into a single object, making the signature future-proof.
+`ModelRequestContext` bundles `messages`, `model_settings`, and `model_request_parameters` into a single object, making the signature future-proof.
 
 To skip the model call entirely and provide a replacement response, raise [`SkipModelRequest(response)`][pydantic_ai.exceptions.SkipModelRequest] from `before_model_request` or `wrap_model_request`.
 
@@ -874,7 +874,7 @@ agent = Agent.from_spec(
 )
 ```
 
-Users register custom capability types via the `custom_capability_types` parameter on [`Agent.from_spec`][pydantic_ai.Agent.from_spec] or [`Agent.from_file`][pydantic_ai.Agent.from_file].
+Users register custom capability types via the `custom_capability_types` parameter on [`Agent.from_spec`][pydantic_ai.agent.Agent.from_spec] or [`Agent.from_file`][pydantic_ai.agent.Agent.from_file].
 
 Override [`from_spec`][pydantic_ai.capabilities.AbstractCapability.from_spec] when the constructor takes types that can't be represented in YAML/JSON. The spec fields should mirror the dataclass fields, but with serializable types:
 

diff --git a/docs/hooks.md b/docs/hooks.md
@@ -96,7 +96,7 @@ Run hooks fire once per agent run. `wrap_run` (registered via `hooks.on.run`) wr
 | `node_run` | `node_run=` | `wrap_node_run` |
 | `node_run_error` | `node_run_error=` | `on_node_run_error` |
 
-Node hooks fire for each graph step ([`UserPromptNode`][pydantic_ai.UserPromptNode], [`ModelRequestNode`][pydantic_ai.ModelRequestNode], [`CallToolsNode`][pydantic_ai.CallToolsNode]).
+Node hooks fire for each graph step (`UserPromptNode`, `ModelRequestNode`, `CallToolsNode`).
 
 !!! note
     `wrap_node_run` hooks are called automatically by [`agent.run()`][pydantic_ai.agent.AbstractAgent.run], [`agent.run_stream()`][pydantic_ai.agent.AbstractAgent.run_stream], and [`agent_run.next()`][pydantic_ai.run.AgentRun.next], but **not** when iterating with bare `async for node in agent_run:`.
@@ -110,7 +110,7 @@ Node hooks fire for each graph step ([`UserPromptNode`][pydantic_ai.UserPromptNo
 | `model_request` | `model_request=` | `wrap_model_request` |
 | `model_request_error` | `model_request_error=` | `on_model_request_error` |
 
-Model request hooks fire around each LLM call. [`ModelRequestContext`][pydantic_ai.models.ModelRequestContext] bundles `messages`, `model_settings`, and `model_request_parameters`.
+Model request hooks fire around each LLM call. `ModelRequestContext` bundles `messages`, `model_settings`, and `model_request_parameters`.
 
 To skip the model call entirely, raise [`SkipModelRequest(response)`][pydantic_ai.exceptions.SkipModelRequest] from `before_model_request` or `model_request` (wrap).