Changes from 1 commit
63 changes: 63 additions & 0 deletions DOCS-QUESTIONS.md
@@ -0,0 +1,63 @@
# Capabilities Docs — Uncovered User Questions

Questions users are likely to have that the current `docs/capabilities.md` doesn't cover (or barely touches).

## 1. What's the generic type parameter on `AbstractCapability[T]`?

Every example uses `AbstractCapability[Any]`. Users will wonder: what is `T`? How does it relate to the agent's dependency type? Can a capability declare that it needs specific dependencies, and what happens if the agent's dep type doesn't match?
> **Collaborator:** Good point, we should make it clear it's generic in the agent's dependency type. `AbstractToolset` is as well; not sure if it's explicitly documented there.


## 2. Can capabilities be added per-run?

The docs only show capabilities at `Agent(...)` construction time. Users will want to know: can I pass extra capabilities to `agent.run()` or `agent.iter()`? Or override/disable specific ones for a single run?
> **Collaborator:** Good point, capabilities can't currently be passed into `agent.run` or `agent.override` directly, although it can be done indirectly via `spec=AgentSpec(capabilities=...)`. In #4807 I started working on letting them be overridden/disabled one by one, and I can directly support `capabilities=...` on `run` and `override` there as well.

> **Collaborator (Author):** Super; leaving the comment open as a reminder, but not doing anything about it here.


## 3. `BuiltinTool`, `Toolset`, and `HistoryProcessor` — how to use them?

These three are listed in the built-in table but have zero documentation beyond the one-line row. A user reading top-to-bottom will see `PrepareTools` and `PrefixTools` get dedicated sections but these don't.
> **Collaborator:** Yeah, we should show examples, although the API docs should make it pretty clear how to use them.
>
> In #4828 I'm also making the `BuiltinOrLocalTool` capability public.


## 4. `ImageGeneration` local fallback

`WebSearch` auto-detects DuckDuckGo, `MCP` auto-detects transport — but what does `ImageGeneration(local=...)` expect? There's no example of providing a local image generation fallback.
> **Collaborator:** Any user tool that the agent should use instead of builtin image generation :) That could be a tool that directly uses an image generation model. Would be worth mentioning explicitly.


## 5. When to use a capability vs direct agent parameters

The intro paragraph hints at this ('instructions and model settings are configured directly...'), but there's no decision framework. When should I use `Agent(instructions=...)` vs a capability that provides instructions? When should I register tools with `@agent.tool` vs bundling them in a capability?

## 6. Ordering guidance for composition

The composition section explains *mechanics* (before fires in order, after in reverse, wrap nests) but gives no *practical guidance*. If I have a guardrail and a logger, which should come first? What's the mental model for deciding order?
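A standalone sketch of the mechanics the docs do describe (plain functions, no `pydantic_ai` imports; the capability names are illustrative) shows why ordering matters: `before_*` hooks fire in registration order and `after_*` in reverse, so the first-registered capability sits outermost in the nesting.

```python
# Sketch of the documented composition mechanics: before_* hooks run in
# registration order, after_* hooks in reverse -- i.e. wrap-style nesting.
calls: list[str] = []


def make_capability(name: str):
    def before() -> None:
        calls.append(f"{name}.before")

    def after() -> None:
        calls.append(f"{name}.after")

    return before, after


capabilities = [make_capability("guardrail"), make_capability("logger")]

for before, _ in capabilities:  # registration order
    before()
calls.append("model_request")
for _, after in reversed(capabilities):  # reverse order
    after()

# calls is now:
# ['guardrail.before', 'logger.before', 'model_request',
#  'logger.after', 'guardrail.after']
```

The open question for the docs is which capability *should* be outermost in common pairings; the sketch only shows what each position means.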

## 7. How does `for_run` interact with `get_toolset`?

`get_toolset` is called at agent construction, `for_run` is called per-run. If my capability returns a toolset *and* overrides `for_run`, does the toolset come from the original instance or the fresh one? This matters a lot for stateful toolsets.

## 8. Concurrency / thread safety

If the same agent is used for concurrent runs (common in web servers), what happens to shared capability instances? `for_run` helps, but the docs don't mention concurrency at all. Users need to know if `for_run` is required or just recommended for concurrent usage.
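A standalone sketch of why `for_run` matters here (illustrative class, not the real API): any mutable state on a shared capability instance is a data race under concurrent runs, and a per-run copy sidesteps it.

```python
import copy


# Illustrative capability with mutable per-run state. The instance passed to
# Agent(...) at construction is shared by every run; a for_run-style hook
# hands each run its own copy so concurrent runs don't race on `used`.
class TokenBudget:
    def __init__(self, limit: int) -> None:
        self.limit = limit
        self.used = 0  # mutable state -- unsafe to share across runs

    def for_run(self) -> "TokenBudget":
        return copy.copy(self)  # fresh instance per run


shared = TokenBudget(limit=1000)
run_a, run_b = shared.for_run(), shared.for_run()
run_a.used += 400  # one run's spending...
assert run_b.used == 0 and shared.used == 0  # ...doesn't leak into others
```

Whether skipping `for_run` is merely risky or actually unsupported for concurrent use is the part the docs would need to state.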

## 9. Can capabilities interact with each other?

No mention of inter-capability communication. If capability A needs to read state from capability B (e.g., a cost tracker that a budget limiter reads), how do they coordinate? Through dependencies? Through some other mechanism?

## 10. Skip exceptions and hook chain behavior

`SkipModelRequest`, `SkipToolValidation`, and `SkipToolExecution` are mentioned one sentence each. Users will want to know: when a skip is raised from `before_*`, do other capabilities' `before_*` hooks still fire? Do `after_*` hooks fire with the skip result? What about `wrap_*`?
> **Collaborator:** Good point, should be mentioned explicitly. Claude can probably find out (from code or testing it out) what the current behavior is :)


## 11. Error hook ordering with multiple capabilities

When multiple capabilities define `on_*_error`, which fires first? If one recovers (returns a result), do the others still see the error? Or does recovery short-circuit?

## 12. Message history interaction

If `message_history` is passed to `agent.run()`, do `before_model_request` hooks see the historical messages in `request_context.messages`? Can they modify them?

## 13. Testing custom capabilities

No guidance on testing. Users building non-trivial capabilities (guardrails, approval workflows) will want patterns for: testing that hooks fire in the right order, testing error recovery, testing with `TestModel`, etc.

## 14. Complete lifecycle flow diagram

The error hook section has a small `before -> wrap -> after/on_error` ASCII diagram, but there's no full picture showing how all the hooks interact across a complete run (run -> node -> model request -> tool validate -> tool execute), especially with multiple capabilities composed.

## Assessment

Most of these are 'what happens when I combine X with Y' questions — the docs do a good job explaining each feature in isolation but leave composition, edge cases, and the full lifecycle under-documented. The biggest gaps are **#3** (undocumented builtins), **#1** (the type parameter), and **#5** (decision framework for when to use capabilities vs direct params).
49 changes: 49 additions & 0 deletions NICHE-DOCS.md
@@ -0,0 +1,49 @@
# Capabilities — Niche Caveats Worth Documenting

Brief notes on edge-case behaviors discovered during bug hunting (PR #4640).
These don't warrant dedicated doc sections but should appear as caveats/notes
near the relevant feature documentation once the bugs are fixed.

## 1. DynamicToolset factory errors leave ambiguous state

**Where to document:** `docs/toolsets.md`, near the dynamic toolset section.

**Caveat:** If a `per_run_step=True` factory raises mid-run, the DynamicToolset
doesn't roll back — the previous toolset stays referenced but the overall state
is ambiguous. Worse: if the factory succeeds but the new toolset's `__aenter__`
fails, the old toolset has already been exited and the new one never entered.
`self._toolset` then points to an uninitialized toolset.

**User guidance:** Factory functions used with `per_run_step=True` should be
defensive — catch and handle their own errors rather than letting exceptions
propagate into the toolset lifecycle. If the factory can fail, consider
returning the previous toolset as a fallback instead of raising.
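That fallback guidance can be sketched as a small wrapper (names illustrative; the real dynamic-toolset factory signature, which takes a run context, is assumed):

```python
def make_defensive_factory(build_toolset, fallback=None):
    """Wrap a per-step toolset factory so errors fall back to the last
    good toolset instead of propagating into the toolset lifecycle."""
    state = {"previous": fallback}

    def factory(ctx):
        try:
            toolset = build_toolset(ctx)
        except Exception:
            # Reuse the last successfully built toolset rather than
            # leaving the DynamicToolset in an ambiguous half-swapped state.
            return state["previous"]
        state["previous"] = toolset
        return toolset

    return factory
```

This only guards the factory call itself; an `__aenter__` failure on the new toolset still needs the upstream fix.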

## 2. Tool retry counts persist across DynamicToolset tool swaps

**Where to document:** `docs/toolsets.md`, near dynamic toolsets or tool retries.

**Caveat:** `ToolManager` tracks tool retries by **name** across run steps. If a
`DynamicToolset` with `per_run_step=True` replaces a tool's implementation
between steps but keeps the same name, the new implementation inherits the
accumulated retry count from the old one. A fresh tool can be born with N
retries already counted against it, hitting `max_retries` sooner than expected.

**User guidance:** If tool implementations change between steps, use distinct
tool names to avoid inheriting retry state. Alternatively, design tools with
retry budgets that account for the possibility of inherited counts.

## 3. History processor composition doesn't validate message consistency

**Where to document:** `docs/message-history.md`, near the history processors section.

**Caveat:** When multiple history processors are registered, they compose in
sequence — processor 2 sees processor 1's output. There's no validation that
the resulting messages are semantically consistent. A processor that removes
`ModelResponse` messages containing `ToolCallPart` but leaves the subsequent
`ModelRequest` with `ToolReturnPart` creates orphaned tool returns — the model
sees a tool result with no preceding tool call.

**User guidance:** When writing history processors that remove messages, ensure
tool calls and their corresponding tool returns are removed together. A safe
pattern is to track tool call IDs and remove both the call and return as a pair.
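A sketch of that safe pattern, using plain dicts in place of the real `ModelResponse`/`ModelRequest` parts (the message shape here is illustrative; the real types live in `pydantic_ai.messages`):

```python
def drop_tool_exchange(messages: list[dict], tool_call_id: str) -> list[dict]:
    """Remove a tool call and its matching tool return as a pair."""
    return [m for m in messages if m.get("tool_call_id") != tool_call_id]


history = [
    {"role": "user", "content": "What's the weather in Paris?"},
    {"role": "model", "kind": "tool_call", "tool_call_id": "call_1"},
    {"role": "request", "kind": "tool_return", "tool_call_id": "call_1"},
    {"role": "model", "content": "It's sunny."},
]

# Both halves of the exchange go together -- no orphaned tool return is
# left behind for the model to trip over.
trimmed = drop_tool_exchange(history, "call_1")
```

Filtering on the shared tool call ID is what keeps the call and its return in sync, regardless of where they sit in the history.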
16 changes: 8 additions & 8 deletions docs/capabilities.md
@@ -45,7 +45,7 @@ agent = Agent(
)
```

-[Instructions](agent.md#instructions) and [model settings](agent.md#model-run-settings) are configured directly via the `instructions` and `model_settings` parameters on `Agent` (or [`AgentSpec`][pydantic_ai.agent.spec.AgentSpec]). Capabilities are for behavior that goes beyond simple configuration — tools, lifecycle hooks, and custom extensions. They compose well, especially when you want to reuse the same configuration across multiple agents or load it from a [spec file](agent-spec.md).
+[Instructions](agent.md#instructions) and [model settings](agent.md#model-run-settings) are configured directly via the `instructions` and `model_settings` parameters on `Agent` (or [`AgentSpec`](agent-spec.md)). Capabilities are for behavior that goes beyond simple configuration — tools, lifecycle hooks, and custom extensions. They compose well, especially when you want to reuse the same configuration across multiple agents or load it from a [spec file](agent-spec.md).

### Thinking

@@ -353,11 +353,11 @@ The callable receives a [`RunContext`][pydantic_ai.tools.RunContext] where `ctx.

| Method | Return type | Purpose |
|---|---|---|
-| [`get_toolset()`][pydantic_ai.capabilities.AbstractCapability.get_toolset] | [`AgentToolset`][pydantic_ai.toolsets.AgentToolset] ` \| None` | A [toolset](toolsets.md) to register (or a callable for [dynamic toolsets](toolsets.md#dynamically-building-a-toolset)) |
+| [`get_toolset()`][pydantic_ai.capabilities.AbstractCapability.get_toolset] | `AgentToolset \| None` | A [toolset](toolsets.md) to register (or a callable for [dynamic toolsets](toolsets.md#dynamically-building-a-toolset)) |
 | [`get_builtin_tools()`][pydantic_ai.capabilities.AbstractCapability.get_builtin_tools] | `Sequence[`[`AgentBuiltinTool`][pydantic_ai.tools.AgentBuiltinTool]`]` | [Builtin tools](builtin-tools.md) to register (including callables) |
-| [`get_wrapper_toolset()`][pydantic_ai.capabilities.AbstractCapability.get_wrapper_toolset] | [`AbstractToolset`][pydantic_ai.toolsets.AbstractToolset] ` \| None` | [Wrap the agent's assembled toolset](#toolset-wrapping) |
-| [`get_instructions()`][pydantic_ai.capabilities.AbstractCapability.get_instructions] | [`AgentInstructions`][pydantic_ai._instructions.AgentInstructions] ` \| None` | [Instructions](agent.md#instructions) (static strings, [template strings](agent-spec.md#template-strings), or callables) |
-| [`get_model_settings()`][pydantic_ai.capabilities.AbstractCapability.get_model_settings] | [`AgentModelSettings`][pydantic_ai.agent.abstract.AgentModelSettings] ` \| None` | [Model settings](agent.md#model-run-settings) dict, or a callable for per-step settings |
+| [`get_wrapper_toolset()`][pydantic_ai.capabilities.AbstractCapability.get_wrapper_toolset] | [`AbstractToolset`][pydantic_ai.toolsets.AbstractToolset] `\| None` | [Wrap the agent's assembled toolset](#toolset-wrapping) |
+| [`get_instructions()`][pydantic_ai.capabilities.AbstractCapability.get_instructions] | `AgentInstructions \| None` | [Instructions](agent.md#instructions) (static strings, [template strings](agent-spec.md#template-strings), or callables) |
+| [`get_model_settings()`][pydantic_ai.capabilities.AbstractCapability.get_model_settings] | `AgentModelSettings \| None` | [Model settings](agent.md#model-run-settings) dict, or a callable for per-step settings |

### Hooking into the lifecycle

@@ -391,7 +391,7 @@ Capabilities can hook into five lifecycle points, each with up to four variants:
| [`wrap_node_run`][pydantic_ai.capabilities.AbstractCapability.wrap_node_run] | `(ctx: RunContext, *, node: AgentNode, handler: WrapNodeRunHandler) -> NodeResult` | Wrap each graph node execution |
| [`on_node_run_error`][pydantic_ai.capabilities.AbstractCapability.on_node_run_error] | `(ctx: RunContext, *, node: AgentNode, error: BaseException) -> NodeResult` | Handle node errors (see [error hooks](#error-hooks)) |

-[`wrap_node_run`][pydantic_ai.capabilities.AbstractCapability.wrap_node_run] fires for every node in the [agent graph](agent.md#iterating-over-an-agents-graph) ([`UserPromptNode`][pydantic_ai.UserPromptNode], [`ModelRequestNode`][pydantic_ai.ModelRequestNode], [`CallToolsNode`][pydantic_ai.CallToolsNode]). Override this to observe node transitions, add per-step logging, or modify graph progression:
+[`wrap_node_run`][pydantic_ai.capabilities.AbstractCapability.wrap_node_run] fires for every node in the [agent graph](agent.md#iterating-over-an-agents-graph) (`UserPromptNode`, `ModelRequestNode`, `CallToolsNode`). Override this to observe node transitions, add per-step logging, or modify graph progression:

!!! note
`wrap_node_run` hooks are called automatically by [`agent.run()`][pydantic_ai.agent.AbstractAgent.run], [`agent.run_stream()`][pydantic_ai.agent.AbstractAgent.run_stream], and [`agent_run.next()`][pydantic_ai.run.AgentRun.next]. However, they are **not** called when iterating with bare `async for node in agent_run:` over [`agent.iter()`][pydantic_ai.agent.Agent.iter], since that uses the graph run's internal iteration. Always use `agent_run.next(node)` to advance the run if you need `wrap_node_run` hooks to fire.
@@ -475,7 +475,7 @@ See [Iterating Over an Agent's Graph](agent.md#iterating-over-an-agents-graph) f
| [`wrap_model_request`][pydantic_ai.capabilities.AbstractCapability.wrap_model_request] | `(ctx: RunContext, *, request_context: ModelRequestContext, handler: WrapModelRequestHandler) -> ModelResponse` | Wrap the model call |
| [`on_model_request_error`][pydantic_ai.capabilities.AbstractCapability.on_model_request_error] | `(ctx: RunContext, *, request_context: ModelRequestContext, error: Exception) -> ModelResponse` | Handle model request errors (see [error hooks](#error-hooks)) |

-[`ModelRequestContext`][pydantic_ai.models.ModelRequestContext] bundles `messages`, `model_settings`, and `model_request_parameters` into a single object, making the signature future-proof.
+`ModelRequestContext` bundles `messages`, `model_settings`, and `model_request_parameters` into a single object, making the signature future-proof.

To skip the model call entirely and provide a replacement response, raise [`SkipModelRequest(response)`][pydantic_ai.exceptions.SkipModelRequest] from `before_model_request` or `wrap_model_request`.

@@ -874,7 +874,7 @@ agent = Agent.from_spec(
)
```

-Users register custom capability types via the `custom_capability_types` parameter on [`Agent.from_spec`][pydantic_ai.Agent.from_spec] or [`Agent.from_file`][pydantic_ai.Agent.from_file].
+Users register custom capability types via the `custom_capability_types` parameter on [`Agent.from_spec`][pydantic_ai.agent.Agent.from_spec] or [`Agent.from_file`][pydantic_ai.agent.Agent.from_file].

Override [`from_spec`][pydantic_ai.capabilities.AbstractCapability.from_spec] when the constructor takes types that can't be represented in YAML/JSON. The spec fields should mirror the dataclass fields, but with serializable types:

4 changes: 2 additions & 2 deletions docs/hooks.md
@@ -96,7 +96,7 @@ Run hooks fire once per agent run. `wrap_run` (registered via `hooks.on.run`) wr
| `node_run` | `node_run=` | `wrap_node_run` |
| `node_run_error` | `node_run_error=` | `on_node_run_error` |

-Node hooks fire for each graph step ([`UserPromptNode`][pydantic_ai.UserPromptNode], [`ModelRequestNode`][pydantic_ai.ModelRequestNode], [`CallToolsNode`][pydantic_ai.CallToolsNode]).
+Node hooks fire for each graph step (`UserPromptNode`, `ModelRequestNode`, `CallToolsNode`).

!!! note
`wrap_node_run` hooks are called automatically by [`agent.run()`][pydantic_ai.agent.AbstractAgent.run], [`agent.run_stream()`][pydantic_ai.agent.AbstractAgent.run_stream], and [`agent_run.next()`][pydantic_ai.run.AgentRun.next], but **not** when iterating with bare `async for node in agent_run:`.
Expand All @@ -110,7 +110,7 @@ Node hooks fire for each graph step ([`UserPromptNode`][pydantic_ai.UserPromptNo
| `model_request` | `model_request=` | `wrap_model_request` |
| `model_request_error` | `model_request_error=` | `on_model_request_error` |

-Model request hooks fire around each LLM call. [`ModelRequestContext`][pydantic_ai.models.ModelRequestContext] bundles `messages`, `model_settings`, and `model_request_parameters`.
+Model request hooks fire around each LLM call. `ModelRequestContext` bundles `messages`, `model_settings`, and `model_request_parameters`.

To skip the model call entirely, raise [`SkipModelRequest(response)`][pydantic_ai.exceptions.SkipModelRequest] from `before_model_request` or `model_request` (wrap).
