
Fix capability schema generation to include full parameter types#4867

Merged
DouweM merged 17 commits into main from agentspec-capabilities-schema
Mar 27, 2026

@DouweM
Collaborator

@DouweM DouweM commented Mar 26, 2026

Summary

  • Fix model_json_schema_with_capabilities() to generate full parameter schemas for all capabilities, not just string forms
  • When from_spec is not overridden, fall back to inspecting __init__ for schema generation
  • Add from_spec overrides to WebSearch, WebFetch, MCP, and ImageGeneration exposing only serializable spec-relevant params
  • Add self/cls parameter skipping in build_schema_types to support __init__ as schema target
  • Replace existing schema test with full snapshot() assertion

Before: Thinking, WebSearch, WebFetch, MCP, ImageGeneration only appeared as string constants in the schema (e.g. "Thinking")
After: Full parameter schemas are generated matching the documented spec syntax:

  • Thinking: high / {Thinking: {effort: high}}
  • WebSearch: {search_context_size: high, blocked_domains: [...]}
  • MCP: https://example.com / {MCP: {url: ..., id: ..., allowed_tools: [...]}}
  • etc.
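
The short and long spec forms listed above could be normalized along the following lines. This is only a sketch: `normalize_spec_item` is a hypothetical helper invented for illustration, not a function from this PR, and it assumes a short form carries a single positional argument while a long form carries keyword arguments.

```python
from typing import Any


def normalize_spec_item(item: Any) -> tuple[str, tuple[Any, ...], dict[str, Any]]:
    """Return (name, args, kwargs) for one capability spec entry (hypothetical helper)."""
    if isinstance(item, str):
        # Bare name, e.g. "Thinking": no arguments at all.
        return item, (), {}
    if isinstance(item, dict) and len(item) == 1:
        name, value = next(iter(item.items()))
        if isinstance(value, dict):
            # Long form, e.g. {"Thinking": {"effort": "high"}}: keyword arguments.
            return name, (), dict(value)
        # Short form, e.g. {"Thinking": "high"}: a single positional argument.
        return name, (value,), {}
    raise ValueError(f'Unsupported spec entry: {item!r}')


print(normalize_spec_item({'Thinking': 'high'}))
print(normalize_spec_item({'MCP': {'url': 'https://example.com'}}))
```

A JSON schema for the spec has to describe each of these shapes per capability, which is why a schema containing only the bare-name constant is incomplete.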

Test plan

  • All 277 test_capabilities.py tests pass
  • Snapshot test captures full schema output for regression detection
  • Pre-commit hooks pass (format, lint, typecheck)

🤖 Generated with Claude Code

Previously, `model_json_schema_with_capabilities()` only generated string
forms for capabilities like Thinking, WebSearch, WebFetch, MCP, and
ImageGeneration because `build_schema_types` inspected the base
`AbstractCapability.from_spec(*args, **kwargs)` signature and saw no
typed parameters.

Fix by:
- Falling back to `__init__` when `from_spec` is not overridden
- Adding `self`/`cls` parameter skipping in `build_schema_types`
- Adding explicit `from_spec` overrides to capabilities with
  non-serializable `__init__` params (WebSearch, WebFetch, MCP,
  ImageGeneration) that expose only spec-relevant fields

The schema now supports all documented spec syntax forms:
- `Thinking: high` (short form with effort level)
- `WebSearch: {search_context_size: high}` (long form with params)
- `MCP: https://example.com` (short form with URL)
- etc.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
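
To illustrate why inspecting the base `from_spec(*args, **kwargs)` signature yields no typed parameters while `__init__` does, here is a minimal sketch. `BaseCapability`, `Thinking`, and `typed_params` below are hypothetical stand-ins written for this example; they are not the classes or helpers from the PR.

```python
import inspect
from typing import Any


class BaseCapability:
    """Hypothetical stand-in for a capability base class."""

    @classmethod
    def from_spec(cls, *args: Any, **kwargs: Any) -> Any:
        return cls(*args, **kwargs)


class Thinking(BaseCapability):
    """Hypothetical capability that does not override from_spec."""

    def __init__(self, effort: str = 'medium') -> None:
        self.effort = effort


def typed_params(func: Any) -> dict[str, Any]:
    """Collect named parameters and their annotations, skipping self/cls and *args/**kwargs."""
    result: dict[str, Any] = {}
    for name, param in inspect.signature(func).parameters.items():
        if name in ('self', 'cls'):
            continue
        if param.kind in (param.VAR_POSITIONAL, param.VAR_KEYWORD):
            continue
        result[name] = param.annotation
    return result


print(typed_params(Thinking.from_spec))  # {}
print(typed_params(Thinking.__init__))  # {'effort': <class 'str'>}
```

The inherited `from_spec` exposes nothing a schema could describe, whereas `__init__` carries the real parameter names and types once `self` is skipped.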
@github-actions github-actions bot added the labels size: M (Medium PR, 101-500 weighted lines) and bug (Report that something isn't working, or PR implementing a fix) on Mar 26, 2026
return None

@classmethod
def from_spec(
Contributor

Nit: In the other capability classes (ImageGeneration, MCP, WebFetch), from_spec is placed right after __init__. Here it's placed after _default_local, near the end of the class. Consider moving it to right after __init__ (after the __post_init__() call) for consistency.

Contributor

@devin-ai-integration devin-ai-integration bot left a comment

✅ Devin Review: No Issues Found

Devin Review analyzed this PR and found no potential bugs to report.

View in Devin Review to see 4 additional findings.

@github-actions
Contributor

github-actions bot commented Mar 26, 2026

Docs Preview

commit: ff2d812
Preview URL: https://be22371b-pydantic-ai-previews.pydantic.workers.dev

DouweM and others added 4 commits March 26, 2026 22:42
…overage

Address review comment about from_spec placement consistency.
Add tests that load ImageGeneration, WebFetch, and MCP from specs
to cover the new from_spec methods.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Instead of duplicating __init__ params in from_spec overrides, add
filter_serializable_type to strip TypeVars and Callables from union
types automatically. This means:

- WebSearch, WebFetch, ImageGeneration no longer need from_spec
  overrides — their schemas are derived directly from __init__
- builtin params now correctly show the specific tool type (e.g.
  ImageGenerationTool | bool) instead of just bool
- MCP still needs from_spec due to TYPE_CHECKING-only imports
  preventing hint resolution, but now accepts MCPServerTool | bool

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Instead of duplicating __init__ params in from_spec overrides, add
filter_serializable_type to strip TypeVars and Callables from union
types automatically. This means:

- WebSearch, WebFetch, ImageGeneration no longer need from_spec
  overrides — their schemas are derived directly from __init__
- builtin params now correctly show the specific tool type (e.g.
  ImageGenerationTool | bool) instead of just bool
- MCP still needs from_spec due to TYPE_CHECKING-only imports
  preventing hint resolution, but now accepts MCPServerTool | bool
- Add test for schema filtering with custom capability having
  mixed serializable/non-serializable params

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

def _build_capability_schema_types(registry: Mapping[str, type[Any]]) -> list[Any]:
    """Build a list of schema types for capabilities from a registry."""
    from pydantic_ai._utils import get_function_type_hints
Collaborator Author

Move imports to the top; this one and the one below


def _filter_serializable_type(tp: object) -> object | None:
    import types
    import typing
Collaborator Author

Imports at the top please


max_retries: int
on_error: Any # Callable — not serializable, will be filtered from schema
verbose: Any # Callable | bool in __init__ — only bool survives in schema
Collaborator Author

Can this actually have the callable type, and we use the dataclass's autogenerated init method? Or do we need a custom init?

{'$ref': '#/$defs/ImageGenerationTool'},
{
'oneOf': [
{'$ref': '#/$defs/WebSearchTool'},
Collaborator Author

The ImageGeneration capability should only accept ImageGenerationTool for builtin (and bool), not all the other builtin tool types. Same for the other capabilities for specific builtin tool types -- they should only support the intended one.

},
'$schema': {'type': 'string'},
},
'title': '_AgentSpecSchema',
Collaborator Author

Can we make this title 'AgentSpec'?

…emoval

- Replace `AgentBuiltinTool[AgentDepsT]` with specific callable signatures
  in WebSearch, WebFetch, ImageGeneration, MCP __init__ types
  (e.g. `ImageGenerationTool | Callable[[RunContext[AgentDepsT]], ...]`)
  so schema shows only the intended tool type + bool, not all builtins
- Remove superclass dedup logic (no longer needed with precise types)
- Move imports to top level in _spec.py and agent/spec.py
- Simplify test capability to use dataclass auto-generated init
- Set schema title to 'AgentSpec'

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
try:
    get_function_type_hints(cls.__init__)
    return cls.__init__
except Exception:
Contributor

Bare except Exception is too broad here. The intent is to catch failures from unresolvable type hints (e.g. TYPE_CHECKING-guarded imports), which would raise NameError. Consider catching (NameError, TypeError) or at most (NameError, TypeError, AttributeError) to avoid silently swallowing unexpected errors like RuntimeError or KeyboardInterrupt (the latter is a BaseException, but the principle of catching specific exceptions still applies).
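
As a small illustration of the failure mode described here, the sketch below uses the standard library's `typing.get_type_hints` (not the project's `get_function_type_hints`) on a hypothetical class whose annotation names a type that does not exist at runtime. The class and helper names are invented for this example.

```python
import typing
from typing import Any


class NeedsTypeOnlyImport:
    """Hypothetical class whose annotation names a type that was never imported at runtime."""

    def __init__(self, server: 'TypeOnlyImportedForCheckers') -> None:
        self.server = server


def fully_typed(x: int) -> int:
    return x


def hints_resolve(func: Any) -> bool:
    """Return True if the function's type hints can be evaluated at runtime."""
    try:
        typing.get_type_hints(func)
        return True
    except (NameError, TypeError, AttributeError):
        # Only failures expected from hint resolution are handled;
        # anything else propagates to the caller.
        return False


print(hints_resolve(fully_typed))
print(hints_resolve(NeedsTypeOnlyImport.__init__))
```

The unresolved string annotation surfaces as a `NameError`, so a narrow `except` clause still covers this case without hiding unrelated errors.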

if origin is typing.Union or isinstance(tp, types.UnionType):
args = typing.get_args(tp)
filtered = [fa for a in args if (fa := filter_serializable_type(a)) is not None]
if not filtered: # pragma: no cover — requires union of only non-serializable types
Contributor

This is testable: a Union[Callable[..., Any], TypeVar('T')] (or any union composed entirely of non-serializable types) would hit this branch. A small test for this edge case would be more reliable than a pragma and consistent with the test added for CapabilityWithCallbackParam.


# Other generics (list[X], dict[X, Y]): all args must be serializable
args = typing.get_args(tp)
if args and not all(filter_serializable_type(a) is not None for a in args):
Contributor

Minor: this double-negation (not all(... is not None ...)) is a bit hard to parse. any(filter_serializable_type(a) is None for a in args) reads more directly ("if any arg is non-serializable, reject the whole generic").
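
The equivalence behind this suggestion can be checked with a short sketch. The two helper functions are hypothetical; `results` stands in for the per-argument filter outputs, where `None` marks a non-serializable argument.

```python
def rejects_with_not_all(results: list[object]) -> bool:
    # Original shape: a non-empty argument list where not every result is kept.
    return bool(results) and not all(r is not None for r in results)


def rejects_with_any(results: list[object]) -> bool:
    # Suggested shape: at least one result was filtered out.
    return any(r is None for r in results)


cases = [[], [int], [None], [int, None], [None, None]]
print(all(rejects_with_not_all(c) == rejects_with_any(c) for c in cases))
```

Because `any()` over an empty sequence is already `False`, the `any(...)` form also makes a separate emptiness check on the arguments unnecessary.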

…union

- Narrow except to (NameError, TypeError, AttributeError) instead
  of bare Exception in __init__ hint resolution fallback
- Simplify double negation: `any(... is None ...)` reads clearer
- Remove pragma:no-cover on empty-union branch; add test via
  hooks: Callable | Callable union that filters to empty

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

import inspect
import types # used at runtime in filter_serializable_type
import typing # used at runtime in filter_serializable_type
Collaborator Author

We don't need these comments

registry: Mapping[str, type[Any]],
*,
get_schema_target: Callable[[type[Any]], Any] | None = None,
filter_type_hint: Callable[[Any], Any | None] | None = None,
Collaborator Author

Do we need an arg? I think we could just always do this -- we know that if it's not serializable, we'd fail here right?

self.__post_init__()

@classmethod
def from_spec(
Collaborator Author

Do we still need this method, now that we handle unserializable type hints correctly?

- Remove filter_type_hint parameter from build_schema_types; always
  apply filter_serializable_type (non-serializable types would fail
  schema generation anyway)
- Remove from_spec from MCP; use try/except import fallback for
  MCPServer/FastMCPToolset so __init__ hints resolve on slim installs
- Remove unnecessary import comments

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
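
The try/except import fallback mentioned in this commit message follows a common pattern, sketched below under assumptions. The module name `optional_extra_that_is_not_installed` and the names `ServerType` and `UsesOptionalType` are made up for this example; they are not the PR's actual imports.

```python
import typing
from typing import Any, Optional

try:
    # Hypothetical optional dependency; the module name is invented for this sketch.
    from optional_extra_that_is_not_installed import ServerType
except ImportError:
    # Fall back to Any so annotations that mention ServerType still resolve at runtime.
    ServerType = Any


class UsesOptionalType:
    def __init__(self, server: Optional[ServerType] = None, url: Optional[str] = None) -> None:
        self.server = server
        self.url = url


hints = typing.get_type_hints(UsesOptionalType.__init__)
print(sorted(hints))
```

Because the name always exists at runtime, hint resolution on `__init__` succeeds with or without the optional package; the cost is that the fallback type carries less detail than the real one.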
Contributor

@devin-ai-integration devin-ai-integration bot left a comment

Devin Review found 1 new potential issue.

View 12 additional findings in Devin Review.

Comment on lines +331 to +341
def _get_schema_target(cls: type[Any]) -> Any:
    # When from_spec is not overridden, it delegates to cls(*args, **kwargs).
    # Use __init__ directly so build_schema_types sees the actual parameter types.
    # Fall back to from_spec if __init__ hints can't be resolved (e.g. TYPE_CHECKING imports).
    if 'from_spec' not in cls.__dict__:
        try:
            get_function_type_hints(cls.__init__)
            return cls.__init__
        except (NameError, TypeError, AttributeError):
            pass
    return cls.from_spec
Contributor

@devin-ai-integration devin-ai-integration bot Mar 27, 2026

🚩 Schema precision varies by Python version and MCP installation status

The combination of _get_schema_target fallback logic (pydantic_ai_slim/pydantic_ai/agent/spec.py:331-341) and the MCP try/except import (pydantic_ai_slim/pydantic_ai/capabilities/mcp.py:15-21) means the generated JSON schema for MCP capabilities will differ depending on the environment:

  • MCP installed (any Python): Full schema with all typed parameters from __init__
  • MCP not installed, Python 3.11+: Schema with Any-typed MCP-specific params (degraded but functional)
  • MCP not installed, Python 3.10: Falls back to from_spec → only literal name form in schema (minimal)

This is by design but worth documenting, as users generating schemas in different environments could get inconsistent results.

DouweM and others added 3 commits March 27, 2026 01:11
…ilities

- Create CapabilitySpec as a subclass of NamedSpec in _spec.py
  (distinguishable from other NamedSpec uses like EvaluatorSpec)
- Type PrefixTools.from_spec capability param as CapabilitySpec
- In model_json_schema_with_capabilities, replace CapabilitySpec $refs
  with the full capability items Union, so nested capability fields
  show the same rich schema as the top-level capabilities array
- Update load_capability_from_nested_spec to accept CapabilitySpec

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
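
Replacing a `$ref` with an inlined schema, as this commit message describes for `CapabilitySpec`, can be sketched as a recursive walk over the schema dictionary. The helper `replace_ref` and the sample schema are hypothetical and only illustrate the general technique.

```python
from typing import Any


def replace_ref(schema: Any, ref: str, replacement: dict[str, Any]) -> Any:
    """Return a copy of schema with every {'$ref': ref} node swapped for replacement."""
    if isinstance(schema, dict):
        if schema.get('$ref') == ref:
            return replacement
        return {key: replace_ref(value, ref, replacement) for key, value in schema.items()}
    if isinstance(schema, list):
        return [replace_ref(item, ref, replacement) for item in schema]
    return schema


# Hypothetical nested field that refers to a generic capability spec.
schema = {
    'properties': {
        'capability': {'$ref': '#/$defs/CapabilitySpec'},
        'prefix': {'type': 'string'},
    }
}
capability_union = {'anyOf': [{'const': 'Thinking'}, {'type': 'object'}]}

result = replace_ref(schema, '#/$defs/CapabilitySpec', capability_union)
print(result['properties']['capability'])
```

Using a dedicated subclass for the reference target keeps the substitution from touching other uses of the generic named-spec schema.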
The schema varies on slim installs where MCPServer/FastMCPToolset
resolve to Any instead of their actual types.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The schema now shows all known ModelSettings fields (max_tokens,
temperature, thinking, etc.) with proper types for editor
autocompletion, while still allowing additional provider-specific
properties.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@github-actions github-actions bot added the label size: L (Large PR, 501-1500 weighted lines) and removed the label size: M (Medium PR, 101-500 weighted lines) on Mar 27, 2026
devin-ai-integration[bot]

This comment was marked as resolved.

devin-ai-integration[bot]

This comment was marked as resolved.

DouweM and others added 2 commits March 27, 2026 02:41
coolname (prefect dependency) uses codecs.open() which is deprecated
in Python 3.14. Combined with filterwarnings = ["error"], this causes
test_prefect.py collection to fail.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
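
The interaction between a blanket "error" warning filter and a deprecation in a dependency can be reproduced with the standard `warnings` module. The sketch below is a generic illustration: the commit message does not show the filter that was actually added, so the targeted ignore filter here is an assumption, and `legacy_call` is a hypothetical stand-in for the dependency.

```python
import warnings


def legacy_call() -> str:
    """Hypothetical stand-in for a dependency that emits a DeprecationWarning."""
    warnings.warn('codecs.open() is deprecated', DeprecationWarning, stacklevel=2)
    return 'ok'


def call_with_filters(ignore_known: bool) -> str:
    with warnings.catch_warnings():
        # Comparable in spirit to filterwarnings = ["error"].
        warnings.simplefilter('error')
        if ignore_known:
            # Added later, so it takes precedence over the blanket "error" filter.
            warnings.filterwarnings(
                'ignore',
                message=r'codecs\.open\(\) is deprecated',
                category=DeprecationWarning,
            )
        try:
            return legacy_call()
        except DeprecationWarning:
            return 'raised'


print(call_with_filters(ignore_known=False))  # raised
print(call_with_filters(ignore_known=True))  # ok
```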
_SerializedNamedSpec.to_named_spec() always returned NamedSpec, so the
deserialize validator's fallback path produced NamedSpec instances even
when validating into a CapabilitySpec field. Pass the target cls to
to_named_spec() so subclasses are preserved.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
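
The subclass-preservation fix described in this commit message can be sketched with plain dataclasses. All class names below are hypothetical stand-ins (suffixed `Sketch`), and the field names are assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class NamedSpecSketch:
    name: str
    arguments: dict[str, Any] = field(default_factory=dict)


class CapabilitySpecSketch(NamedSpecSketch):
    """Subclass used only to tell capability specs apart from other named specs."""


@dataclass
class SerializedSketch:
    name: str
    arguments: dict[str, Any]

    def to_named_spec(self, cls: type[NamedSpecSketch] = NamedSpecSketch) -> NamedSpecSketch:
        # Build the requested class instead of hard-coding the base class.
        return cls(name=self.name, arguments=self.arguments)


serialized = SerializedSketch('Thinking', {'effort': 'high'})
print(type(serialized.to_named_spec()).__name__)
print(type(serialized.to_named_spec(CapabilitySpecSketch)).__name__)
```

Passing the target class through means a validator working on a subclass-typed field gets back an instance of that subclass rather than the base type.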
@DouweM DouweM enabled auto-merge (squash) March 27, 2026 02:51
DouweM and others added 3 commits March 27, 2026 02:53
The except ImportError branch only runs when mcp is not installed;
coverage runs with all extras.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Export AgentSpec and ModelRequestContext from pydantic_ai
- Fix all deep imports in docs to use shortest paths:
  - from pydantic_ai import AgentSpec/ModelSettings/RunContext/
    ToolDefinition/AgentStreamEvent/ModelRequestContext
  - from pydantic_ai.capabilities import Hooks/HookTimeoutError
- Fix pragma: no cover → pragma: no branch on mcp optional import
  (the lines are covered on slim, only the branch is not)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The except block is covered on slim runs (no mcp) but not on full
runs. pragma: lax no cover is the established pattern for lines
that are conditionally covered across different CI configurations.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@DouweM DouweM merged commit 67b9726 into main Mar 27, 2026
68 of 71 checks passed
@DouweM DouweM deleted the agentspec-capabilities-schema branch March 27, 2026 03:21
Labels

bug (Report that something isn't working, or PR implementing a fix), size: L (Large PR, 501-1500 weighted lines)