From 256aa2f8fb9d9231bf4e8baa1db8c512708ee436 Mon Sep 17 00:00:00 2001 From: hetzner Date: Wed, 25 Mar 2026 21:36:49 +0000 Subject: [PATCH 1/3] plan: Google structured output + tool combination for Gemini 3 Covers #4801 (NativeOutput + function tools) and #4788 (function + builtin tools). Co-Authored-By: Claude Opus 4.6 (1M context) --- PLAN.md | 192 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 192 insertions(+) create mode 100644 PLAN.md diff --git a/PLAN.md b/PLAN.md new file mode 100644 index 0000000000..c349f0da8f --- /dev/null +++ b/PLAN.md @@ -0,0 +1,192 @@ +# Plan: Google Structured Output + Tool Combination (Gemini 3) + +## Context + +**Issues**: [#4801](https://github.com/pydantic/pydantic-ai/issues/4801) (NativeOutput + function tools), [#4788](https://github.com/pydantic/pydantic-ai/issues/4788) (function + builtin tools) +**Linked by**: DouweM (maintainer) + +Google Gemini 3 supports [structured outputs with all tools](https://ai.google.dev/gemini-api/docs/structured-output?example=recipe#structured_outputs_with_tools) including function calling, and [combining builtin tools with function calling](https://ai.google.dev/gemini-api/docs/tool-combination). Currently pydantic-ai blocks both combinations unconditionally. + +**Comparison with other providers**: +- **OpenAI**: Allows NativeOutput + function tools. No restrictions. +- **Anthropic**: Allows NativeOutput + function tools (restricts only when thinking + output tools). +- **Google**: Currently blocks both combinations for all models. Needs version-gating for Gemini 3. + +## Implementation + +### 1. Profile: add two capability flags + +**File**: `pydantic_ai_slim/pydantic_ai/profiles/google.py` + +Add to `GoogleModelProfile`: +```python +google_supports_native_output_with_function_tools: bool = False +google_supports_function_tools_with_builtin_tools: bool = False +``` + +In `google_model_profile()`, set both to `is_3_or_newer`. 
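A minimal sketch of the two-flag profile as proposed in this revision of the plan, assuming a plain dataclass in place of the real `GoogleModelProfile`. `SketchGoogleProfile`, `sketch_google_model_profile`, and the regex-based version check are hypothetical stand-ins for illustration; the actual `is_3_or_newer` logic in `profiles/google.py` may be computed differently.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class SketchGoogleProfile:
    """Hypothetical stand-in for GoogleModelProfile; only the two new flags are modeled."""

    google_supports_native_output_with_function_tools: bool = False
    google_supports_function_tools_with_builtin_tools: bool = False


def sketch_google_model_profile(model_name: str) -> SketchGoogleProfile:
    """Set both flags from a single version check, as the plan describes."""
    # Assumption: the major version can be read from a `gemini-<major>` prefix.
    match = re.match(r'gemini-(\d+)', model_name)
    is_3_or_newer = match is not None and int(match.group(1)) >= 3
    return SketchGoogleProfile(
        google_supports_native_output_with_function_tools=is_3_or_newer,
        google_supports_function_tools_with_builtin_tools=is_3_or_newer,
    )
```

Both flags default to `False`, so any model that the version check does not recognise keeps the existing restrictions.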
+ +Two separate flags (vs one combined) follows the existing naming convention (`google_supports_native_output_with_builtin_tools`) and allows independent control. + +### 2. Model: gate restrictions on profile flags + +**File**: `pydantic_ai_slim/pydantic_ai/models/google.py` + +#### 2a. `_get_tools` (line 444-446) - function tools + builtin tools + +Current: unconditional `raise UserError(...)` when both present. +Change: read `google_supports_function_tools_with_builtin_tools` from profile; only error if False. + +```python +if model_request_parameters.builtin_tools: + if model_request_parameters.function_tools: + if not GoogleModelProfile.from_profile(self.profile).google_supports_function_tools_with_builtin_tools: + raise UserError('Google does not support function tools and built-in tools at the same time.') +``` + +When the flag is True, the code proceeds to append builtin tool dicts to the same `tools` list that already has function tool dicts. Both types become separate `ToolDict` entries in the list. + +#### 2b. `_build_content_and_config` (line 537-541) - NativeOutput + function tools + +Current: unconditional `raise UserError(...)` when `output_mode == 'native'` and function_tools present. +Change: read `google_supports_native_output_with_function_tools` from profile; only error if False. + +```python +if model_request_parameters.output_mode == 'native': + if model_request_parameters.function_tools: + if not GoogleModelProfile.from_profile(self.profile).google_supports_native_output_with_function_tools: + raise UserError( + 'Google does not support `NativeOutput` and function tools at the same time. Use `output_type=ToolOutput(...)` instead.' + ) + response_mime_type = 'application/json' + ... +``` + +When allowed, `response_mime_type` + `response_json_schema` are set alongside the function tool declarations. The model can call tools when needed and must return structured JSON conforming to the schema for its final text response. + +#### 2c. 
No changes to `prepare_request` + +The existing `prepare_request` (line 286-301) handles `output_tools + builtin_tools` using `google_supports_native_output_with_builtin_tools`. This already correctly resolves auto mode to 'native' for Gemini 3 when builtin_tools + output_tools are present. + +For auto mode + function_tools + builtin_tools on Gemini 3: +1. `prepare_request`: output_tools + builtin_tools → auto converts to 'native' (existing logic) +2. Base class: clears output_tools, populates output_object +3. `_get_tools`: function_tools + builtin_tools → now allowed (fix 2a) +4. `_build_content_and_config`: native + function_tools → now allowed (fix 2b) + +Auto mode WITHOUT builtin_tools resolves to 'tool' (via `default_structured_output_mode='tool'`), so output_tools handle structured output. No change needed. + +### 3. `_get_tool_config` - no changes needed + +When `output_mode='native'`, `allow_text_output=True` (set by base class). `_get_tool_config` returns None (no forced tool calling). The model freely chooses between calling function tools and returning structured text. Correct behavior per Google docs. + +## Edge cases + +| Scenario | Gemini 2 | Gemini 3 | +|---|---|---| +| NativeOutput (no tools) | works | works | +| NativeOutput + builtin_tools | error | works (existing) | +| NativeOutput + function_tools | error | **works (new)** | +| NativeOutput + function + builtin | error | **works (new)** | +| function + builtin (no output type) | error | **works (new)** | +| auto + function + builtin + output_type | error | **works (new, auto->native)** | +| ToolOutput + builtin_tools | error | error (correct, use NativeOutput) | +| ToolOutput + function_tools | works | works (standard tool mode) | + +### 4. SDK investigation: `include_server_side_tool_invocations` + +The [tool combination docs](https://ai.google.dev/gemini-api/docs/tool-combination) say `include_server_side_tool_invocations=True` is required when combining function tools + builtin tools. 
+ +**SDK support**: +- `include_server_side_tool_invocations` was added to `GenerateContentConfig` in **google-genai 1.68.0** ([release](https://github.com/googleapis/python-genai/releases/tag/v1.68.0)) +- Current minimum pin: `>=1.56.0` (in `pydantic_ai_slim/pyproject.toml`) +- Current installed: 1.56.0 + +**Already handled by pydantic-ai** (no changes needed): +- `thought_signature` on `Part` — exists in SDK 1.56.0, pydantic-ai already round-trips it via `provider_details` (`google.py:984-990`, `google.py:1161-1179`) +- `FunctionCall.id` / `FunctionResponse.id` — exists in SDK, already mapped (`google.py:1016`, `google.py:1255-1256`) + +**Approach**: +1. For **#4801** (NativeOutput + function tools): no SDK changes needed. Just lift the restriction. This doesn't involve builtin tools, so `include_server_side_tool_invocations` is irrelevant. +2. For **#4788** (function + builtin tools): test against the live API first WITHOUT setting `include_server_side_tool_invocations`: + - Existing builtin tool tests (Google Search, Code Execution) already work without this flag + - If the API rejects the combination without the flag, add it conditionally: + ```python + # In _build_content_and_config, when both tool types present: + if has_function_tools and has_builtin_tools: + config['include_server_side_tool_invocations'] = True + ``` + Since `GenerateContentConfigDict` is a TypedDict (dict at runtime), this key is silently passed through. On SDK >=1.68.0 it's recognized; on older versions the SDK ignores unknown dict keys when constructing the protobuf. + - If that also fails, we may need to bump the minimum SDK version to `>=1.68.0` for the tool combination feature — but this would be a last resort since it forces upgrades on all users. + +**No minimum version bump** unless API testing proves it necessary. Per AGENTS.md: "Verify provider limitations through testing before implementing workarounds." 
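The conditional fallback in the approach above could look roughly like the following sketch. It assumes a plain `dict` in place of `GenerateContentConfigDict`, and `sketch_apply_tool_combination_flag` is a hypothetical helper name rather than anything in pydantic-ai or google-genai. Whether an older SDK accepts the extra key is exactly what the plan proposes to verify by testing, so this only illustrates the gating condition.

```python
from typing import Any


def sketch_apply_tool_combination_flag(
    config: dict[str, Any],
    *,
    has_function_tools: bool,
    has_builtin_tools: bool,
) -> dict[str, Any]:
    """Return a copy of `config`, adding the flag only when both tool kinds are present."""
    updated = dict(config)  # copy so the caller's config is left untouched
    if has_function_tools and has_builtin_tools:
        updated['include_server_side_tool_invocations'] = True
    return updated
```

Returning a copy keeps the helper free of side effects, which makes the "flag absent unless both tool types are present" behaviour easy to assert in an error-path test.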
+ +## Tests + +### New file: `tests/models/google/test_structured_output.py` + +Following `tests/models/anthropic/` pattern with conftest + VCR cassettes. + +#### Structure +``` +tests/models/google/ + __init__.py + conftest.py # google_model factory fixture + test_structured_output.py + cassettes/ + test_structured_output/ # VCR cassettes +``` + +#### conftest.py +- `google_model` factory: creates `GoogleModel(model_name, provider=GoogleProvider(api_key=...))` (mirrors `tests/models/anthropic/conftest.py`) +- Reuse `gemini_api_key` from `tests/conftest.py`, `google_provider` from `tests/models/test_google.py` + +#### Test cases + +**VCR integration tests** (record against live API): + +1. `test_native_output_with_function_tools` - Gemini 3 + NativeOutput(CityLocation) + function tool that returns data → assert structured output + all_messages snapshot +2. `test_native_output_with_function_tools_stream` - same as above, streaming +3. `test_function_tools_with_builtin_tools` - Gemini 3 + function tool + WebSearchTool → assert response + messages +4. `test_native_output_with_function_and_builtin_tools` - Gemini 3 + NativeOutput + function tool + WebSearchTool → full combo +5. `test_native_output_with_builtin_tools` - Gemini 3 + NativeOutput + WebSearchTool (existing behavior, move from test_google.py) +6. `test_auto_mode_with_function_and_builtin_tools` - Gemini 3 + output_type=SomeModel + function tool + WebSearchTool → verify auto resolves to native + +**Error tests** (no VCR needed, error before API call): + +7. `test_native_output_with_function_tools_unsupported` - Gemini 2 + NativeOutput + function tool → UserError +8. `test_function_tools_with_builtin_tools_unsupported` - Gemini 2 + function + builtin → UserError +9. 
`test_tool_output_with_builtin_tools_error` - Gemini 3 + ToolOutput + builtin → UserError (correct - use NativeOutput) + +### Tests to remove from `tests/models/test_google.py` + +These tests are superseded by the new file: + +| Test in test_google.py | Line | Replaced by | +|---|---|---| +| `test_google_native_output_with_tools` | 2880 | case 7 | +| `test_google_builtin_tools_with_other_tools` | 3279 | cases 8, 9 | +| `test_google_native_output_with_builtin_tools_gemini_3` | 3315 | cases 4, 5, 6 | + +Note: `test_google_native_output` (line 2902) and `test_google_native_output_multiple` (line 2955) test NativeOutput WITHOUT tools - they could be moved too but are not replaced by our new tests. We can defer moving them to keep this PR focused. + +## Verification + +1. `make format && make lint` - style +2. `make typecheck 2>&1 | tee /tmp/typecheck-output.txt` - types +3. Record VCR cassettes: `source .env && uv run pytest tests/models/google/test_structured_output.py --record-mode=once -x -v` +4. Replay: `uv run pytest tests/models/google/test_structured_output.py -x -v` +5. Verify removed tests don't break: `uv run pytest tests/models/test_google.py -x -v` +6. 
Full test suite: `uv run pytest tests/ -x --timeout=60` + +## Files to modify + +- `pydantic_ai_slim/pydantic_ai/profiles/google.py` - add 2 flags +- `pydantic_ai_slim/pydantic_ai/models/google.py` - gate 2 restrictions +- `tests/models/google/__init__.py` - new (empty) +- `tests/models/google/conftest.py` - new (fixtures) +- `tests/models/google/test_structured_output.py` - new (tests) +- `tests/models/test_google.py` - remove 3 superseded tests +- `tests/models/cassettes/test_google/test_google_builtin_tools_with_other_tools.yaml` - delete (superseded cassette) +- `tests/models/cassettes/test_google/test_google_native_output_with_builtin_tools_gemini_3.yaml` - delete (superseded cassette) +- Note: `test_google_native_output_with_tools` has no cassette (errors before API call) From 3659d72ac6e2620428c24abdbcbb6d8fb13bcbbe Mon Sep 17 00:00:00 2001 From: hetzner Date: Thu, 26 Mar 2026 20:37:22 +0000 Subject: [PATCH 2/3] plan: update plan with SDK bump, single flag, and prepare_request changes Incorporates review feedback from DouweM: - Single `google_supports_function_tools_with_builtin_tools` flag - Skip prepare_request workaround for Gemini 3 - Always set `include_server_side_tool_invocations` with builtin tools - Bump google-genai minimum to >=1.68.0 - ToolOutput + builtin_tools now works on Gemini 3 (no workaround) Co-Authored-By: Claude Opus 4.6 (1M context) --- PLAN.md | 149 +++++++++++++++++++++++++++++--------------------------- 1 file changed, 77 insertions(+), 72 deletions(-) diff --git a/PLAN.md b/PLAN.md index c349f0da8f..9c6f3c1f43 100644 --- a/PLAN.md +++ b/PLAN.md @@ -7,6 +7,8 @@ Google Gemini 3 supports [structured outputs with all tools](https://ai.google.dev/gemini-api/docs/structured-output?example=recipe#structured_outputs_with_tools) including function calling, and [combining builtin tools with function calling](https://ai.google.dev/gemini-api/docs/tool-combination). Currently pydantic-ai blocks both combinations unconditionally. 
+**Key concept**: `output_tools` (from `ToolOutput`) are function declarations from the API's perspective — `tool_defs = function_tools + output_tools` (line 651 of `models/__init__.py`). They all become `function_declarations` in the Google API. This means the restriction on function_declarations + builtin_tools affects output_tools equally. + **Comparison with other providers**: - **OpenAI**: Allows NativeOutput + function tools. No restrictions. - **Anthropic**: Allows NativeOutput + function tools (restricts only when thinking + output tools). @@ -14,71 +16,101 @@ Google Gemini 3 supports [structured outputs with all tools](https://ai.google.d ## Implementation -### 1. Profile: add two capability flags +### 1. Profile: add capability flag **File**: `pydantic_ai_slim/pydantic_ai/profiles/google.py` Add to `GoogleModelProfile`: ```python -google_supports_native_output_with_function_tools: bool = False google_supports_function_tools_with_builtin_tools: bool = False ``` -In `google_model_profile()`, set both to `is_3_or_newer`. +Set to `is_3_or_newer` in `google_model_profile()`. + +This single flag covers: +- function_tools + builtin_tools (#4788) +- output_tools + builtin_tools (since output_tools ARE function declarations) +- NativeOutput + function_tools (#4801) — see rationale at 2b -Two separate flags (vs one combined) follows the existing naming convention (`google_supports_native_output_with_builtin_tools`) and allows independent control. +We also keep `google_supports_native_output_with_builtin_tools` for Gemini 2 fallback behavior (prompted vs native workaround when the flag above is False). For Gemini 3, the workaround is skipped entirely. -### 2. Model: gate restrictions on profile flags +### 2. Model: three changes **File**: `pydantic_ai_slim/pydantic_ai/models/google.py` -#### 2a. `_get_tools` (line 444-446) - function tools + builtin tools +#### 2a. 
`prepare_request` (line 286-301) — remove workaround for Gemini 3 + +The existing workaround converts `ToolOutput` to `NativeOutput`/`PromptedOutput` when output_tools + builtin_tools are both present, because older models can't have function_declarations + builtin tools together. + +For Gemini 3, this workaround is unnecessary — output_tools (function declarations) can coexist with builtin tools. Skip the workaround when `google_supports_function_tools_with_builtin_tools` is True: + +```python +def prepare_request(self, ...): + google_profile = GoogleModelProfile.from_profile(self.profile) + if model_request_parameters.builtin_tools and model_request_parameters.output_tools: + if google_profile.google_supports_function_tools_with_builtin_tools: + pass # Gemini 3+: output_tools (function declarations) + builtin tools work fine + elif model_request_parameters.output_mode == 'auto': + output_mode = 'native' if google_profile.google_supports_native_output_with_builtin_tools else 'prompted' + model_request_parameters = replace(model_request_parameters, output_mode=output_mode) + else: + output_mode = 'NativeOutput' if google_profile.google_supports_native_output_with_builtin_tools else 'PromptedOutput' + raise UserError( + f'This model does not support output tools and built-in tools at the same time. Use `output_type={output_mode}(...)` instead.' + ) + return super().prepare_request(model_settings, model_request_parameters) +``` + +#### 2b. `_get_tools` (line 444-446) — function tools + builtin tools -Current: unconditional `raise UserError(...)` when both present. -Change: read `google_supports_function_tools_with_builtin_tools` from profile; only error if False. 
+Gate the restriction on the profile flag: ```python if model_request_parameters.builtin_tools: if model_request_parameters.function_tools: if not GoogleModelProfile.from_profile(self.profile).google_supports_function_tools_with_builtin_tools: - raise UserError('Google does not support function tools and built-in tools at the same time.') + raise UserError( + 'This model does not support function tools and built-in tools at the same time.' + ) ``` -When the flag is True, the code proceeds to append builtin tool dicts to the same `tools` list that already has function tool dicts. Both types become separate `ToolDict` entries in the list. +Error message updated to say "This model" per review feedback. -#### 2b. `_build_content_and_config` (line 537-541) - NativeOutput + function tools +#### 2c. `_build_content_and_config` (line 537-541) — NativeOutput + function tools -Current: unconditional `raise UserError(...)` when `output_mode == 'native'` and function_tools present. -Change: read `google_supports_native_output_with_function_tools` from profile; only error if False. +Gate the restriction on the same flag. Rationale: NativeOutput + function_tools is part of the same Gemini 3 'structured output with tools' capability — if the model supports function_declarations + builtin tools, it supports function_declarations + response_schema. ```python if model_request_parameters.output_mode == 'native': if model_request_parameters.function_tools: - if not GoogleModelProfile.from_profile(self.profile).google_supports_native_output_with_function_tools: + if not GoogleModelProfile.from_profile(self.profile).google_supports_function_tools_with_builtin_tools: raise UserError( - 'Google does not support `NativeOutput` and function tools at the same time. Use `output_type=ToolOutput(...)` instead.' + 'This model does not support `NativeOutput` and function tools at the same time. Use `output_type=ToolOutput(...)` instead.' ) response_mime_type = 'application/json' ... 
``` -When allowed, `response_mime_type` + `response_json_schema` are set alongside the function tool declarations. The model can call tools when needed and must return structured JSON conforming to the schema for its final text response. +### 3. `_get_tool_config` — no changes needed -#### 2c. No changes to `prepare_request` +When `output_mode='native'`, `allow_text_output=True` (set by base class). `_get_tool_config` returns None (no forced tool calling). The model freely chooses between calling function tools and returning structured text. -The existing `prepare_request` (line 286-301) handles `output_tools + builtin_tools` using `google_supports_native_output_with_builtin_tools`. This already correctly resolves auto mode to 'native' for Gemini 3 when builtin_tools + output_tools are present. +### 4. SDK: `include_server_side_tool_invocations` + bump to `>=1.68.0` -For auto mode + function_tools + builtin_tools on Gemini 3: -1. `prepare_request`: output_tools + builtin_tools → auto converts to 'native' (existing logic) -2. Base class: clears output_tools, populates output_object -3. `_get_tools`: function_tools + builtin_tools → now allowed (fix 2a) -4. `_build_content_and_config`: native + function_tools → now allowed (fix 2b) +The [tool combination docs](https://ai.google.dev/gemini-api/docs/tool-combination) require `include_server_side_tool_invocations=True` when using builtin tools. This flag was added in **google-genai 1.68.0**. -Auto mode WITHOUT builtin_tools resolves to 'tool' (via `default_structured_output_mode='tool'`), so output_tools handle structured output. No change needed. +Per DouweM's review: always set this flag when ANY builtin tools are enabled, not just when combined with function tools. This gives us proper `toolCall`/`toolReturn` parts in responses, which we can use to build `BuiltinToolCallPart`/`BuiltinToolReturnPart` instead of reconstructing from `groundingMetadata`. -### 3. 
`_get_tool_config` - no changes needed +**Changes**: +- Bump minimum SDK pin from `>=1.56.0` to `>=1.68.0` in `pydantic_ai_slim/pyproject.toml` +- In `_build_content_and_config`: set `config['include_server_side_tool_invocations'] = True` whenever `model_request_parameters.builtin_tools` is non-empty +- Investigate building `BuiltinToolCallPart`/`BuiltinToolReturnPart` from `toolCall`/`toolReturn` response parts, potentially simplifying existing `groundingMetadata` reconstruction logic + +**Already handled by pydantic-ai** (no changes needed): +- `thought_signature` on `Part` — round-tripped via `provider_details` (`google.py:984-990`, `google.py:1161-1179`) +- `FunctionCall.id` / `FunctionResponse.id` — already mapped (`google.py:1016`, `google.py:1255-1256`) -When `output_mode='native'`, `allow_text_output=True` (set by base class). `_get_tool_config` returns None (no forced tool calling). The model freely chooses between calling function tools and returning structured text. Correct behavior per Google docs. +**Open question**: How much of the existing `groundingMetadata`-based reconstruction can we remove/simplify? This needs investigation during implementation — we should compare the `toolCall`/`toolReturn` response parts against the current reconstruction logic to see what becomes redundant. ## Edge cases @@ -89,38 +121,10 @@ When `output_mode='native'`, `allow_text_output=True` (set by base class). 
`_get | NativeOutput + function_tools | error | **works (new)** | | NativeOutput + function + builtin | error | **works (new)** | | function + builtin (no output type) | error | **works (new)** | -| auto + function + builtin + output_type | error | **works (new, auto->native)** | -| ToolOutput + builtin_tools | error | error (correct, use NativeOutput) | +| auto + function + builtin + output_type | error | **works (new, auto stays 'tool')** | +| ToolOutput + builtin_tools | error (workaround to native/prompted) | **works (new, no workaround)** | | ToolOutput + function_tools | works | works (standard tool mode) | -### 4. SDK investigation: `include_server_side_tool_invocations` - -The [tool combination docs](https://ai.google.dev/gemini-api/docs/tool-combination) say `include_server_side_tool_invocations=True` is required when combining function tools + builtin tools. - -**SDK support**: -- `include_server_side_tool_invocations` was added to `GenerateContentConfig` in **google-genai 1.68.0** ([release](https://github.com/googleapis/python-genai/releases/tag/v1.68.0)) -- Current minimum pin: `>=1.56.0` (in `pydantic_ai_slim/pyproject.toml`) -- Current installed: 1.56.0 - -**Already handled by pydantic-ai** (no changes needed): -- `thought_signature` on `Part` — exists in SDK 1.56.0, pydantic-ai already round-trips it via `provider_details` (`google.py:984-990`, `google.py:1161-1179`) -- `FunctionCall.id` / `FunctionResponse.id` — exists in SDK, already mapped (`google.py:1016`, `google.py:1255-1256`) - -**Approach**: -1. For **#4801** (NativeOutput + function tools): no SDK changes needed. Just lift the restriction. This doesn't involve builtin tools, so `include_server_side_tool_invocations` is irrelevant. -2. 
For **#4788** (function + builtin tools): test against the live API first WITHOUT setting `include_server_side_tool_invocations`: - - Existing builtin tool tests (Google Search, Code Execution) already work without this flag - - If the API rejects the combination without the flag, add it conditionally: - ```python - # In _build_content_and_config, when both tool types present: - if has_function_tools and has_builtin_tools: - config['include_server_side_tool_invocations'] = True - ``` - Since `GenerateContentConfigDict` is a TypedDict (dict at runtime), this key is silently passed through. On SDK >=1.68.0 it's recognized; on older versions the SDK ignores unknown dict keys when constructing the protobuf. - - If that also fails, we may need to bump the minimum SDK version to `>=1.68.0` for the tool combination feature — but this would be a last resort since it forces upgrades on all users. - -**No minimum version bump** unless API testing proves it necessary. Per AGENTS.md: "Verify provider limitations through testing before implementing workarounds." - ## Tests ### New file: `tests/models/google/test_structured_output.py` @@ -145,18 +149,19 @@ tests/models/google/ **VCR integration tests** (record against live API): -1. `test_native_output_with_function_tools` - Gemini 3 + NativeOutput(CityLocation) + function tool that returns data → assert structured output + all_messages snapshot +1. `test_native_output_with_function_tools` - Gemini 3 + NativeOutput(CityLocation) + function tool that returns data -> assert structured output + all_messages snapshot 2. `test_native_output_with_function_tools_stream` - same as above, streaming -3. `test_function_tools_with_builtin_tools` - Gemini 3 + function tool + WebSearchTool → assert response + messages -4. `test_native_output_with_function_and_builtin_tools` - Gemini 3 + NativeOutput + function tool + WebSearchTool → full combo -5. 
`test_native_output_with_builtin_tools` - Gemini 3 + NativeOutput + WebSearchTool (existing behavior, move from test_google.py) -6. `test_auto_mode_with_function_and_builtin_tools` - Gemini 3 + output_type=SomeModel + function tool + WebSearchTool → verify auto resolves to native +3. `test_function_tools_with_builtin_tools` - Gemini 3 + function tool + WebSearchTool -> assert response + messages +4. `test_native_output_with_function_and_builtin_tools` - Gemini 3 + NativeOutput + function tool + WebSearchTool -> full combo +5. `test_native_output_with_builtin_tools` - Gemini 3 + NativeOutput + WebSearchTool (move from test_google.py) +6. `test_tool_output_with_builtin_tools` - Gemini 3 + ToolOutput + WebSearchTool -> works now (no workaround) +7. `test_auto_mode_with_function_and_builtin_tools` - Gemini 3 + output_type=SomeModel + function tool + WebSearchTool -> verify auto stays 'tool' mode **Error tests** (no VCR needed, error before API call): -7. `test_native_output_with_function_tools_unsupported` - Gemini 2 + NativeOutput + function tool → UserError -8. `test_function_tools_with_builtin_tools_unsupported` - Gemini 2 + function + builtin → UserError -9. `test_tool_output_with_builtin_tools_error` - Gemini 3 + ToolOutput + builtin → UserError (correct - use NativeOutput) +8. `test_native_output_with_function_tools_unsupported` - Gemini 2 + NativeOutput + function tool -> UserError +9. `test_function_tools_with_builtin_tools_unsupported` - Gemini 2 + function + builtin -> UserError +10. 
`test_tool_output_with_builtin_tools_unsupported` - Gemini 2 + ToolOutput + builtin -> UserError (workaround suggests NativeOutput/PromptedOutput) ### Tests to remove from `tests/models/test_google.py` @@ -164,11 +169,11 @@ These tests are superseded by the new file: | Test in test_google.py | Line | Replaced by | |---|---|---| -| `test_google_native_output_with_tools` | 2880 | case 7 | -| `test_google_builtin_tools_with_other_tools` | 3279 | cases 8, 9 | +| `test_google_native_output_with_tools` | 2880 | case 8 | +| `test_google_builtin_tools_with_other_tools` | 3279 | cases 9, 10 | | `test_google_native_output_with_builtin_tools_gemini_3` | 3315 | cases 4, 5, 6 | -Note: `test_google_native_output` (line 2902) and `test_google_native_output_multiple` (line 2955) test NativeOutput WITHOUT tools - they could be moved too but are not replaced by our new tests. We can defer moving them to keep this PR focused. +Note: `test_google_native_output` (line 2902) and `test_google_native_output_multiple` (line 2955) test NativeOutput WITHOUT tools - could be moved later to keep this PR focused. 
## Verification @@ -181,12 +186,12 @@ Note: `test_google_native_output` (line 2902) and `test_google_native_output_mul ## Files to modify -- `pydantic_ai_slim/pydantic_ai/profiles/google.py` - add 2 flags -- `pydantic_ai_slim/pydantic_ai/models/google.py` - gate 2 restrictions +- `pydantic_ai_slim/pydantic_ai/profiles/google.py` - add flag +- `pydantic_ai_slim/pydantic_ai/models/google.py` - update `prepare_request`, gate `_get_tools`, gate `_build_content_and_config`, set `include_server_side_tool_invocations` +- `pydantic_ai_slim/pyproject.toml` - bump google-genai minimum to `>=1.68.0` - `tests/models/google/__init__.py` - new (empty) - `tests/models/google/conftest.py` - new (fixtures) - `tests/models/google/test_structured_output.py` - new (tests) - `tests/models/test_google.py` - remove 3 superseded tests -- `tests/models/cassettes/test_google/test_google_builtin_tools_with_other_tools.yaml` - delete (superseded cassette) -- `tests/models/cassettes/test_google/test_google_native_output_with_builtin_tools_gemini_3.yaml` - delete (superseded cassette) -- Note: `test_google_native_output_with_tools` has no cassette (errors before API call) +- `tests/models/cassettes/test_google/test_google_builtin_tools_with_other_tools.yaml` - delete +- `tests/models/cassettes/test_google/test_google_native_output_with_builtin_tools_gemini_3.yaml` - delete From b47e962a472636d9dcf99307a58e1ca3dade8679 Mon Sep 17 00:00:00 2001 From: hetzner Date: Thu, 26 Mar 2026 21:45:54 +0000 Subject: [PATCH 3/3] plan: address review feedback on plan v2 - Rename flag to `google_supports_tool_combination` (matches Google docs) - Replace line number refs with function/class names - Negate condition instead of `if flag: pass` - Add deferred docs section (output.md, builtin-tools.md, models/google.md) - Clarify edge case table with UserError vs workaround distinction - Expand Context with three distinct API mechanisms Co-Authored-By: Claude Opus 4.6 (1M context) --- PLAN.md | 97 
++++++++++++++++++++++++++++++++++----------------------- 1 file changed, 58 insertions(+), 39 deletions(-) diff --git a/PLAN.md b/PLAN.md index 9c6f3c1f43..4155a53b8f 100644 --- a/PLAN.md +++ b/PLAN.md @@ -7,7 +7,15 @@ Google Gemini 3 supports [structured outputs with all tools](https://ai.google.dev/gemini-api/docs/structured-output?example=recipe#structured_outputs_with_tools) including function calling, and [combining builtin tools with function calling](https://ai.google.dev/gemini-api/docs/tool-combination). Currently pydantic-ai blocks both combinations unconditionally. -**Key concept**: `output_tools` (from `ToolOutput`) are function declarations from the API's perspective — `tool_defs = function_tools + output_tools` (line 651 of `models/__init__.py`). They all become `function_declarations` in the Google API. This means the restriction on function_declarations + builtin_tools affects output_tools equally. +**Three distinct mechanisms in the Google API** (important for understanding the restrictions): + +1. **function_tools**: user-defined `function_declarations` the model can call (e.g. `get_weather`) +2. **output_tools** (from `ToolOutput`): also `function_declarations`, but used for structured output extraction — the model "calls" them to return typed data. Defined via `ModelRequestParameters.tool_defs` (`models/__init__.py`), which merges `function_tools + output_tools`. From the API's perspective, output_tools ARE function declarations. +3. **NativeOutput**: uses `response_schema` + `response_mime_type='application/json'` — the model returns structured JSON directly, no function calling involved. Completely separate from function declarations. + +The restrictions being lifted are: +- `function_declarations` + `builtin_tools` → affects both function_tools AND output_tools (#4788) +- `response_schema` + `function_declarations` → NativeOutput + function_tools (#4801) **Comparison with other providers**: - **OpenAI**: Allows NativeOutput + function tools. 
No restrictions. @@ -22,53 +30,52 @@ Google Gemini 3 supports [structured outputs with all tools](https://ai.google.d Add to `GoogleModelProfile`: ```python -google_supports_function_tools_with_builtin_tools: bool = False +google_supports_tool_combination: bool = False ``` Set to `is_3_or_newer` in `google_model_profile()`. -This single flag covers: -- function_tools + builtin_tools (#4788) -- output_tools + builtin_tools (since output_tools ARE function declarations) -- NativeOutput + function_tools (#4801) — see rationale at 2b +Named after [Google's own "tool combination" feature](https://ai.google.dev/gemini-api/docs/tool-combination). This single flag covers all three restriction lifts because they're all part of the same Gemini 3 capability set: +- function_tools + builtin_tools (#4788) — function declarations can coexist with builtin tools +- output_tools + builtin_tools — output_tools ARE function declarations (see Context above) +- NativeOutput + function_tools (#4801) — response_schema can coexist with function declarations -We also keep `google_supports_native_output_with_builtin_tools` for Gemini 2 fallback behavior (prompted vs native workaround when the flag above is False). For Gemini 3, the workaround is skipped entirely. +We keep the existing `google_supports_native_output_with_builtin_tools` for Gemini 2 fallback behavior (the `prepare_request` workaround that converts ToolOutput to NativeOutput/PromptedOutput when the new flag is False). For Gemini 3, the workaround is skipped entirely. ### 2. Model: three changes **File**: `pydantic_ai_slim/pydantic_ai/models/google.py` -#### 2a. `prepare_request` (line 286-301) — remove workaround for Gemini 3 +#### 2a. `GoogleModel.prepare_request` — skip workaround for Gemini 3 The existing workaround converts `ToolOutput` to `NativeOutput`/`PromptedOutput` when output_tools + builtin_tools are both present, because older models can't have function_declarations + builtin tools together. 
-For Gemini 3, this workaround is unnecessary — output_tools (function declarations) can coexist with builtin tools. Skip the workaround when `google_supports_function_tools_with_builtin_tools` is True: +For Gemini 3, this workaround is unnecessary — output_tools (function declarations) can coexist with builtin tools. Wrap existing logic in a negated check: ```python def prepare_request(self, ...): google_profile = GoogleModelProfile.from_profile(self.profile) if model_request_parameters.builtin_tools and model_request_parameters.output_tools: - if google_profile.google_supports_function_tools_with_builtin_tools: - pass # Gemini 3+: output_tools (function declarations) + builtin tools work fine - elif model_request_parameters.output_mode == 'auto': - output_mode = 'native' if google_profile.google_supports_native_output_with_builtin_tools else 'prompted' - model_request_parameters = replace(model_request_parameters, output_mode=output_mode) - else: - output_mode = 'NativeOutput' if google_profile.google_supports_native_output_with_builtin_tools else 'PromptedOutput' - raise UserError( - f'This model does not support output tools and built-in tools at the same time. Use `output_type={output_mode}(...)` instead.' - ) + if not google_profile.google_supports_tool_combination: + if model_request_parameters.output_mode == 'auto': + output_mode = 'native' if google_profile.google_supports_native_output_with_builtin_tools else 'prompted' + model_request_parameters = replace(model_request_parameters, output_mode=output_mode) + else: + output_mode = 'NativeOutput' if google_profile.google_supports_native_output_with_builtin_tools else 'PromptedOutput' + raise UserError( + f'This model does not support output tools and built-in tools at the same time. Use `output_type={output_mode}(...)` instead.' + ) return super().prepare_request(model_settings, model_request_parameters) ``` -#### 2b. `_get_tools` (line 444-446) — function tools + builtin tools +#### 2b. 
`GoogleModel._get_tools` — function tools + builtin tools Gate the restriction on the profile flag: ```python if model_request_parameters.builtin_tools: if model_request_parameters.function_tools: - if not GoogleModelProfile.from_profile(self.profile).google_supports_function_tools_with_builtin_tools: + if not GoogleModelProfile.from_profile(self.profile).google_supports_tool_combination: raise UserError( 'This model does not support function tools and built-in tools at the same time.' ) @@ -76,16 +83,17 @@ if model_request_parameters.builtin_tools: Error message updated to say "This model" per review feedback. -#### 2c. `_build_content_and_config` (line 537-541) — NativeOutput + function tools +#### 2c. `GoogleModel._build_content_and_config` — NativeOutput + function tools -Gate the restriction on the same flag. Rationale: NativeOutput + function_tools is part of the same Gemini 3 'structured output with tools' capability — if the model supports function_declarations + builtin tools, it supports function_declarations + response_schema. +Gate the restriction on the same flag. Both `response_schema + function_declarations` and `function_declarations + builtin_tools` are part of Gemini 3's tool combination capability: ```python if model_request_parameters.output_mode == 'native': if model_request_parameters.function_tools: - if not GoogleModelProfile.from_profile(self.profile).google_supports_function_tools_with_builtin_tools: + if not GoogleModelProfile.from_profile(self.profile).google_supports_tool_combination: raise UserError( - 'This model does not support `NativeOutput` and function tools at the same time. Use `output_type=ToolOutput(...)` instead.' + 'This model does not support `NativeOutput` and function tools at the same time. ' + 'Use `output_type=ToolOutput(...)` instead.' ) response_mime_type = 'application/json' ... 
@@ -107,22 +115,25 @@ Per DouweM's review: always set this flag when ANY builtin tools are enabled, no - Investigate building `BuiltinToolCallPart`/`BuiltinToolReturnPart` from `toolCall`/`toolReturn` response parts, potentially simplifying existing `groundingMetadata` reconstruction logic **Already handled by pydantic-ai** (no changes needed): -- `thought_signature` on `Part` — round-tripped via `provider_details` (`google.py:984-990`, `google.py:1161-1179`) -- `FunctionCall.id` / `FunctionResponse.id` — already mapped (`google.py:1016`, `google.py:1255-1256`) +- `thought_signature` on `Part` — round-tripped via `provider_details` in `_map_content` and `_map_part_to_content` +- `FunctionCall.id` / `FunctionResponse.id` — already mapped in `_map_content` and `_map_tool_return_part` **Open question**: How much of the existing `groundingMetadata`-based reconstruction can we remove/simplify? This needs investigation during implementation — we should compare the `toolCall`/`toolReturn` response parts against the current reconstruction logic to see what becomes redundant. 
## Edge cases +Error types: **UserError** = pydantic-ai raises before API call; **workaround** = auto mode silently converts output mode + | Scenario | Gemini 2 | Gemini 3 | |---|---|---| | NativeOutput (no tools) | works | works | -| NativeOutput + builtin_tools | error | works (existing) | -| NativeOutput + function_tools | error | **works (new)** | -| NativeOutput + function + builtin | error | **works (new)** | -| function + builtin (no output type) | error | **works (new)** | -| auto + function + builtin + output_type | error | **works (new, auto stays 'tool')** | -| ToolOutput + builtin_tools | error (workaround to native/prompted) | **works (new, no workaround)** | +| NativeOutput + builtin_tools | not blocked by pydantic-ai (untested) | works (existing) | +| NativeOutput + function_tools | UserError in `_build_content_and_config` | **works (new)** | +| NativeOutput + function + builtin | UserError in `_get_tools` (hits function+builtin check first) | **works (new)** | +| function + builtin (no output type) | UserError in `_get_tools` | **works (new)** | +| auto + function + builtin + output_type | UserError in `_get_tools` | **works (new, auto stays 'tool')** | +| ToolOutput + builtin_tools (auto mode) | workaround to native/prompted in `prepare_request` | **works (new, no workaround)** | +| ToolOutput + builtin_tools (explicit) | UserError in `prepare_request` | **works (new, no workaround)** | | ToolOutput + function_tools | works | works (standard tool mode) | ## Tests @@ -167,13 +178,21 @@ tests/models/google/ These tests are superseded by the new file: -| Test in test_google.py | Line | Replaced by | -|---|---|---| -| `test_google_native_output_with_tools` | 2880 | case 8 | -| `test_google_builtin_tools_with_other_tools` | 3279 | cases 9, 10 | -| `test_google_native_output_with_builtin_tools_gemini_3` | 3315 | cases 4, 5, 6 | +| Test in test_google.py | Replaced by | +|---|---| +| `test_google_native_output_with_tools` | case 8 | +| 
`test_google_builtin_tools_with_other_tools` | cases 9, 10 | +| `test_google_native_output_with_builtin_tools_gemini_3` | cases 4, 5, 6 | + +Note: `test_google_native_output` and `test_google_native_output_multiple` test NativeOutput WITHOUT tools — could be moved later to keep this PR focused. + +## Documentation (deferred per PR flow) + +Per our PR flow, docs are deferred until after review confirms logic is correct. These files need updating once logic is finalized: -Note: `test_google_native_output` (line 2902) and `test_google_native_output_multiple` (line 2955) test NativeOutput WITHOUT tools - could be moved later to keep this PR focused. +- `docs/output.md` — remove/update the statement "Gemini cannot use tools at the same time as structured output" +- `docs/builtin-tools.md` — update Google row: "Using built-in tools and function tools (including output tools) at the same time is not supported" is no longer true for Gemini 3+ +- `docs/models/google.md` — add section on structured output + tool combination support for Gemini 3 ## Verification