Fix(embeddings): Wrap each separate input in a Content+Part to fix batching by yorickvP · Pull Request #4873 · pydantic/pydantic-ai

yorickvP · 2026-03-27T13:32:07Z

gemini-embedding-2-preview was interpreting an array as a single multi-part embedding request, causing only one embedding to be returned.

This commit wraps each separate input in a Content+Part object to fix that issue.

Closes Bug when using gemini-embedding-2-preview #4872

Pre-Review Checklist

Any AI generated code has been reviewed line-by-line by the human PR author, who stands by it.
No breaking changes in accordance with the version policy.
Linting and type checking pass per make format and make typecheck.
PR title is fit for the release changelog.

Pre-Merge Checklist

New tests for any fix or new behavior, maintaining 100% coverage.
Updated documentation for new features and behaviors, including docstrings for API docs.

gemini-embedding-2-preview was interpreting an array as a single multi-part embedding request, causing only one embedding to be returned. This commit wraps each separate input in a Content+Part object to fix that issue.

devin-ai-integration

Devin Review found 1 potential issue.

devin-ai-integration · 2026-03-27T13:35:00Z

pydantic_ai_slim/pydantic_ai/embeddings/google.py

            title=settings.get('google_title'),
        )

+        contents: ContentListUnion = [Content(parts=[Part(text=text)]) for text in inputs]


🚩 Content role defaults to None instead of 'user'

The old code passed raw strings (list[str]) which the Google SDK internally converted to Content objects with role='user'. The new code at line 166 creates Content(parts=[Part(text=text)]) without specifying role, which defaults to role=None. I verified this by inspecting the SDK: Content(parts=[Part(text='hello')]) yields role=None.

The existing VCR cassette (tests/cassettes/test_embeddings/TestGoogle.test_query.yaml:22) shows role: user in the recorded request body. However, VCR matching is configured in tests/conftest.py to only match on method and path (not body), so tests still pass.

For the embedding API specifically, the role field is semantically irrelevant — the API extracts text from parts regardless of role. This is not a bug, but if strict request parity with the old behavior is desired, role='user' could be added explicitly. The cassettes should ideally be re-recorded to reflect the actual new request format.

Was this helpful? React with 👍 or 👎 to provide feedback.

Fix(embeddings): Wrap each separate input in a Content+Part

88d1e29

gemini-embedding-2-preview was interpreting an array as a single multi-part embedding request, causing only one embedding to be returned. This commit wraps each separate input in a Content+Part object to fix that issue.

github-actions bot added size: S Small PR (≤100 weighted lines) bug Report that something isn't working, or PR implementing a fix labels Mar 27, 2026

devin-ai-integration bot reviewed Mar 27, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fix(embeddings): Wrap each separate input in a Content+Part to fix batching#4873

Fix(embeddings): Wrap each separate input in a Content+Part to fix batching#4873
yorickvP wants to merge 1 commit intopydantic:mainfrom
datakami:embed-proper-contentlistunion

yorickvP commented Mar 27, 2026 •

edited

Loading

Uh oh!

devin-ai-integration bot left a comment

Uh oh!

devin-ai-integration bot Mar 27, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

yorickvP commented Mar 27, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Pre-Review Checklist

Pre-Merge Checklist

Uh oh!

devin-ai-integration bot left a comment

Choose a reason for hiding this comment

Uh oh!

devin-ai-integration bot Mar 27, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

yorickvP commented Mar 27, 2026 •

edited

Loading