22 commits
e37b0bc: warn about azure openai completions file incompatibility (dsfaccini, Jan 20, 2026)
91808ff: fix exampels (dsfaccini, Jan 20, 2026)
7cdf685: move example over to azure (dsfaccini, Jan 23, 2026)
e6d4f06: fix link (dsfaccini, Jan 23, 2026)
9e0a102: add test for coverage (dsfaccini, Jan 23, 2026)
5c45e72: add test for coverage (dsfaccini, Jan 23, 2026)
cb6aa6b: coverage (dsfaccini, Jan 25, 2026)
17c6c25: fix test (dsfaccini, Jan 25, 2026)
63d7e6d: coverage (dsfaccini, Jan 25, 2026)
91b8337: Merge branch 'main' into review-azure-file-support (dsfaccini, Jan 26, 2026)
a20068c: Merge branch 'main' into review-azure-file-support (dsfaccini, Jan 27, 2026)
098f5ec: Merge branch 'main' into review-azure-file-support (dsfaccini, Feb 5, 2026)
584a99d: Address review: rename file→document, fix docs (dsfaccini, Feb 15, 2026)
61e4324: Re-record test_yaml_document_url_input cassette (dsfaccini, Feb 15, 2026)
67c8371: Merge branch 'main' into review-azure-file-support (dsfaccini, Feb 15, 2026)
caec194: Add SSRF fixture to test_yaml_document_url_input and re-record cassette (dsfaccini, Feb 15, 2026)
ee4a419: Add tests for DocumentUrl path in document input not supported error (dsfaccini, Feb 15, 2026)
b663951: address review feedback: remove symlink, fix docs table, update model… (dsfaccini, Feb 27, 2026)
aff00f3: Merge remote-tracking branch 'upstream/main' into review-azure-file-s… (dsfaccini, Mar 19, 2026)
22d356d: Update docs/models/openai.md (dsfaccini, Mar 19, 2026)
a10f9f1: remove unnecessary backticks (dsfaccini, Mar 19, 2026)
f882fe8: prepush review (dsfaccini, Mar 26, 2026)
8 changes: 6 additions & 2 deletions docs/input.md
@@ -110,7 +110,7 @@ Support for file URLs varies depending on type and provider:

| Model | Send URL directly | Download and send bytes | Unsupported |
|-------|-------------------|-------------------------|-------------|
| [`OpenAIChatModel`][pydantic_ai.models.openai.OpenAIChatModel] | `ImageUrl` | `AudioUrl`, `DocumentUrl` | `VideoUrl` |
| [`OpenAIChatModel`][pydantic_ai.models.openai.OpenAIChatModel] | `ImageUrl` | `AudioUrl`, `DocumentUrl`* | `VideoUrl` |
| [`OpenAIResponsesModel`][pydantic_ai.models.openai.OpenAIResponsesModel] | `ImageUrl`, `AudioUrl`, `DocumentUrl` | — | `VideoUrl` |
| [`AnthropicModel`][pydantic_ai.models.anthropic.AnthropicModel] | `ImageUrl`, `DocumentUrl` (PDF) | `DocumentUrl` (`text/plain`) | `AudioUrl`, `VideoUrl` |
| [`GoogleModel`][pydantic_ai.models.google.GoogleModel] (Vertex) | All URL types | — | — |
@@ -120,7 +120,11 @@ Support for file URLs varies depending on type and provider:
| [`BedrockConverseModel`][pydantic_ai.models.bedrock.BedrockConverseModel] | S3 URLs (`s3://`) | `ImageUrl`, `DocumentUrl`, `VideoUrl` | `AudioUrl` |
| [`OpenRouterModel`][pydantic_ai.models.openrouter.OpenRouterModel] | `ImageUrl`, `DocumentUrl` | `AudioUrl` | `VideoUrl` |

A model API may be unable to download a file (e.g., because of crawling or access restrictions) even if it supports file URLs. For example, [`GoogleModel`][pydantic_ai.models.google.GoogleModel] on Vertex AI limits YouTube video URLs to one URL per request. In such cases, you can instruct Pydantic AI to download the file content locally and send that instead of the URL by setting `force_download` on the URL object:
*Not supported with `AzureProvider`. Use [`OpenAIResponsesModel` with `AzureProvider`](models/openai.md#using-azure-with-the-responses-api) instead.

A model API may be unable to download a file (e.g., because of crawling or access restrictions) even if it supports file URLs. For example, [`GoogleModel`][pydantic_ai.models.google.GoogleModel] on Vertex AI limits YouTube video URLs to one URL per request.

In such cases, you can instruct Pydantic AI to download the file content locally and send that instead of the URL by setting `force_download` on the URL object:

```py {title="force_download.py" test="skip" lint="skip"}
from pydantic_ai import ImageUrl, AudioUrl, VideoUrl, DocumentUrl
26 changes: 26 additions & 0 deletions docs/models/openai.md
@@ -427,6 +427,32 @@ agent = Agent(model)
...
```

#### Using Azure with the Responses API

Azure AI Foundry also supports the OpenAI Responses API through [`OpenAIResponsesModel`][pydantic_ai.models.openai.OpenAIResponsesModel]. This is particularly recommended when working with document inputs (`DocumentUrl` and `BinaryContent`), as Azure's Chat Completions API does not support these input types.

??? example "Document processing with Azure using Responses API"
```python
from pydantic_ai import Agent, BinaryContent
from pydantic_ai.models.openai import OpenAIResponsesModel
from pydantic_ai.providers.azure import AzureProvider

pdf_bytes = b'%PDF-1.4 ...' # Your PDF content

model = OpenAIResponsesModel(
'gpt-5',
provider=AzureProvider(
azure_endpoint='your-azure-endpoint',
api_version='your-api-version',
),
)
agent = Agent(model)
result = agent.run_sync([
'Summarize this document',
BinaryContent(data=pdf_bytes, media_type='application/pdf'),
])
```

### Vercel AI Gateway

To use [Vercel's AI Gateway](https://vercel.com/docs/ai-gateway), first follow the [documentation](https://vercel.com/docs/ai-gateway) instructions on obtaining an API key or OIDC token.
1 change: 1 addition & 0 deletions learnings
18 changes: 16 additions & 2 deletions pydantic_ai_slim/pydantic_ai/models/openai.py
Contributor:
🚩 Potential bypass of document support check when openai_chat_supports_file_urls=True

In `_map_document_url_item` at `pydantic_ai_slim/pydantic_ai/models/openai.py:1245`, the first branch checks `not item.force_download and profile.openai_chat_supports_file_urls` and returns a `File` content part directly without checking `openai_chat_supports_document_input`. If a provider were configured with `openai_chat_supports_file_urls=True` and `openai_chat_supports_document_input=False`, the document support check would be bypassed. Currently no provider has this combination (only OpenRouter sets `openai_chat_supports_file_urls=True`, and it supports documents), so this is not a practical issue today, but it's a latent inconsistency that could matter if a new provider is added with this combination.


Collaborator (author):

IMO the `openai_chat_supports_file_urls` flag inherently implies that the provider supports documents, so I don't think we need to explicitly handle that combination.

For future reference: if we find more inconsistencies like this, a single dataclass that uses `property` to validate flag combinations would be a better approach than checking multiple flags, since adding another check branch here would bloat the code further.
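The refactor suggested above can be sketched in plain Python. The class and flag names here are illustrative, not Pydantic AI's actual profile API; the point is that the "file-URL support implies document support" invariant lives in one place:

```python
from dataclasses import dataclass


@dataclass
class ChatCapabilities:
    """Illustrative capability profile; not Pydantic AI's real API."""

    supports_file_urls: bool = False
    supports_document_input: bool = True

    def __post_init__(self) -> None:
        # File-URL support inherently implies document support, so reject
        # the contradictory combination once, at construction time, instead
        # of re-checking flag combinations in every request-mapping branch.
        if self.supports_file_urls and not self.supports_document_input:
            raise ValueError(
                'supports_file_urls=True requires supports_document_input=True'
            )


ChatCapabilities(supports_file_urls=True)  # fine: document support implied

try:
    ChatCapabilities(supports_file_urls=True, supports_document_input=False)
except ValueError as exc:
    print(f'rejected: {exc}')
```

A `property` variant would instead derive the effective value (e.g. `self.supports_document_input or self.supports_file_urls`) rather than raising; either way, callers no longer need to remember which flag implies which.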

@@ -13,7 +13,7 @@

from pydantic import BaseModel, TypeAdapter, ValidationError
from pydantic_core import to_json
from typing_extensions import assert_never, deprecated
from typing_extensions import Never, assert_never, deprecated

from .. import ModelAPIError, ModelHTTPError, UnexpectedModelBehavior, _utils, usage
from .._output import DEFAULT_OUTPUT_TOOL_NAME, OutputObjectDefinition
@@ -1184,6 +1184,8 @@ async def _map_user_prompt(self, part: UserPromptPart) -> chat.ChatCompletionUse
audio = InputAudio(data=item.base64, format=item.format)
content.append(ChatCompletionContentPartInputAudioParam(input_audio=audio, type='input_audio'))
elif item.is_document:
if not profile.openai_chat_supports_document_input:
self._raise_document_input_not_supported_error()
content.append(
File(
file=FileFile(
@@ -1227,6 +1229,8 @@ async def _map_user_prompt(self, part: UserPromptPart) -> chat.ChatCompletionUse
)
)
else:
if not profile.openai_chat_supports_document_input:
self._raise_document_input_not_supported_error()
downloaded_item = await download_item(item, data_format='base64_uri', type_format='extension')
content.append(
File(
@@ -1237,7 +1241,7 @@ async def _map_user_prompt(self, part: UserPromptPart) -> chat.ChatCompletionUse
type='file',
)
)
elif isinstance(item, VideoUrl): # pragma: no cover
elif isinstance(item, VideoUrl):
raise NotImplementedError('VideoUrl is not supported for OpenAI')
elif isinstance(item, CachePoint):
# OpenAI doesn't support prompt caching via CachePoint, so we filter it out
@@ -1246,6 +1250,16 @@ async def _map_user_prompt(self, part: UserPromptPart) -> chat.ChatCompletionUse
assert_never(item)
return chat.ChatCompletionUserMessageParam(role='user', content=content)

def _raise_document_input_not_supported_error(self) -> Never:
if self._provider.name == 'azure':
raise UserError(
"Azure's Chat Completions API does not support document input. "
'Use `OpenAIResponsesModel` with `AzureProvider` instead.'
)
raise UserError(
f'The {self._provider.name!r} provider does not support document input via the Chat Completions API.'
)

@staticmethod
def _is_text_like_media_type(media_type: str) -> bool:
return (
6 changes: 6 additions & 0 deletions pydantic_ai_slim/pydantic_ai/profiles/openai.py
@@ -118,6 +118,12 @@ class OpenAIModelProfile(ModelProfile):
See https://github.com/pydantic/pydantic-ai/issues/3245 for more details.
"""

openai_chat_supports_document_input: bool = True
"""Whether the Chat Completions API supports document content parts (type='file').

Some OpenAI-compatible providers (e.g. Azure) do not support document input via the Chat Completions API.
"""

def __post_init__(self): # pragma: no cover
if not self.openai_supports_sampling_settings:
warnings.warn(
9 changes: 7 additions & 2 deletions pydantic_ai_slim/pydantic_ai/providers/azure.py
@@ -67,10 +67,15 @@ def model_profile(self, model_name: str) -> ModelProfile | None:

# As AzureProvider is always used with OpenAIChatModel, which used to unconditionally use OpenAIJsonSchemaTransformer,
# we need to maintain that behavior unless json_schema_transformer is set explicitly
return OpenAIModelProfile(json_schema_transformer=OpenAIJsonSchemaTransformer).update(profile)
# Azure Chat Completions API doesn't support document input
return OpenAIModelProfile(
json_schema_transformer=OpenAIJsonSchemaTransformer,
openai_chat_supports_document_input=False,
).update(profile)

# OpenAI models are unprefixed
return openai_model_profile(model_name)
# Azure Chat Completions API doesn't support document input
return OpenAIModelProfile(openai_chat_supports_document_input=False).update(openai_model_profile(model_name))

@overload
def __init__(self, *, openai_client: AsyncAzureOpenAI) -> None: ...
@@ -0,0 +1,95 @@
interactions:
- request:
headers:
accept:
- application/json
accept-encoding:
- gzip, deflate, br
connection:
- keep-alive
content-length:
- '279'
content-type:
- application/json
host:
- api.openai.com
method: POST
parsed_body:
messages:
- content:
- text: What does this YAML describe?
type: text
- text: |-
-----BEGIN FILE id="a5bdf9" type="application/x-yaml"-----
name: test
version: 1.0.0
-----END FILE id="a5bdf9"-----
type: text
role: user
model: gpt-4o
stream: false
uri: https://api.openai.com/v1/chat/completions
response:
headers:
access-control-expose-headers:
- X-Request-ID
alt-svc:
- h3=":443"; ma=86400
connection:
- keep-alive
content-length:
- '1880'
content-type:
- application/json
openai-organization:
- user-grnwlxd1653lxdzp921aoihz
openai-processing-ms:
- '4743'
openai-project:
- proj_FYsIItHHgnSPdHBVMzhNBWGa
openai-version:
- '2020-10-01'
strict-transport-security:
- max-age=31536000; includeSubDomains; preload
transfer-encoding:
- chunked
parsed_body:
choices:
- finish_reason: stop
index: 0
logprobs: null
message:
annotations: []
content: |-
The provided YAML snippet is a basic descriptor for something labeled with the name "test" and a version number "1.0.0". Without additional context or accompanying fields, it's difficult to definitively say what specific application or resource this is describing. In a general sense, such a YAML configuration could be used for various purposes, including but not limited to:

1. **Software/Application:** It could describe a software application or component called "test" with version 1.0.0.
2. **Configuration Management:** It might be a part of a configuration management system for managing different versions of a service or application.
3. **Package Information:** If used in a package management context, it might represent metadata for a package or library.
4. **Service Definition:** It could represent a service or microservice within a larger system.

Each of these interpretations would depend on the broader context in which this YAML file is used. Further fields in the YAML file would provide more specificity about its purpose and functionality.
refusal: null
role: assistant
created: 1769199521
id: chatcmpl-D1Hu5C2mqc2CPw07SQa6U7Ki9PF7X
model: gpt-4o-2024-08-06
object: chat.completion
service_tier: default
system_fingerprint: fp_deacdd5f6f
usage:
completion_tokens: 202
completion_tokens_details:
accepted_prediction_tokens: 0
audio_tokens: 0
reasoning_tokens: 0
rejected_prediction_tokens: 0
prompt_tokens: 57
prompt_tokens_details:
audio_tokens: 0
cached_tokens: 0
total_tokens: 259
status:
code: 200
message: OK
version: 1
@@ -0,0 +1,92 @@
interactions:
- request:
headers:
accept:
- application/json
accept-encoding:
- gzip, deflate, br
connection:
- keep-alive
content-length:
- '308'
content-type:
- application/json
host:
- api.openai.com
method: POST
parsed_body:
messages:
- content:
- text: What type of configuration is this?
type: text
- text: |-
-----BEGIN FILE id="45a391" type="application/yaml"-----
version: "3"
services:
web:
image: nginx
-----END FILE id="45a391"-----
type: text
role: user
model: gpt-4o
stream: false
uri: https://api.openai.com/v1/chat/completions
response:
headers:
access-control-expose-headers:
- X-Request-ID
alt-svc:
- h3=":443"; ma=86400
connection:
- keep-alive
content-length:
- '1218'
content-type:
- application/json
openai-organization:
- user-grnwlxd1653lxdzp921aoihz
openai-processing-ms:
- '1676'
openai-project:
- proj_FYsIItHHgnSPdHBVMzhNBWGa
openai-version:
- '2020-10-01'
strict-transport-security:
- max-age=31536000; includeSubDomains; preload
transfer-encoding:
- chunked
parsed_body:
choices:
- finish_reason: stop
index: 0
logprobs: null
message:
annotations: []
content: The configuration you provided is a YAML file for Docker Compose. Docker Compose is a tool used for defining
and running multi-container Docker applications. In this specific configuration, the YAML file is specifying a
single service called `web`, which uses the `nginx` Docker image. The file starts with specifying the Compose
file version as "3", indicating the format version used for composing the services.
refusal: null
role: assistant
created: 1769190655
id: chatcmpl-D1Fb52cAhS0I5T514KLWFLTvsJHYv
model: gpt-4o-2024-08-06
object: chat.completion
service_tier: default
system_fingerprint: fp_a0e9480a2f
usage:
completion_tokens: 77
completion_tokens_details:
accepted_prediction_tokens: 0
audio_tokens: 0
reasoning_tokens: 0
rejected_prediction_tokens: 0
prompt_tokens: 55
prompt_tokens_details:
audio_tokens: 0
cached_tokens: 0
total_tokens: 132
status:
code: 200
message: OK
version: 1