fix(core): use parsed mime_type for base64 file blocks in openai translator #36940

Open
Anmol Jaiswal (anmolg1997) wants to merge 1 commit into langchain-ai:master from anmolg1997:fix/openai-file-block-mime-type

Conversation

@anmolg1997

Fixes #36939.

The file branch of _convert_openai_format_to_data_block hard-codes mime_type="application/pdf", while the image branch right above it uses parsed["mime_type"] from the data URI. So a CSV sent via the OpenAI file block shape comes out with mime_type="application/pdf" in the v1 content block.

One-line change to read it off the parsed data URI, same as the image branch. _parse_data_uri returns None when the mime_type is missing, so parsed["mime_type"] is always set inside this branch.

Test added with a CSV and a text/plain data URI. Existing tests still pass since they use data:application/pdf;....
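
To illustrate the behavior described above, here is a minimal standalone sketch, not the actual langchain-core code. The names `parse_data_uri_sketch` and `file_block_to_v1_sketch` and the exact shape of the returned dict are assumptions made for this example; only the idea (read the MIME type off the parsed data URI instead of hard-coding `application/pdf`) comes from the PR.

```python
import re
from typing import Optional

# Hypothetical stand-in for a data-URI parser; requires a non-empty MIME type.
_DATA_URI = re.compile(r"^data:(?P<mime_type>[^;,]+);base64,(?P<data>.+)$")


def parse_data_uri_sketch(uri: str) -> Optional[dict]:
    """Return {"mime_type", "data"} or None if the URI has no MIME type."""
    match = _DATA_URI.match(uri)
    if match is None:
        return None
    return {"mime_type": match.group("mime_type"), "data": match.group("data")}


def file_block_to_v1_sketch(block: dict) -> Optional[dict]:
    """Convert an OpenAI-style file block into an assumed v1-style file block."""
    parsed = parse_data_uri_sketch(block["file"]["file_data"])
    if parsed is None:
        return None
    return {
        "type": "file",
        "base64": parsed["data"],
        # Before the fix this value was hard-coded to "application/pdf".
        "mime_type": parsed["mime_type"],
    }


csv_block = {
    "type": "file",
    "file": {"filename": "data.csv", "file_data": "data:text/csv;base64,YSxiCjEsMg=="},
}
converted = file_block_to_v1_sketch(csv_block)
```

With the hard-coded value, `converted["mime_type"]` would have been `application/pdf` for this CSV input; reading it from the parsed URI preserves `text/csv`.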


@github-actions github-actions Bot closed this Apr 22, 2026
@github-actions github-actions Bot reopened this Apr 26, 2026
@ccurme ccurme (ccurme) changed the title from "core[patch]: use parsed mime_type for base64 file blocks in openai translator" to "fix(core): use parsed mime_type for base64 file blocks in openai translator" Apr 26, 2026
@github-actions github-actions Bot added the fix For PRs that implement a fix label Apr 26, 2026
@ccurme
Collaborator

Closing pending discussion on the issue.

@codspeed-hq

codspeed-hq Bot commented Apr 29, 2026

Merging this PR will not alter performance

✅ 13 untouched benchmarks
⏩ 2 skipped benchmarks [1]


Comparing anmolg1997:fix/openai-file-block-mime-type (58f0d26) with master (dfb8a61) [2]

Footnotes

  1. 2 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, click here and archive them to remove them from the performance reports.

  2. No successful run was found on master (666dc16) during the generation of this report, so dfb8a61 was used instead as the comparison base. There might be some changes unrelated to this pull request in this report.

@anmolg1997 Anmol Jaiswal (anmolg1997) force-pushed the fix/openai-file-block-mime-type branch from e81839c to 805b11c Compare April 29, 2026 04:51
@github-actions github-actions Bot added size: S 50-199 LOC and removed size: XS < 50 LOC labels Apr 29, 2026
@anmolg1997
Author

Pushed an update following the discussion on #36939.

Kept the original mime_type fix in _convert_openai_format_to_data_block, and added a small guard in convert_to_openai_data_block(api="chat/completions") that raises a clear ValueError for non-PDF base64 file blocks instead of letting OpenAI return a raw 400. Tests cover both behaviors. Responses API path is unchanged.
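
A rough sketch of the guard described above, written as a standalone function and not as the real `convert_to_openai_data_block`. The function name `to_chat_completions_file_sketch`, the input block shape, and the error wording are assumptions for illustration; the PR only states that non-PDF base64 file blocks raise a `ValueError` that points to the Responses API.

```python
def to_chat_completions_file_sketch(block: dict) -> dict:
    """Build a Chat Completions-style file block from an assumed v1-style block.

    Hypothetical helper: fails fast for non-PDF MIME types instead of
    letting the request reach the API and come back as a raw 400.
    """
    mime_type = block.get("mime_type")
    if mime_type != "application/pdf":
        raise ValueError(
            f"Chat Completions file blocks only support application/pdf, "
            f"got {mime_type!r}. Consider using the Responses API instead."
        )
    return {
        "type": "file",
        "file": {"file_data": f"data:{mime_type};base64,{block['base64']}"},
    }


pdf_out = to_chat_completions_file_sketch(
    {"type": "file", "base64": "JVBERi0=", "mime_type": "application/pdf"}
)

try:
    to_chat_completions_file_sketch(
        {"type": "file", "base64": "YSxiCjEsMg==", "mime_type": "text/csv"}
    )
    csv_error = None
except ValueError as exc:
    csv_error = str(exc)
```

Raising locally keeps the failure close to the caller's code and lets the message suggest an alternative, which a generic HTTP 400 cannot do.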

… file blocks

The base64 file branch in `_convert_openai_format_to_data_block` was
hard-coding `mime_type="application/pdf"`, while the image branch right
above used `parsed["mime_type"]`. So a non-PDF data URI (e.g. CSV)
passed via the OpenAI Chat Completions file block shape got silently
relabeled as PDF in the v1 content block, which is wrong for non-OpenAI
chat models that consume v1 blocks via shared `_normalize_messages`.

Changes:

1. Use `parsed["mime_type"]` in the base64 file branch, matching the
   image branch right above it.
2. In `convert_to_openai_data_block(api="chat/completions")`, raise a
   clear `ValueError` when MIME is not `application/pdf`, pointing the
   caller to the Responses API. This keeps Chat Completions semantics
   intact and fails fast with a friendlier error than OpenAI's raw 400.
3. Regression tests for both behaviors.

Fixes langchain-ai#36939
@anmolg1997 Anmol Jaiswal (anmolg1997) force-pushed the fix/openai-file-block-mime-type branch from 805b11c to 58f0d26 Compare April 29, 2026 04:53
Labels

core (`langchain-core` package issues & PRs), external, fix (For PRs that implement a fix), new-contributor, size: S (50-199 LOC)

Development

Successfully merging this pull request may close these issues.

core: _convert_openai_format_to_data_block hard-codes mime_type on base64 file blocks

3 participants