core[patch]: preserve MIME type on base64 file blocks in openai translator #36937
Closed
Anmol Jaiswal (anmolg1997) wants to merge 1 commit into langchain-ai:master from
…lator

`_convert_openai_format_to_data_block` in the `openai` block translator hard-codes `mime_type="application/pdf"` for every base64 file block. Any non-PDF file (e.g. `data:text/csv;base64,...`, `data:text/plain`, spreadsheets, office docs) is silently relabeled as `application/pdf` on the way into v1 content blocks.

The sibling `image_url` branch right above it already reads `parsed["mime_type"]` off the data URI. This patch does the same for the file branch, so both base64 paths are consistent and the incoming MIME type round-trips correctly. `_parse_data_uri` guarantees `mime_type` is non-empty whenever `parsed` is truthy, so no extra None check is needed.

Adds a regression test covering CSV and plain-text base64 file blocks.
This PR has been automatically closed because you are not assigned to the linked issue. External contributors must be assigned to an issue before opening a PR for it. Please:
Maintainers: reopen this PR or remove the
You have been assigned to #36939, but this PR could not be reopened because the head branch has been deleted. Please open a new PR referencing the issue.
Fixes #36939.

The file branch of `_convert_openai_format_to_data_block` hard-codes `mime_type="application/pdf"`, while the image branch right above it uses `parsed["mime_type"]` from the data URI. So a CSV sent via the OpenAI file block shape comes out with `mime_type="application/pdf"` in the v1 content block.

One-line change to read it off the parsed data URI, same as the image branch. `_parse_data_uri` returns `None` when the mime_type is missing, so `parsed["mime_type"]` is always set inside this branch.

Test added with a CSV and a text/plain data URI. Existing tests still pass since they use `data:application/pdf;...`.
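For readers without the diff at hand, here is a rough sketch of the behavior described above. It is not the actual langchain-core code: `parse_data_uri_sketch` and `convert_file_block_sketch` are hypothetical stand-ins, and both the regex and the simplified input and output block shapes are assumptions made for illustration.

```python
import base64
import re
from typing import Optional

# Hypothetical stand-in for `_parse_data_uri`; the real helper may differ.
# The pattern requires a non-empty MIME type before ";base64,".
_DATA_URI = re.compile(r"^data:(?P<mime_type>[^;]+);base64,(?P<data>.+)$")


def parse_data_uri_sketch(uri: str) -> Optional[dict]:
    """Return the MIME type and payload, or None if the URI does not match."""
    match = _DATA_URI.match(uri)
    if match is None:
        return None
    return {"mime_type": match.group("mime_type"), "data": match.group("data")}


def convert_file_block_sketch(block: dict) -> dict:
    """Translate an OpenAI-style base64 file block into a simplified v1-style block."""
    parsed = parse_data_uri_sketch(block["file"]["file_data"])
    if parsed is None:
        # Not a base64 data URI: pass the block through unchanged.
        return block
    return {
        "type": "file",
        "base64": parsed["data"],
        # Before the patch this value was the literal "application/pdf".
        "mime_type": parsed["mime_type"],
    }


# A CSV payload should keep its own MIME type after conversion.
csv_payload = base64.b64encode(b"a,b\n1,2\n").decode()
csv_block = {"type": "file", "file": {"file_data": f"data:text/csv;base64,{csv_payload}"}}
converted = convert_file_block_sketch(csv_block)
assert converted["mime_type"] == "text/csv"
```

Because the parser in this sketch returns `None` rather than a dict with an empty `mime_type`, the converter can index `parsed["mime_type"]` without a separate check, which mirrors the argument made in the description.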