fix: detect image MIME type via magic bytes and add count budget for tool results #4940
Open
mossgowild wants to merge 1 commit into microsoft:main
…tool results

- Add detectImageMimeType() using magic bytes (JPEG/PNG/GIF/WebP/BMP) to correctly identify image format regardless of declared file extension. Fixes Anthropic API error where declared media_type doesn't match actual image content (e.g. file named .png but contains JPEG data).
- Use detected MIME type in imageDataPartToTSX() instead of relying on part.mimeType which is inferred from file extension by VS Code core.
- Add imageCountBudgetLeft to PrimitiveToolResult to cap the number of images injected per tool result at half of endpoint.maxPromptImages, reserving budget for user-attached images. Fixes Gemini error when terminal output references more images than the model supports.
- Add unit tests for detectImageMimeType covering all supported formats and edge cases.
Problem

Two issues arise when tool results contain images (introduced by terminal image extraction in vscode#301560):

- Wrong MIME type for Anthropic: VS Code core infers `mimeType` from the file extension, which is wrong when the content differs (e.g. a file named `.png` containing JPEG data). When this is sent to Anthropic, the API rejects it with "declared media_type doesn't match actual image content".
- Too many images for Gemini: When terminal output or other tool results reference many screenshots or images, all of them are injected into the prompt with no limit, causing Gemini to fail with a `max_prompt_images` exceeded error.

Related: microsoft/vscode#307431
Fix

- Magic bytes detection: `detectImageMimeType()` inspects the first bytes of the image data (JPEG `FF D8 FF`, PNG `89 50 4E 47`, GIF, WebP, BMP) and returns the correct MIME type. It is used in `imageDataPartToTSX()` instead of trusting `part.mimeType`.
- Image count budget: `PrimitiveToolResult` now tracks `imageCountBudgetLeft`, initialized to `endpoint.maxPromptImages / 2` (reserving budget for user-attached images). Once it is exhausted, additional images from tool results are silently skipped.

Also adds null-safety for `this.endpoint` access in a few places.

Testing

Unit tests for `detectImageMimeType` covering all supported formats, edge cases (short data, empty data, unknown format), and the specific bug scenario (JPEG content with PNG extension).
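
For readers who do not have the diff open, the two fixes can be sketched roughly as follows. This is an illustrative sketch under stated assumptions, not the code in this PR: only the names `detectImageMimeType`, `imageCountBudgetLeft`, and `maxPromptImages` come from the description above. The function signature, the `ImagePart` shape, the `matchesAt` and `capToolResultImages` helpers, the GIF/WebP/BMP signatures (standard file signatures, not quoted in the description), and rounding the budget down with `Math.floor` are all assumptions.

```typescript
// Illustrative sketch only; the PR's real implementation may differ.
type ImageMimeType =
  | "image/jpeg"
  | "image/png"
  | "image/gif"
  | "image/webp"
  | "image/bmp";

// Hypothetical helper: true if `signature` appears in `data` at `offset`.
function matchesAt(data: Uint8Array, signature: number[], offset = 0): boolean {
  if (data.length < offset + signature.length) {
    return false; // too short to contain the signature
  }
  return signature.every((byte, i) => data[offset + i] === byte);
}

// Assumed signature: returns undefined when no known magic bytes match.
function detectImageMimeType(data: Uint8Array): ImageMimeType | undefined {
  if (matchesAt(data, [0xff, 0xd8, 0xff])) return "image/jpeg";
  if (matchesAt(data, [0x89, 0x50, 0x4e, 0x47])) return "image/png";
  if (matchesAt(data, [0x47, 0x49, 0x46, 0x38])) return "image/gif"; // "GIF8"
  // WebP is a RIFF container: "RIFF" at byte 0 and "WEBP" at byte 8.
  if (
    matchesAt(data, [0x52, 0x49, 0x46, 0x46]) &&
    matchesAt(data, [0x57, 0x45, 0x42, 0x50], 8)
  ) {
    return "image/webp";
  }
  if (matchesAt(data, [0x42, 0x4d])) return "image/bmp"; // "BM"
  return undefined;
}

// Hypothetical shape of an image part in a tool result.
interface ImagePart {
  mimeType: string; // declared type, e.g. inferred from the file extension
  data: Uint8Array;
}

// Hypothetical helper: keeps at most half of maxPromptImages images and
// replaces the declared MIME type with the detected one when available.
function capToolResultImages(parts: ImagePart[], maxPromptImages: number): ImagePart[] {
  // Assumption: the budget is rounded down for odd limits.
  let imageCountBudgetLeft = Math.floor(maxPromptImages / 2);
  const kept: ImagePart[] = [];
  for (const part of parts) {
    if (imageCountBudgetLeft <= 0) {
      break; // budget exhausted: remaining images are silently skipped
    }
    imageCountBudgetLeft--;
    kept.push({ ...part, mimeType: detectImageMimeType(part.data) ?? part.mimeType });
  }
  return kept;
}

// Usage: JPEG bytes declared as PNG, five images against a limit of four.
const jpegBytes = new Uint8Array([0xff, 0xd8, 0xff, 0xe0, 0x00, 0x10]);
const toolImages: ImagePart[] = Array.from({ length: 5 }, () => ({
  mimeType: "image/png",
  data: jpegBytes,
}));
const keptImages = capToolResultImages(toolImages, 4);
console.log(keptImages.length); // 2
console.log(keptImages[0]?.mimeType); // image/jpeg
```

Falling back to the declared type when detection fails is one possible design choice here; it keeps unknown formats flowing through unchanged rather than dropping them.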