Skip to content

fix(sdk): resize Anthropic many-image inputs#2552

Merged
xingyaoww merged 25 commits into
OpenHands:mainfrom
Zheng-Lu:fix/2467-image-downscale
May 5, 2026
Merged

fix(sdk): resize Anthropic many-image inputs#2552
xingyaoww merged 25 commits into
OpenHands:mainfrom
Zheng-Lu:fix/2467-image-downscale

Conversation

@Zheng-Lu

@Zheng-Lu Zheng-Lu commented Mar 23, 2026

Copy link
Copy Markdown
Contributor

#2467

Summary

Reproduces and fixes the Anthropic many-image failure by resizing oversized base64 images during LLM message formatting.

What Changed

  • Added an Anthropic-only resize path in LLM.format_messages_for_llm
  • Resize only triggers when the outgoing request crosses the many-image threshold
  • Preserves aspect ratio and leaves URL images unchanged
  • Added pillow as a runtime dependency for in-memory image resizing

Validation

  • pytest tests/sdk/llm/test_llm_image_resizing.py Passed
  • pytest tests/sdk/llm/test_llm_image_resizing.py tests/sdk/llm/test_vision_support.py Passed
  • ruff check openhands-sdk/openhands/sdk/llm/llm.py tests/sdk/llm/test_llm_image_resizing.py Passed
  • pyright openhands-sdk/openhands/sdk/llm/llm.py tests/sdk/llm/test_llm_image_resizing.py Passed

Proof

Now the multiple-images request with at least one image > 2000px doesn't throw the error litellm.BadRequestError
image

Co-authored-by: openhands <openhands@all-hands.dev>
@xingyaoww

Copy link
Copy Markdown
Member

@OpenHands pls merge from main, resolve all conflicts. Then do /codereview-roasted /github-pr-review

@openhands-ai

openhands-ai Bot commented Mar 27, 2026

Copy link
Copy Markdown

I'm on it! xingyaoww can track my progress at all-hands.dev

Co-authored-by: openhands <openhands@all-hands.dev>

@xingyaoww xingyaoww left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Taste Rating: 🟡 Acceptable — Works, but the structure needs improvement

Linus's Three Questions:

  1. Is this solving a real problem? — Yes. Anthropic's many-image limit is a real production failure.
  2. Is there a simpler way? — Yes. This is ~80 lines of image manipulation code jammed into a 1500-line god-class. Extract it.
  3. What will this break? — Adding pillow as a hard runtime dependency to the core SDK is the biggest concern. Every user now pays for PIL whether they use images or not.

VERDICT:
Needs rework — The fix is directionally correct, but the dependency strategy and code placement need redesign before merging.

KEY INSIGHT:
The core problem is treating PIL as a hard SDK dependency and stuffing image-processing plumbing into the LLM class, when this should be a lazy-loaded utility module.

Comment thread openhands-sdk/openhands/sdk/llm/llm.py Outdated
Comment thread openhands-sdk/pyproject.toml
Comment thread openhands-sdk/openhands/sdk/llm/llm.py Outdated
Comment thread openhands-sdk/openhands/sdk/llm/llm.py Outdated
Comment thread openhands-sdk/openhands/sdk/llm/llm.py Outdated
Comment thread openhands-sdk/openhands/sdk/llm/llm.py Outdated
Comment thread tests/sdk/llm/test_llm_image_resizing.py
@openhands-ai

openhands-ai Bot commented Mar 27, 2026

Copy link
Copy Markdown

OpenHands encountered an error: Request timeout after 30 seconds to https://ypvqnvwqmvbumzax.prod-runtime.all-hands.dev/api/conversations/9650ebc3-b326-46ce-88ee-b683ce63e259/ask_agent

See the conversation for more information.

@Zheng-Lu Zheng-Lu requested a review from xingyaoww April 6, 2026 22:31
@xingyaoww

Copy link
Copy Markdown
Member

@OpenHands do /codereview-roasted /github-pr-review

@openhands-ai

openhands-ai Bot commented Apr 10, 2026

Copy link
Copy Markdown

I'm on it! xingyaoww can track my progress at all-hands.dev

@xingyaoww xingyaoww left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 Acceptable — Core logic is correct and the tests are solid, but there are design issues worth addressing before merge.

Linus-Style Analysis:

The fundamental idea is sound: intercept oversized base64 images before they hit Anthropic's API limits. The Anthropic docs research is correct (20-image threshold, 2000px vs 8000px caps), and the resize logic itself is clean.

But the plumbing around it — threading a PIL module as Any through three functions, silent in-place mutation — adds unnecessary complexity that a simpler design would eliminate entirely.

⚠️ Eval-risk note: This PR changes image processing in the LLM message pipeline, which could affect agent behavior on vision benchmarks. Flagging for a human maintainer to decide after running lightweight evals.

This review was generated by an AI agent (OpenHands).

Comment thread openhands-sdk/openhands/sdk/llm/utils/image_resize.py Outdated
Comment thread openhands-sdk/openhands/sdk/llm/utils/image_resize.py Outdated
Comment thread openhands-sdk/openhands/sdk/llm/utils/image_resize.py
Comment thread openhands-sdk/pyproject.toml Outdated
Comment thread openhands-sdk/openhands/sdk/llm/utils/image_resize.py Outdated
Comment thread tests/sdk/llm/test_llm_image_resizing.py
@openhands-ai

openhands-ai Bot commented Apr 10, 2026

Copy link
Copy Markdown

Summary

The request was to perform a /codereview-roasted and /github-pr-review on PR #2552 ("fix(sdk): resize Anthropic many-image inputs"). Here's what was done:

Checklist

  • Retrieved PR diff and context — Fetched the full diff (5 files changed), PR description, and linked issue SDK should auto-resize oversized images before sending to LLM providers #2467
  • Performed roasted code review — Analyzed the code through the Linus Torvalds-style lens focusing on data structures, complexity, pragmatism, and breaking changes
  • Posted structured GitHub PR review — Submitted a single COMMENT review with 6 inline comments using priority labels (🟠🟠🟡🟡🟢🟢) via the GitHub API
  • Used correct review event — Left as COMMENT (not APPROVE) because the PR changes image processing in the LLM message pipeline, which falls under the eval-risk policy requiring human maintainer sign-off
  • No code modifications made — Review-only, as required by the codereview skill

Key Findings Posted

  1. 🟠 image_module: Any threading — Unnecessary complexity; Python's sys.modules cache makes repeated imports free
  2. 🟠 Silent in-place mutationmaybe_resize_messages_for_provider mutates inputs with no return value, relying on an invisible deepcopy contract
  3. 🟡 pillow>=12.1.1 floor too high — The APIs used are stable since Pillow 9.1+
  4. 🟡 Leaky public APIresize_base64_data_url exposes an Any-typed PIL parameter
  5. 🟢 Good dimension logic — Clean early returns, correct Anthropic doc mirroring
  6. 🟢 Solid tests — Real image creation and dimension assertions, not mock-only

No extraneous changes were made — this was purely a review action with no code modifications.

@all-hands-bot

Copy link
Copy Markdown
Collaborator

[Automatic Post]: This PR seems to be currently waiting for review. @xingyaoww @Zheng-Lu @openhands-ai[bot], could you please take a look when you have a chance?

Comment thread openhands-sdk/openhands/sdk/llm/utils/image_resize.py Outdated
@Zheng-Lu Zheng-Lu requested a review from xingyaoww April 19, 2026 00:00
Co-authored-by: openhands <openhands@all-hands.dev>
@all-hands-bot

Copy link
Copy Markdown
Collaborator

[Automatic Post]: It has been a while since there was any activity on this PR. @Zheng-Lu, are you still working on it? If so, please go ahead, if not then please request review, close it, or request that someone else follow up.

@all-hands-bot

Copy link
Copy Markdown
Collaborator

[Automatic Post]: This PR seems to be currently waiting for review. @xingyaoww, could you please take a look when you have a chance?

@xingyaoww

Copy link
Copy Markdown
Member

@OpenHands in the existing example file that tests for vision support, can you add another send_message that sends image inputs that would properly exercise the stuff we've introduced in this PR

@openhands-ai

openhands-ai Bot commented Apr 27, 2026

Copy link
Copy Markdown

I'm on it! xingyaoww can track my progress at all-hands.dev

Add a second send_message in 17_image_input.py that sends 21 oversized
base64 images (2500×100 px each), triggering the Anthropic many-image
threshold (>20 images → 2000 px cap). This exercises the image resizing
path introduced in this PR end-to-end.

Co-authored-by: openhands <openhands@all-hands.dev>
@openhands-ai

openhands-ai Bot commented Apr 27, 2026

Copy link
Copy Markdown

I'm on it! xingyaoww can track my progress at all-hands.dev

The wait-on-check-action was using github.event.pull_request.head.ref
(branch name) to look up check runs. For fork PRs, the branch name
doesn't exist in the upstream repo, causing a 422 error:
  'No commit found for SHA: fix/2467-image-downscale'

Switch to github.event.pull_request.head.sha which always resolves
correctly regardless of whether the PR is from a fork.

Co-authored-by: openhands <openhands@all-hands.dev>
@xingyaoww xingyaoww removed the test-examples Run all applicable "examples/" files. Expensive operation. label Apr 27, 2026
@xingyaoww xingyaoww added the test-examples Run all applicable "examples/" files. Expensive operation. label Apr 27, 2026 — with OpenHands AI
The Build & Push workflow doesn't run for fork PRs, so the
wait-on-check-action was timing out and failing with
'The requested check was never run against this ref'.

Set fail-on-no-checks: false so the wait step is a no-op when
the check doesn't exist (fork PRs) while still waiting properly
when it does (non-fork PRs).

Co-authored-by: openhands <openhands@all-hands.dev>
@xingyaoww xingyaoww removed the test-examples Run all applicable "examples/" files. Expensive operation. label Apr 27, 2026
@xingyaoww xingyaoww added the test-examples Run all applicable "examples/" files. Expensive operation. label Apr 27, 2026 — with OpenHands AI
Fork PRs have restricted GITHUB_TOKEN permissions and cannot write
comments on the upstream repo's PR. The update_comment function was
calling exit 1 on failure, which aborted the entire test run.

Change to emit warnings and continue when comment creation fails,
clearing API_URL to skip subsequent comment attempts.

Co-authored-by: openhands <openhands@all-hands.dev>
@xingyaoww xingyaoww removed the test-examples Run all applicable "examples/" files. Expensive operation. label Apr 27, 2026
@xingyaoww xingyaoww added the test-examples Run all applicable "examples/" files. Expensive operation. label Apr 27, 2026 — with OpenHands AI
@openhands-ai

openhands-ai Bot commented Apr 27, 2026

Copy link
Copy Markdown

Summary

The user asked me to fix the CI failure at https://github.com/OpenHands/software-agent-sdk/actions/runs/25000384169/job/73209940791?pr=2552 for PR #2552 on a fork branch.

Was the request addressed?

  • Identified the root cause: The test-examples workflow had three separate issues preventing it from running on fork PRs.
  • Fix 1 — head.refhead.sha: The wait-on-check-action used the branch name to look up check runs, which doesn't exist in the upstream repo for fork PRs (422 error). Changed to use the commit SHA.
  • Fix 2 — fail-on-no-checks: false: The Build & Push (python-amd64) check doesn't run for fork PRs, causing a timeout. Made the wait step tolerate missing checks.
  • Fix 3 — Non-fatal comment posting: Fork PRs lack write permissions to post PR comments (403 error). Changed exit 1 to warnings + return so test execution isn't aborted.
  • Merged main into the branch to bring it up to date (clean merge, no conflicts).
  • Pushed all changes to the fork branch and re-triggered the workflow.
  • Verified the wait step now passes successfully and the workflow progresses to running actual tests.

Are the changes concise?

Yes — all three changes are to .github/workflows/run-examples.yml and are minimal, targeted fixes for fork PR compatibility. No extraneous changes were introduced. The merge from main was also requested context (branch was behind).

Remaining limitation

The 48/53 example test failures after the infrastructure fixes are due to GitHub Actions not exposing repository secrets (LLM_API_KEY, etc.) to fork PR workflows. This is expected security behavior affecting all fork PRs and cannot be fixed from the PR side — a maintainer would need to re-run from the upstream context.

@xingyaoww xingyaoww added test-examples Run all applicable "examples/" files. Expensive operation. and removed test-examples Run all applicable "examples/" files. Expensive operation. labels Apr 27, 2026
@all-hands-bot

Copy link
Copy Markdown
Collaborator

[Automatic Post]: It has been a while since there was any activity on this PR. @Zheng-Lu, are you still working on it? If so, please go ahead, if not then please request review, close it, or request that someone else follow up.

@xingyaoww

Copy link
Copy Markdown
Member

@OpenHands please merge from main and resolve conflicts

@openhands-ai

openhands-ai Bot commented May 5, 2026

Copy link
Copy Markdown

I'm on it! xingyaoww can track my progress at all-hands.dev

…lict

Keep main's litellm>=1.83.7 and PR's pillow>=12.1.1 dependency.

Co-authored-by: openhands <openhands@all-hands.dev>
@openhands-ai

openhands-ai Bot commented May 5, 2026

Copy link
Copy Markdown

Summary

The request to merge from main and resolve conflicts on PR #2552 has been fully addressed:

Checklist

  • Fetched the PR branch from the fork (Zheng-Lu/software-agent-sdk)
  • Merged origin/main into fix/2467-image-downscale
  • Resolved the single conflict in openhands-sdk/pyproject.toml:
    • Kept main's updated litellm>=1.83.7 (replacing the PR's older litellm>=1.82.6,!=1.82.7,!=1.82.8)
    • Preserved the PR's new pillow>=12.1.1 dependency (needed for image resizing)
  • Verified no other conflictsllm.py auto-merged cleanly
  • Committed and pushed the merge to the fork remote

Conciseness

The changes are minimal — only the merge commit resolving one conflict. No extraneous changes were made.

@xingyaoww xingyaoww left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! Sorry for taking so long :(

@xingyaoww xingyaoww enabled auto-merge (squash) May 5, 2026 15:38
@xingyaoww xingyaoww merged commit 1d932c8 into OpenHands:main May 5, 2026
26 checks passed
StressTestor pushed a commit to StressTestor/software-agent-sdk that referenced this pull request Jun 1, 2026
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: Xingyao Wang <xingyao@all-hands.dev>
Co-authored-by: Xingyao Wang <xingyaoww@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

test-examples Run all applicable "examples/" files. Expensive operation.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants