Update gptoss-fp4-b200-vllm vLLM image to v0.20.2 by Klaud-Cold · Pull Request #1333 · SemiAnalysisAI/InferenceX

Klaud-Cold · 2026-05-12T21:31:35Z

Summary

Update gptoss-fp4-b200-vllm image from vllm/vllm-openai:v0.15.1 to vllm/vllm-openai:v0.20.2

…-authored-by: Klaud Cold <Klaud-Cold@users.noreply.github.com>

github-actions · 2026-05-12T21:31:44Z

Thanks for the contribution! For vLLM & SGLang, please ensure that your recipe is similar to the official vLLM recipes and/or the SGLang cookbook.

If it is not, please create a PR first before we can merge your single node PR into the master branch. Let's ensure that the documentation is first class such that the entire ML community can benefit from your hard work! Thank you

PR authors are responsible for ensuring that after merging, all GitHub Action jobs fully pass. A lot of the time, failures are just flakes and simply re-running the failed jobs will fix it. If re-running failed jobs is attempted, PR authors are responsible for ensuring it passes. See GitHub's docs on re-running failed jobs: https://docs.github.com/en/actions/how-tos/manage-workflow-runs/re-run-workflows-and-jobs#re-running-failed-jobs-in-a-workflow

As a rule of thumb, PR authors should request a review and get a PR approval from the respective companies' CODEOWNERS before requesting a review from core maintainers.

If additional help is needed, PR authors can reach out to core maintainers over Slack.

claude

LGTM — straightforward image version bump with matching changelog entry.

Extended reasoning...

Overview

This PR makes two minimal changes: bumps the gptoss-fp4-b200-vllm image from vllm/vllm-openai:v0.15.1 to vllm/vllm-openai:v0.20.2 in .github/configs/nvidia-master.yaml, and adds a corresponding entry to perf-changelog.yaml documenting the bump. Both changes are config/metadata only.
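As a rough illustration of what "image version bump" means here, the following sketch compares the two image references quoted in this PR and checks that the new tag is a strictly newer version of the same image. The helper names `parse_image_tag` and `is_upgrade` are hypothetical and not part of the InferenceX repo; the sketch also assumes tags always have the form `vMAJOR.MINOR.PATCH`.

```python
import re


def parse_image_tag(image):
    """Split 'repo:vMAJOR.MINOR.PATCH' into (repo, (major, minor, patch)).

    Hypothetical helper; assumes a plain three-part version tag.
    """
    repo, _, tag = image.rpartition(":")
    match = re.fullmatch(r"v(\d+)\.(\d+)\.(\d+)", tag)
    if not repo or match is None:
        raise ValueError(f"unexpected image reference: {image!r}")
    return repo, tuple(int(part) for part in match.groups())


def is_upgrade(old_image, new_image):
    """True when both references name the same image and the new version is higher."""
    old_repo, old_version = parse_image_tag(old_image)
    new_repo, new_version = parse_image_tag(new_image)
    # Tuples of ints compare element by element, so (0, 20, 2) > (0, 15, 1).
    return old_repo == new_repo and new_version > old_version


print(is_upgrade("vllm/vllm-openai:v0.15.1", "vllm/vllm-openai:v0.20.2"))
```

Comparing parsed integers rather than raw strings matters here: as strings, "v0.9.0" would sort after "v0.20.2".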

Security risks

None. These are YAML config edits to a benchmarks/recipes registry. There is no executable code, no auth/crypto/permissions change, and the new image tag points to an upstream public vLLM release.

Level of scrutiny

Low. This matches the established pattern for image version bumps in this repo (similar to commit 7afb3e9 already on main, and many prior changelog entries above 2345 follow the same shape). The change is mechanical and self-contained.

Other factors

The bug hunter flagged a nit — the changelog pr-link uses a literal XXX placeholder rather than 1333. That is metadata-only and is already surfaced as an inline comment for the author to fix before merge; it does not affect runtime behavior, so it is not blocking for approval.

claude · 2026-05-12T21:35:03Z

+    - gptoss-fp4-b200-vllm
+  description:
+    - "Update vLLM image from v0.15.1 to v0.20.2"
+  pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/XXX


🟡 The new perf-changelog.yaml entry uses a literal 'XXX' placeholder in its pr-link (https://github.com/SemiAnalysisAI/InferenceX/pull/XXX) instead of the actual PR number. Since this is PR #1333, the link should be .../pull/1333; as written, anyone following the changelog cross-reference will hit a 404. Fix by replacing XXX with 1333.

Extended reasoning...

What the bug is

The diff adds a new entry to perf-changelog.yaml at lines 2346–2351 documenting the vLLM image bump for gptoss-fp4-b200-vllm. The final field is:

    pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/XXX

The XXX is a literal placeholder string that was never substituted with the real PR number. This PR is #1333, so the URL should be https://github.com/SemiAnalysisAI/InferenceX/pull/1333.

Why existing code doesn't prevent it

The changelog is a free-form YAML document; nothing in the repo validates that pr-link URLs point to existing PRs, and YAML syntax accepts the placeholder string just fine. The convention is enforced socially: every other recent entry (lines 2332 → /pull/1304, 2338 → /pull/1305, 2345 → /pull/1310, etc.) uses the real PR number, so this one stands out as an oversight rather than a structural failure.

Impact

This is metadata only and has no runtime effect: benchmarks, image selection, and CI all ignore the changelog. The harm is purely to changelog traceability. Any reader (engineer, release manager, future PR author) who clicks through to understand the context behind the v0.15.1 → v0.20.2 vLLM bump will land on a GitHub 404, since /pull/XXX is not a valid PR. The change history breaks at exactly the point where it's being recorded.

How to fix

Replace XXX with 1333 on line 2351:

    pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/1333

Step-by-step proof

1. Open the PR in GitHub. The URL bar shows .../pull/1333, so this PR's number is 1333.
2. View the diff for perf-changelog.yaml. The added block ends with: pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/XXX.
3. Construct the resulting URL: https://github.com/SemiAnalysisAI/InferenceX/pull/XXX. XXX is not a valid integer PR identifier, so GitHub returns a 404 Not Found.
4. Compare against the entry immediately above (lines 2343–2345): pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/1310, a real PR number, which resolves correctly.
5. The correct value for this entry, by the same convention, is 1333. Replacing XXX → 1333 restores changelog traceability.

Severity

Nit. The placeholder is clearly an unintentional oversight (a sed/replace step that was skipped) and trivially fixable before merge, but it does not affect runtime behavior.
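The review observes that nothing in the repo validates pr-link values. A minimal sketch of such a check is shown below, assuming each link sits on its own `pr-link:` line as in the diff above. The function name `find_bad_pr_links` and the line-based scan are illustrative assumptions, not existing InferenceX tooling, and the sample text reuses only the fields quoted in this review.

```python
import re

# Hypothetical check, not part of the InferenceX repo.
# Matches a line of the form "pr-link: <value>" with optional indentation.
PR_LINK_LINE = re.compile(r"^\s*pr-link:\s*(\S+)\s*$")
# A well-formed link ends in a purely numeric PR id.
VALID_PR_URL = re.compile(
    r"^https://github\.com/SemiAnalysisAI/InferenceX/pull/\d+$"
)


def find_bad_pr_links(changelog_text):
    """Return (line_number, url) pairs whose pr-link lacks a numeric PR id."""
    bad = []
    for lineno, line in enumerate(changelog_text.splitlines(), start=1):
        match = PR_LINK_LINE.match(line)
        if match and not VALID_PR_URL.match(match.group(1)):
            bad.append((lineno, match.group(1)))
    return bad


sample = """\
  pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/1310
  description:
    - "Update vLLM image from v0.15.1 to v0.20.2"
  pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/XXX
"""

for lineno, url in find_bad_pr_links(sample):
    print(f"line {lineno}: placeholder or malformed pr-link: {url}")
```

This only checks the shape of the URL. It does not confirm that the PR exists, which would need a call to the GitHub API.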

github-actions · 2026-05-13T00:49:12Z

see unofficial run visualizer at https://inferencex.semianalysis.com/inference?unofficialRun=25763432241
see unofficial run visualizer at https://inferencex.semianalysis.com/evaluation?unofficialRun=25763432241

Update gptoss-fp4-b200-vllm vLLM image to v0.20.2

Ref #1154

Co…

7afb3e9

…-authored-by: Klaud Cold <Klaud-Cold@users.noreply.github.com>

Klaud-Cold requested a review from a team May 12, 2026 21:31

Klaud-Cold added the full-sweep-enabled label May 12, 2026

Klaud-Cold requested review from jgangani and kedarpotdar-nv as code owners May 12, 2026 21:31

github-project-automation Bot added this to InferenceMAX Board May 12, 2026

Klaud-Cold mentioned this pull request May 12, 2026

[Auto] Docker Image Updates Available - 2026-04-25 #1154

Open

claude Bot reviewed May 12, 2026

Merge branch 'main' into claude/issue-1154-gptoss-fp4-b200-vllm

3ddbf47

Klaud-Cold wants to merge 2 commits into main from claude/issue-1154-gptoss-fp4-b200-vllm

2 participants
