@Fridge003
Collaborator

Motivation

Ref:
When releasing sgl-kernel, we have one flow for cu129 and another flow for cu130 (ref: https://github.com/sgl-project/sglang/actions/runs/19008270795/job/54304471430)

As a result, when the wheel index is updated, each flow pushes all the compiled wheels to the same folder, so some cu129 wheels end up in the cu130 folder (or vice versa). We can fix this manually after every release (for example, sgl-project/whl#10), but that is inefficient.

This PR tries to fix this issue by modifying the script for wheel index update.


@gemini-code-assist
Contributor

Summary of Changes

Hello @Fridge003, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request resolves an issue in the update_kernel_whl_index.py script where kernel wheels for different CUDA versions were being incorrectly uploaded to the wrong directories. The changes introduce a robust mechanism to identify and filter wheel files based on their embedded CUDA version, ensuring that each wheel is processed and placed into its correct, version-specific index folder. This streamlines the release process and eliminates the need for manual corrections.

Highlights

  • CUDA Version Handling: Introduced explicit handling for multiple CUDA versions (129 and 130) by defining SUPPORTED_CUDA_VERSIONS and DEFAULT_CUDA_VERSION constants within the wheel index update script.
  • Wheel Filtering Logic: Added a new function, check_wheel_cuda_version, to accurately identify if a wheel file corresponds to a target CUDA version. This function correctly differentiates between wheels with explicit CUDA suffixes and those for the default CUDA version which may lack a suffix.
  • Script Enhancement: Modified the update_wheel_index function to utilize the new check_wheel_cuda_version filtering logic. This ensures that only wheels matching the specified CUDA version are processed and uploaded to their respective directories, preventing miscategorization.
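
The filtering described in the highlights above could look roughly like the sketch below. It is not the code from this PR: the function body, the `+cu<version>` filename suffix, the choice of 129 as the default version, and the `filter_wheels` helper are all assumptions made for illustration. Only the names SUPPORTED_CUDA_VERSIONS, DEFAULT_CUDA_VERSION, and check_wheel_cuda_version come from the summary.

```python
from typing import List

SUPPORTED_CUDA_VERSIONS = ["129", "130"]
# Assumption: the PR does not say here which version is the default.
DEFAULT_CUDA_VERSION = "129"


def check_wheel_cuda_version(wheel_name: str, target_cuda_version: str) -> bool:
    """Return True if the wheel filename belongs to target_cuda_version.

    Assumes wheels carry an explicit "+cu<version>" suffix, except wheels
    for the default CUDA version, which may have no suffix at all.
    """
    if target_cuda_version not in SUPPORTED_CUDA_VERSIONS:
        raise ValueError(f"Unsupported CUDA version: {target_cuda_version}")

    # Case 1: the wheel names the target version explicitly.
    if f"+cu{target_cuda_version}" in wheel_name:
        return True

    # Case 2: the wheel has no CUDA suffix, so it counts as the default.
    has_any_suffix = any(f"+cu{v}" in wheel_name for v in SUPPORTED_CUDA_VERSIONS)
    return target_cuda_version == DEFAULT_CUDA_VERSION and not has_any_suffix


def filter_wheels(wheel_names: List[str], target_cuda_version: str) -> List[str]:
    """Hypothetical helper: keep only wheels for one CUDA version."""
    return [w for w in wheel_names if check_wheel_cuda_version(w, target_cuda_version)]
```

With a filter like this, the index update for cu130 would only see wheels whose names contain `+cu130`, and the update for the default version would see both suffix-free wheels and wheels tagged with the default version explicitly.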

Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request correctly addresses the issue of mixing wheels for different CUDA versions during the index update process. By introducing a check to filter wheels based on the target CUDA version, the script now correctly separates wheels for cu129 and cu130. I've added one suggestion to make the CUDA version check in the filename more robust to prevent potential false positives.
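
The suggestion itself is not reproduced in this conversation. As one possible illustration of how a filename check can avoid false positives, the sketch below parses the CUDA tag with a regular expression instead of a substring test. The regex, the function names, and the default of 129 are assumptions, not the reviewer's actual proposal.

```python
import re
from typing import Optional

# Matches a "+cu<digits>" local version label that is immediately followed
# by "-" or ".", so "+cu1300" is never mistaken for "+cu130".
_CUDA_TAG = re.compile(r"\+cu(\d+)(?=[-.])")


def wheel_cuda_tag(wheel_name: str) -> Optional[str]:
    """Return the CUDA version digits from a wheel filename, or None."""
    match = _CUDA_TAG.search(wheel_name)
    return match.group(1) if match else None


def matches_cuda_version(wheel_name: str, target: str, default: str = "129") -> bool:
    """Compare the parsed tag exactly; untagged wheels count as the default.

    The default value "129" is an assumption for this sketch.
    """
    tag = wheel_cuda_tag(wheel_name)
    if tag is None:
        return target == default
    return tag == target
```

A plain substring test such as `"cu130" in wheel_name` would accept a hypothetical `+cu1300` wheel, while the exact comparison on the parsed tag rejects it.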

@zhyncs zhyncs merged commit 15efbcb into main Nov 4, 2025
26 checks passed
@zhyncs zhyncs deleted the baizhou/fix-whl branch November 4, 2025 00:34