
fix: topK uint32 overflow #2937

Closed
Linda-Stadter wants to merge 2 commits into flashinfer-ai:main from Linda-Stadter:topk_overflow


Linda-Stadter (Contributor) commented Apr 1, 2026

📌 Description

This PR is based on top of #2661. Merge afterwards.

Fixes a uint32 overflow in top_k when batch_size * vocab_size > 2^32.

The row offset row_idx * stride was computed in uint32 arithmetic, silently
wrapping around modulo 2^32 for large inputs (to exactly zero when
row_idx * stride = 2^32). Cast to size_t before the multiplication.
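
The wraparound can be reproduced with plain integer arithmetic. The sketch below is illustrative only: `row_offset_u32` and `row_offset_64` are hypothetical helpers, not FlashInfer functions, the mask emulates a 32-bit unsigned multiply, and size_t is assumed to be 64 bits wide.

```python
UINT32_MASK = 0xFFFFFFFF  # emulates truncation to 32 unsigned bits


def row_offset_u32(row_idx: int, stride: int) -> int:
    """Offset as a uint32 * uint32 multiply would produce it (wraps mod 2**32)."""
    return (row_idx * stride) & UINT32_MASK


def row_offset_64(row_idx: int, stride: int) -> int:
    """Offset when an operand is widened to 64 bits before multiplying."""
    return row_idx * stride  # always fits in 64 bits for uint32 operands


stride = 131072        # vocab size used by the regression test
last_row = 32769 - 1   # last row index for batch=32769

print(row_offset_u32(last_row, stride))  # 0: the last row aliases row 0
print(row_offset_64(last_row, stride))   # 4294967296 == 2**32
```

With the 32-bit product, the last row reads from the same offset as row 0, which is why the bug is silent rather than a crash.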

Add a regression test with batch=32769, vocab=131072 (fp16) that crosses the
overflow boundary.
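
A few lines of arithmetic confirm that this shape sits just past the boundary. This is a sketch with a hypothetical helper name (`min_overflow_batch`), and it assumes a contiguous input whose row stride equals the vocab size.

```python
def min_overflow_batch(stride: int, bits: int = 32) -> int:
    """Smallest batch size whose last row offset no longer fits in `bits` unsigned bits."""
    limit = 1 << bits
    # Row r overflows when r * stride >= limit, i.e. r >= ceil(limit / stride).
    first_bad_row = -(-limit // stride)  # ceiling division
    return first_bad_row + 1             # batch must contain that row index


vocab = 131072
batch = min_overflow_batch(vocab)
fp16_bytes = batch * vocab * 2  # 2 bytes per fp16 element

print(batch)       # 32769
print(fp16_bytes)  # 8590196736, i.e. 8 GiB + 256 KiB
```

The input tensor alone is therefore a little over 8 GiB in fp16, which is worth keeping in mind when running the test on smaller GPUs.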

🔍 Related Issues

🚀 Pull Request Checklist

Thank you for contributing to FlashInfer! Before we review your pull request, please make sure the following items are complete.

✅ Pre-commit Checks

  • I have installed pre-commit by running pip install pre-commit (or used your preferred method).
  • I have installed the hooks with pre-commit install.
  • I have run the hooks manually with pre-commit run --all-files and fixed any reported issues.

If you are unsure about how to set up pre-commit, see the pre-commit documentation.

🧪 Tests

  • Tests have been added or updated as needed.
  • All tests are passing (unittest, etc.).

Reviewer Notes

jiangyinzuo and others added 2 commits March 31, 2026 00:30
also add cub stable radix sort and overflow handling

Co-authored-by: Linda-Stadter <57756729+Linda-Stadter@users.noreply.github.com>
Signed-off-by: Yinzuo Jiang <jiangyinzuo@foxmail.com>
Signed-off-by: Linda-Stadter <57756729+Linda-Stadter@users.noreply.github.com>
coderabbitai (Bot, Contributor) commented Apr 1, 2026

Important

Review skipped

Draft detected.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.


gemini-code-assist (Bot, Contributor) left a comment

Code Review

This pull request introduces deterministic mode for FlashInfer's top-k operations, including basic top-k selection and fused transforms (page table and ragged). The implementation includes new CUDA kernels for deterministic index collection and post-processing stable sorts to ensure bitwise-reproducible results across different runs and environments. The Python API has been updated to expose a deterministic flag, and the benchmarking suite is expanded to include DeepSeek DSA-like workloads and various input patterns to stress-test tie-handling logic. Extensive tests have been added to verify repeatability and correctness. I have no feedback to provide as there were no review comments to assess.
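
To make the tie-handling requirement concrete, here is a minimal CPU reference for a deterministic top-k. It is a sketch under stated assumptions, not FlashInfer's kernel or API: `topk_deterministic` is a hypothetical name, and the rule of breaking ties by lower index is one possible convention chosen for illustration.

```python
def topk_deterministic(values: list[float], k: int) -> list[int]:
    """Return indices of the k largest values, breaking ties by lower index."""
    # Sorting on (-value, index) gives a total order, so equal values
    # always come out in the same index order on every run.
    order = sorted(range(len(values)), key=lambda i: (-values[i], i))
    return order[:k]


scores = [0.5, 0.9, 0.5, 0.9, 0.1]
print(topk_deterministic(scores, 3))  # [1, 3, 0]
```

Without a fixed tie rule, either index 0 or index 2 would be a valid third pick here, which is the kind of run-to-run variation a deterministic mode is meant to rule out.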

aleozlx (Collaborator) commented Apr 1, 2026

cherry picked into #2661
closing for now

@aleozlx aleozlx closed this Apr 1, 2026
3 participants