[Feature] improve random_sample by yuejun-baidu · Pull Request #359 · baidu/vLLM-Kunlun

yuejun-baidu · 2026-05-18T05:48:59Z

PR Description

Improve random_sample by using xspeedgate.ops.
Remove unused code in vllm_kunlun/v1/attention/backends/kunlun_attn.py.
Fix the Kunlun/Intel platform conflict in vllm_kunlun/ops/fla/utils.py.

Checklist (Required)

Before submitting this PR, please ensure that all the following items are completed:

All code changes pass the pre-commit checks.
Commits are signed off using git commit -s.
The PR title is properly classified (see below).

PR Type

Please prefix the PR title with one or more of the following labels to help reviewers quickly understand the nature of the change:

[Feature] – New features or enhancements (e.g. Attention, Communicator, Kernel, Worker, etc.)
[Bugfix] – Bug fixes
[CI/Build] – CI, build system, or infrastructure improvements
[Doc] – Documentation updates or fixes
[Misc] – Other changes that do not fit the above categories (use sparingly)

Note: If the PR spans multiple categories, include all relevant prefixes.

Detailed Checklist

Thank you for contributing to vLLM Kunlun! To help us maintain high code quality and streamline the review process, please ensure your PR meets the following requirements.

1. Code Quality

All linting and formatting checks pass (pre-commit).
The code is well-structured and sufficiently documented.
The change is designed with maintainability and readability in mind.

2. Testing

Relevant unit tests are added or updated.
Integration tests are included when applicable.
Existing tests continue to pass.

3. DCO Compliance

This project follows the Developer Certificate of Origin (DCO).

All commits include a Signed-off-by: line.
Use git commit -s to automatically add the sign-off.

4. Review Expectations

During the review process, maintainers may:

Request code refactoring or additional tests.
Ask for clarifications on design decisions.
Suggest performance, stability, or maintainability improvements.

We appreciate your patience and collaboration throughout the review process!

Signed-off-by: yuejun <yuejun@baidu.com>

Copilot

Pull request overview

Small performance/cleanup PR that swaps the in-place exponential sampling in random_sample to use the xspeedgate_ops.inplace_exponential custom op, removes a dead seq_start_loc computation in the Kunlun attention backend, and corrects the FLA platform mapping so xpu resolves to kunlun instead of intel.

Changes:

Replace q[i].exponential_(generator=...) loop with torch.ops.xspeedgate_ops.inplace_exponential to speed up random sampling.
Drop unused seq_start_loc/seq_start_loc_tensor construction in KunlunAttentionMetadataBuilder.build.
Update _check_platform mapping/return type so xpu maps to "kunlun" (replacing prior "intel" and dropping "musa").
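For readers unfamiliar with why a sampler draws exponential noise at all, here is a minimal pure-Python sketch of the underlying trick, not the code in this PR. It assumes the sampler picks the index that maximizes `probs[i] / q[i]` with `q[i] ~ Exp(1)`, which is equivalent to picking the smallest `q[i] / probs[i]`. The names `exponential_race_sample` and `sample_batch` are hypothetical, and `random.Random` stands in for the per-request torch generators; the `xspeedgate_ops.inplace_exponential` op itself is not modeled.

```python
import random
from typing import Dict, List, Sequence


def exponential_race_sample(probs: Sequence[float], rng: random.Random) -> int:
    """Return index i with probability probs[i] / sum(probs).

    Each positive entry gets an arrival time q_i / p_i with q_i ~ Exp(1);
    the earliest arrival wins. This is the same choice as argmax(p_i / q_i).
    """
    best_index = -1
    best_time = float("inf")
    for i, p in enumerate(probs):
        if p <= 0.0:
            continue  # zero-probability entries can never be selected
        arrival = rng.expovariate(1.0) / p
        if best_index < 0 or arrival < best_time:
            best_index, best_time = i, arrival
    if best_index < 0:
        raise ValueError("probs must contain at least one positive entry")
    return best_index


def sample_batch(
    batch_probs: Sequence[Sequence[float]],
    generators: Dict[int, random.Random],
    default_rng: random.Random,
) -> List[int]:
    """Sample one index per row; rows listed in `generators` use their own RNG."""
    return [
        exponential_race_sample(row, generators.get(i, default_rng))
        for i, row in enumerate(batch_probs)
    ]
```

In this sketch, a row with its own seeded generator produces the same draw regardless of what the shared generator does, which is the property a per-generator path exists to preserve.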

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| vllm_kunlun/v1/sample/ops/topk_topp_sampler.py | Use xspeedgate inplace exponential op in the per-generator path. |
| vllm_kunlun/v1/attention/backends/kunlun_attn.py | Remove dead `seq_start_loc` tensor construction. |
| vllm_kunlun/ops/fla/utils.py | Map `xpu` → `kunlun` (drop intel/musa) in `_check_platform`. |

Signed-off-by: yuejun <yuejun@baidu.com>

improve random_sample.

bd1cb81

Signed-off-by: yuejun <yuejun@baidu.com>

xyDong0223 approved these changes May 19, 2026

xyDong0223 requested a review from Copilot May 19, 2026 02:19

Copilot started reviewing on behalf of xyDong0223 May 19, 2026 02:19

Copilot AI reviewed May 19, 2026

Comment thread vllm_kunlun/v1/attention/backends/kunlun_attn.py

remove useless "import accumulate"

3e15755

Signed-off-by: yuejun <yuejun@baidu.com>

liwei109 approved these changes May 19, 2026

liwei109 merged commit 4cc631d into baidu:main May 19, 2026
3 checks passed

liwei109 merged 2 commits into baidu:main from yuejun-baidu:fix-sampler

4 participants
