[CI]repair custom ops ci by ZT-AIA · Pull Request #9465 · vllm-project/vllm-ascend

ZT-AIA · 2026-05-22T08:46:45Z

What this PR does / why we need it?

Fix the nightly custom ops test cases. The failures were mainly caused by changes in vLLM and by defects in the test cases themselves.

Does this PR introduce any user-facing change?

no

How was this patch tested?

Verified with the nightly custom op tests.

vLLM version: v0.20.2
vLLM main: vllm-project/vllm@1ac10f1

Signed-off-by: ZT-AIA <1028681969@qq.com>

github-actions · 2026-05-22T08:47:07Z

👋 Hi! Thank you for contributing to the vLLM Ascend project. The following points will speed up your PR merge:

A PR should do only one thing; smaller PRs enable faster reviews.
Every PR should include unit tests and end-to-end tests to ensure it works and is not broken by future PRs.
Fill in the PR description so it can serve as the commit message and help reviewers and future developers understand the change.

If CI fails, you can run the linting and testing checks locally according to Contributing and Testing.

github-actions · 2026-05-22T08:47:48Z

This pull request has conflicts, please resolve those before we can evaluate the pull request.

gemini-code-assist · 2026-05-22T08:52:00Z

Summary of Changes

Hello, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request addresses stability issues in the nightly custom operations CI pipeline. It updates various test configurations to ensure proper environment initialization and aligns test logic with recent changes in the vllm codebase. Additionally, it optimizes test coverage for specific operations to improve execution efficiency.

Highlights

CI Stability Improvements: Updated multiple test files to ensure consistent initialization of device properties and custom operations, resolving failures in the nightly CI pipeline.
Test Logic Alignment: Adjusted test mocks and logic to align with recent upstream vllm changes, including updates to fused operations and rejection sampling.
Test Coverage Optimization: Reduced redundant test parameters in specific operations like RoPE and split QKV to improve CI execution efficiency and stability.
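
To illustrate why trimming parameters helps CI time, the sketch below counts how many test cases a stacked parametrization generates. The parameter names and values are hypothetical and are not taken from the RoPE or split QKV tests in this PR; the only point is that the case count is the product of the list lengths.

```python
from itertools import product


def count_cases(param_grid):
    """Count the test cases generated when every parameter list is combined."""
    return sum(1 for _ in product(*param_grid.values()))


# Hypothetical grids, for illustration only (not the values used in this PR).
full_grid = {
    "dtype": ["float16", "bfloat16"],
    "num_tokens": [1, 7, 83, 4096],
    "head_size": [64, 128, 256],
    "is_neox_style": [True, False],
}
trimmed_grid = {
    "dtype": ["float16", "bfloat16"],
    "num_tokens": [7, 4096],
    "head_size": [128],
    "is_neox_style": [True, False],
}

full_count = count_cases(full_grid)
trimmed_count = count_cases(trimmed_grid)
```

Dropping values from any single list shrinks the total multiplicatively, so removing a few redundant values can cut the run time of a nightly job substantially.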

¹ Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

gemini-code-assist

Code Review

Suggested PR Title:

[Ops][Misc] Refactor Triton operation tests and simplify logprob computation

Suggested PR Summary:

### What this PR does / why we need it?
This pull request refactors and optimizes several NPU-specific operation tests and simplifies the logprob computation logic. It introduces necessary initialization calls (`init_device_properties_triton`, `enable_custom_op`) across various test files and streamlines the `compute_topk_logprobs` signature by removing unused parameters. Additionally, it adjusts test parametrizations for performance and updates rejection sampling kernels.

Feedback from the review highlights a redundant class redefinition in `test_fused_moe.py` that shadows an import. It also points out that the newly added `num_rejected_tokens` buffer in `test_prepare_inputs_padded.py` is currently unverified, suggesting an assertion against the reference implementation is needed.

### Does this PR introduce _any_ user-facing change?
No.

### How was this patch tested?
Tested via the updated nightly E2E test suite for single-node operations.

gemini-code-assist · 2026-05-22T08:54:15Z

+class SiluAndMul:
+    """SwiGLU activation function: silu(x[:d]) * x[d:] where d = x.shape[-1] // 2"""
+    def __call__(self, x: torch.Tensor) -> torch.Tensor:
+        d = x.shape[-1] // 2
+        return F.silu(x[..., :d]) * x[..., d:]


The local redefinition of SiluAndMul shadows the import from vllm.model_executor.layers.activation at line 30. This is redundant and can lead to confusion for maintainers. If the local implementation is required due to changes in vLLM, the unused import should be removed to maintain code clarity. Otherwise, consider using the imported class directly.
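
The shadowing described above can be reproduced with the standard library alone. In this minimal sketch `math.floor` plays the role of the imported `SiluAndMul`, and the helper `which_definition` is a hypothetical name added for illustration. After the local definition runs, the module-level name no longer refers to the import.

```python
from math import floor  # plays the role of the import from vllm


def which_definition(func):
    """Return the module that defined the object currently bound to a name."""
    return func.__module__


# Before the redefinition, the name refers to the imported function.
imported_module = which_definition(floor)


def floor(x):
    """Local redefinition: silently rebinds the module-level name `floor`."""
    return int(x // 1)


# After the redefinition, the same name refers to the local function.
local_module = which_definition(floor)
```

Linters based on pyflakes usually report this pattern as F811 (redefinition of an unused name), which is one way to catch it before review.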

gemini-code-assist · 2026-05-22T08:54:15Z

    # Run Triton kernel
    out_tri = torch.empty(num_reqs, dtype=torch.int32, device=device)
-
+    num_rejected_tokens = torch.empty(num_reqs, dtype=torch.int32, device=device)


The num_rejected_tokens buffer is initialized and passed to the kernel, but its output is never verified. Since the reference implementation prepare_inputs_padded_ref already calculates this value (line 24), the test should be updated to assert that the kernel's output matches the reference. This ensures the kernel's logic for calculating rejected tokens is correctly verified.
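
The risk of leaving an output buffer unchecked can be sketched without an NPU. Everything below is hypothetical: the function names, the list-based buffers, and the arithmetic are stand-ins and do not reproduce the semantics of the real kernel or of `prepare_inputs_padded_ref`. The sketch only shows that a test comparing one buffer misses a kernel that never writes the other. With `torch.empty` the unwritten buffer would hold uninitialized memory; here a `-1` sentinel plays that role.

```python
def reference(draft_lens, accepted_lens):
    """Hypothetical reference: one value per request for each output."""
    last_index = [a - 1 for a in accepted_lens]
    num_rejected = [d - a for d, a in zip(draft_lens, accepted_lens)]
    return last_index, num_rejected


def good_kernel(draft_lens, accepted_lens, out, num_rejected):
    """Stand-in kernel that fills both preallocated buffers in place."""
    for i, (d, a) in enumerate(zip(draft_lens, accepted_lens)):
        out[i] = a - 1
        num_rejected[i] = d - a


def buggy_kernel(draft_lens, accepted_lens, out, num_rejected):
    """Stand-in kernel with a bug: it never writes num_rejected."""
    for i, a in enumerate(accepted_lens):
        out[i] = a - 1


def mismatched_buffers(kernel, draft_lens, accepted_lens):
    """Run the kernel and list every output buffer that differs from the reference."""
    n = len(draft_lens)
    out = [-1] * n           # sentinel stands in for uninitialized memory
    num_rejected = [-1] * n
    kernel(draft_lens, accepted_lens, out, num_rejected)
    ref_out, ref_rejected = reference(draft_lens, accepted_lens)
    bad = []
    if out != ref_out:
        bad.append("out")
    if num_rejected != ref_rejected:
        bad.append("num_rejected")
    return bad
```

A test that compared only `out` would pass for both kernels; comparing `num_rejected` as well is what exposes the second one.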

Signed-off-by: ZT-AIA <63220130+ZT-AIA@users.noreply.github.com>

Signed-off-by: ZT-AIA <1028681969@qq.com>

github-actions · 2026-05-25T08:23:42Z

This pull request has conflicts, please resolve those before we can evaluate the pull request.

Signed-off-by: ZT-AIA <63220130+ZT-AIA@users.noreply.github.com>

MengqingCao · 2026-05-25T10:35:46Z

    torch.testing.assert_close(y_ref, y_cal, rtol=3e-03, atol=1e-02, equal_nan=True)


+@pytest.mark.skip(reason="Tested separately on a 310P machine.")


Suggested change

-@pytest.mark.skip(reason="Tested separately on a 310P machine.")
+@pytest.mark.skipif(not is_310p_hw(), reason="Tested separately on a 310P machine.")
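
The difference between the two decorators is that an unconditional skip never runs the test anywhere, while a conditional skip still runs it on matching hardware. The sketch below shows the same distinction with the standard library's `unittest.skip` and `unittest.skipIf`, so it runs without pytest. `is_310p_hw_stub` is a hypothetical stand-in for `is_310p_hw()`; the real helper's detection logic is not shown in this PR.

```python
import unittest

REASON = "Tested separately on a 310P machine."


def is_310p_hw_stub(detected_soc):
    """Hypothetical stand-in for is_310p_hw(); takes the SoC name as an argument."""
    return detected_soc == "310P"


class AlwaysSkipped(unittest.TestCase):
    @unittest.skip(REASON)  # analogous to pytest.mark.skip: never runs
    def test_op(self):
        self.assertTrue(True)


def make_conditional_case(detected_soc):
    """Build a test case that is skipped only when the hardware is not 310P."""

    class ConditionallySkipped(unittest.TestCase):
        @unittest.skipIf(not is_310p_hw_stub(detected_soc), REASON)
        def test_op(self):
            self.assertTrue(True)

    return ConditionallySkipped


def count_skipped(case):
    """Run a test case class and return how many of its tests were skipped."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(case)
    result = unittest.TestResult()
    suite.run(result)
    return len(result.skipped)
```

With the conditional form the same test file can be shared across CI runners: the test is skipped on other devices and exercised on a 310P machine.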

Signed-off-by: ZT-AIA <1028681969@qq.com>

MengqingCao

LGTM now, thx!

### What this PR does / why we need it?
Fix the nightly custom ops test cases; this is mainly caused by changes in vllm and inherent defects in the test cases themselves.

- vLLM version: v0.20.2
- vLLM main: vllm-project/vllm@1ac10f1

---------

Signed-off-by: ZT-AIA <1028681969@qq.com>
Signed-off-by: ZT-AIA <63220130+ZT-AIA@users.noreply.github.com>
Signed-off-by: yilunh <hanyilun1@huawei.com>

repair custom ops ci

576459d

Signed-off-by: ZT-AIA <1028681969@qq.com>

ZT-AIA requested review from MengqingCao, realliujiaxu, wangxiyuan, whx-sjtu and zzzzwwjj as code owners May 22, 2026 08:46

github-actions Bot added module:tests module:ops labels May 22, 2026

github-actions Bot added the merge-conflicts label May 22, 2026

gemini-code-assist Bot reviewed May 22, 2026

Merge branch 'main' into 0521

c014904

Signed-off-by: ZT-AIA <63220130+ZT-AIA@users.noreply.github.com>

github-actions Bot removed the merge-conflicts label May 23, 2026

ZT-AIA and others added 4 commits May 23, 2026 19:13

fix

a3371df

Signed-off-by: ZT-AIA <1028681969@qq.com>

fix

23dd3ea

Signed-off-by: ZT-AIA <1028681969@qq.com>

Merge branch 'main' into 0521

f75150d

Merge branch 'main' into 0521

0a38b16

github-actions Bot added the merge-conflicts label May 25, 2026

Merge branch 'main' into 0521

12002a2

Signed-off-by: ZT-AIA <63220130+ZT-AIA@users.noreply.github.com>

github-actions Bot removed the merge-conflicts label May 25, 2026

MengqingCao reviewed May 25, 2026

fix

9abdd6c

Signed-off-by: ZT-AIA <1028681969@qq.com>

MengqingCao approved these changes May 25, 2026

MengqingCao merged commit f650855 into vllm-project:main May 25, 2026
54 checks passed

MengqingCao merged 8 commits into vllm-project:main from ZT-AIA:0521

2 participants
