[CI]repair custom ops ci#9465
Conversation
Signed-off-by: ZT-AIA <1028681969@qq.com>
|
👋 Hi! Thank you for contributing to the vLLM Ascend project. The following points will speed up your PR merge:
If CI fails, you can run linting and testing checks locally according Contributing and Testing. |
|
This pull request has conflicts, please resolve those before we can evaluate the pull request. |
Summary of ChangesHello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request addresses stability issues in the nightly custom operations CI pipeline. It updates various test configurations to ensure proper environment initialization and aligns test logic with recent changes in the vllm codebase. Additionally, it optimizes test coverage for specific operations to improve execution efficiency. Highlights
New Features🧠 You can now enable Memory (public preview) to help Gemini Code Assist learn from your team's feedback. This makes future code reviews more consistent and personalized to your project's style. Click here to enable Memory in your admin console. Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize the Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here. Footnotes
|
There was a problem hiding this comment.
Code Review
Suggested PR Title:
[Ops][Misc] Refactor Triton operation tests and simplify logprob computationSuggested PR Summary:
### What this PR does / why we need it?
This pull request refactors and optimizes several NPU-specific operation tests and simplifies the logprob computation logic. It introduces necessary initialization calls (`init_device_properties_triton`, `enable_custom_op`) across various test files and streamlines the `compute_topk_logprobs` signature by removing unused parameters. Additionally, it adjusts test parametrizations for performance and updates rejection sampling kernels.
Feedback from the review highlights a redundant class redefinition in `test_fused_moe.py` that shadows an import. It also points out that the newly added `num_rejected_tokens` buffer in `test_prepare_inputs_padded.py` is currently unverified, suggesting an assertion against the reference implementation is needed.
### Does this PR introduce _any_ user-facing change?
No.
### How was this patch tested?
Tested via the updated nightly E2E test suite for single-node operations.| class SiluAndMul: | ||
| """SwiGLU activation function: silu(x[:d]) * x[d:] where d = x.shape[-1] // 2""" | ||
| def __call__(self, x: torch.Tensor) -> torch.Tensor: | ||
| d = x.shape[-1] // 2 | ||
| return F.silu(x[..., :d]) * x[..., d:] |
There was a problem hiding this comment.
The local redefinition of SiluAndMul shadows the import from vllm.model_executor.layers.activation at line 30. This is redundant and can lead to confusion for maintainers. If the local implementation is required due to changes in vLLM, the unused import should be removed to maintain code clarity. Otherwise, consider using the imported class directly.
| # Run Triton kernel | ||
| out_tri = torch.empty(num_reqs, dtype=torch.int32, device=device) | ||
|
|
||
| num_rejected_tokens = torch.empty(num_reqs, dtype=torch.int32, device=device) |
There was a problem hiding this comment.
The num_rejected_tokens buffer is initialized and passed to the kernel, but its output is never verified. Since the reference implementation prepare_inputs_padded_ref already calculates this value (line 24), the test should be updated to assert that the kernel's output matches the reference. This ensures the kernel's logic for calculating rejected tokens is correctly verified.
Signed-off-by: ZT-AIA <63220130+ZT-AIA@users.noreply.github.com>
|
This pull request has conflicts, please resolve those before we can evaluate the pull request. |
Signed-off-by: ZT-AIA <63220130+ZT-AIA@users.noreply.github.com>
| torch.testing.assert_close(y_ref, y_cal, rtol=3e-03, atol=1e-02, equal_nan=True) | ||
|
|
||
|
|
||
| @pytest.mark.skip(reason="Tested separately on a 310P machine.") |
There was a problem hiding this comment.
| @pytest.mark.skip(reason="Tested separately on a 310P machine.") | |
| @pytest.mark.skipif(not is_310p_hw(), reason="Tested separately on a 310P machine.") |
### What this PR does / why we need it? Fix the nightly custom ops test cases; this is mainly caused by changes in vllm and inherent defects in the test cases themselves. - vLLM version: v0.20.2 - vLLM main: vllm-project/vllm@1ac10f1 --------- Signed-off-by: ZT-AIA <1028681969@qq.com> Signed-off-by: ZT-AIA <63220130+ZT-AIA@users.noreply.github.com> Signed-off-by: yilunh <hanyilun1@huawei.com>
### What this PR does / why we need it? Fix the nightly custom ops test cases; this is mainly caused by changes in vllm and inherent defects in the test cases themselves. - vLLM version: v0.20.2 - vLLM main: vllm-project/vllm@1ac10f1 --------- Signed-off-by: ZT-AIA <1028681969@qq.com> Signed-off-by: ZT-AIA <63220130+ZT-AIA@users.noreply.github.com>
### What this PR does / why we need it? Fix the nightly custom ops test cases; this is mainly caused by changes in vllm and inherent defects in the test cases themselves. - vLLM version: v0.20.2 - vLLM main: vllm-project/vllm@1ac10f1 --------- Signed-off-by: ZT-AIA <1028681969@qq.com> Signed-off-by: ZT-AIA <63220130+ZT-AIA@users.noreply.github.com>
What this PR does / why we need it?
Fix the nightly custom ops test cases; this is mainly caused by changes in vllm and inherent defects in the test cases themselves.
Does this PR introduce any user-facing change?
no
How was this patch tested?
nightly custom op