
Conversation

@SlightwindSec

What does this PR do?

This PR aims to support MXFP8 (Microscaling FP8) rollout on Ascend NPU hardware.
Note: This is a Draft PR for early feedback and collaboration. The core online MXFP8 quantization logic is currently under development (see TODOs in the code).
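
For context, MXFP8 follows the OCP Microscaling convention: elements are stored in FP8 (E4M3) and every 32-element block shares a single power-of-two scale. The snippet below is only a reference-level sketch of that online quantization step, written in plain PyTorch for illustration; it is not the Ascend NPU kernel this PR will eventually add, and the function names are placeholders.

```python
# Reference sketch of MX-style block quantization -- illustration only,
# not the NPU kernel under development in this PR.
import torch

BLOCK = 32          # block size from the MX spec
E4M3_MAX = 448.0    # largest finite float8_e4m3fn value
E4M3_EMAX = 8       # exponent of the largest E4M3 power of two

def mxfp8_quantize(x: torch.Tensor):
    """Quantize along the last dim (assumed a multiple of 32).
    Returns (fp8 payload, per-block fp32 power-of-two scales)."""
    blocks = x.float().reshape(-1, BLOCK)
    amax = blocks.abs().amax(dim=-1, keepdim=True).clamp_min(2.0 ** -126)
    # Shared per-block scale: a power of two placing the block max near the
    # top of the E4M3 range (anything beyond 448 saturates, per the MX spec).
    scale = torch.exp2(torch.floor(torch.log2(amax)) - E4M3_EMAX)
    q = (blocks / scale).clamp(-E4M3_MAX, E4M3_MAX).to(torch.float8_e4m3fn)
    return q.reshape(x.shape), scale.reshape(*x.shape[:-1], -1)

def mxfp8_dequantize(q: torch.Tensor, scale: torch.Tensor) -> torch.Tensor:
    """Inverse of mxfp8_quantize, useful for checking round-trip error."""
    return (q.float().reshape(-1, BLOCK) * scale.reshape(-1, 1)).reshape(q.shape)
```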

Checklist Before Starting

  • Search for similar PRs.
  • Format the PR title as [{modules}] {type}: {description}

Test

Since this is a Draft PR and core quantization is pending, full end-to-end results (training curves) are not yet available.

  • Functional validation on Ascend NPU (Planned)
  • Performance benchmark comparing BF16 and MXFP8 (Planned)

API and Usage Example

  • actor_rollout_ref.rollout.quantization=mxfp8 (an example launch command is shown below)
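
A hypothetical launch command showing where this override would sit; the entrypoint and the other override are just the usual verl PPO example values, not something added by this PR.

```bash
# Hypothetical launch command -- only the quantization key is new in this PR.
python3 -m verl.trainer.main_ppo \
    actor_rollout_ref.rollout.name=vllm \
    actor_rollout_ref.rollout.quantization=mxfp8
```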

Design & Code Changes

The high-level design focuses on integrating MXFP8 scaling logic into the rollout worker.
Specific Changes:

  • Modified verl/workers/rollout to recognize mxfp8 as a valid precision type.
  • Added hardware-specific checks for Ascend NPU in the rollout initialization (see the sketch after this list).
  • [TODO] Implement core online MXFP8 quantization kernels/wrappers.
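
A rough sketch of what the recognition and hardware check described above could look like; the function and attribute names are placeholders, not the actual code in verl/workers/rollout.

```python
# Illustrative sketch only -- names are placeholders, not verl's actual code.
from typing import Optional

SUPPORTED_QUANTIZATION = ("fp8", "mxfp8")

def check_rollout_quantization(quantization: Optional[str]) -> None:
    """Validate the rollout quantization setting at worker init time."""
    if quantization is None:
        return
    if quantization not in SUPPORTED_QUANTIZATION:
        raise ValueError(f"Unsupported quantization type: {quantization!r}")
    if quantization == "mxfp8":
        # In this PR, MXFP8 rollout targets Ascend NPU only.
        try:
            import torch
            import torch_npu  # noqa: F401  -- Ascend PyTorch plugin
            npu_available = torch.npu.is_available()
        except (ImportError, AttributeError):
            npu_available = False
        if not npu_available:
            raise RuntimeError("mxfp8 rollout currently requires Ascend NPU hardware")
```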

[TODO] Checklist Before Submitting

Important

Please check all the following items before requesting a review; otherwise, the reviewer may deprioritize this PR.

@CLAassistant

CLAassistant commented Dec 26, 2025

CLA assistant check
All committers have signed the CLA.

Contributor

@gemini-code-assist bot left a comment

Code Review

This pull request introduces support for MXFP8 quantization on Ascend NPUs. The changes correctly add 'mxfp8' as a supported quantization type and include logic to handle its configuration. My review focuses on ensuring the correctness and maintainability of the new code paths.

I've identified a few critical and high-severity issues:

  • fp8-specific patches appear to be applied on the new mxfp8 implementation path in two different files, which could lead to incorrect behavior.
  • The quantization setup logic is duplicated across two files, which will make future maintenance difficult.
  • A hardcoded value should be replaced with a defined constant to improve maintainability.

My suggestions aim to fix these issues by removing the risky patch calls, using constants, and highlighting the need to refactor duplicated code.
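
To make these suggestions concrete, one possible shape for the shared setup is sketched below; every identifier is hypothetical and not taken from the PR's actual code.

```python
# Hypothetical refactor sketch for the review points above (placeholder names only).

MXFP8_BLOCK_SIZE = 32  # named constant instead of a hardcoded literal

def setup_rollout_quantization(quantization: str) -> None:
    """Single shared helper, so the two call sites stop duplicating this logic."""
    if quantization == "fp8":
        _apply_fp8_patches()            # fp8-only patches stay on the fp8 path
    elif quantization == "mxfp8":
        _setup_mxfp8(MXFP8_BLOCK_SIZE)  # mxfp8 gets its own setup, no fp8 patches
    else:
        raise ValueError(f"Unknown quantization type: {quantization!r}")

def _apply_fp8_patches() -> None:
    ...  # existing fp8 behavior (placeholder)

def _setup_mxfp8(block_size: int) -> None:
    ...  # future MXFP8 kernels/wrappers (placeholder, per the PR's TODOs)
```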
