[Fusion] [Graph] Add Matmul Allreduce Rmsnorm fusion Pass #5034
base: main
Conversation
Signed-off-by: wxsIcey <[email protected]>
Code Review
This pull request introduces a new fusion pass for Matmul -> AllReduce -> RMSNorm to optimize performance on Ascend hardware. The changes include a new configuration flag, the fusion pass implementation, and its integration into the compilation process. My review has identified a few issues: there are some leftover debugging print statements that should be removed. More critically, the new fusion pass contains a bug where tensor parallel rank and world size are hardcoded to 0, which will cause failures in distributed setups. There are also some logging statements with inappropriately high severity levels that could flood production logs. I've provided suggestions to fix these issues.
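For orientation, here is a minimal sketch of the unfused sequence this pass is meant to collapse into a single fused kernel. It is illustrative only, assuming the common residual-add-then-RMSNorm ordering; the function and its return convention are my reconstruction, not the PR's code:

```python
import torch
import torch.distributed as dist

def unfused_reference(x, weight, residual, rms_norm_weight, eps=1e-6):
    """Matmul -> AllReduce -> residual add -> RMSNorm, step by step."""
    out = torch.matmul(x, weight)       # per-rank partial matmul result
    dist.all_reduce(out)                # sum partials across the TP group
    hidden = out + residual             # residual add
    variance = hidden.pow(2).mean(-1, keepdim=True)
    normed = hidden * torch.rsqrt(variance + eps) * rms_norm_weight
    return normed, hidden               # analogous to the fused op's (out0, out1)
```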
```python
out0, out1 = torch.ops._C_ascend.matmul_allreduce_add_rmsnorm(x, weight, residual, rms_norm_weight,
                                                              self.tp_group_name, 0, 0, self.eps, True, True)
```
The `tpRankSize` and `tpRankId` arguments for `torch.ops._C_ascend.matmul_allreduce_add_rmsnorm` are hardcoded to 0. This is a critical bug that will cause incorrect behavior in distributed environments. Please use the tensor parallel world size and the correct rank ID.
While `self.local_rank` is correctly initialized, the world size is missing. You can get it using `get_tp_group().world_size`. For better performance, consider caching this value in the `__init__` method.
```diff
-out0, out1 = torch.ops._C_ascend.matmul_allreduce_add_rmsnorm(x, weight, residual, rms_norm_weight,
-                                                              self.tp_group_name, 0, 0, self.eps, True, True)
+out0, out1 = torch.ops._C_ascend.matmul_allreduce_add_rmsnorm(x, weight, residual, rms_norm_weight,
+                                                              self.tp_group_name, get_tp_group().world_size, self.local_rank, self.eps, True, True)
```
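As a complement to the suggestion, a minimal sketch of caching the TP topology once in `__init__`, assuming vLLM's `get_tp_group()` helper; the class name and the `unique_name`/`rank_in_group` attribute names are assumptions for illustration:

```python
import torch
from vllm.distributed import get_tp_group

class MatmulAllReduceAddRMSNormPattern:
    """Hypothetical pattern class; only the cached-topology idea matters."""

    def __init__(self, eps: float):
        tp_group = get_tp_group()
        self.eps = eps
        self.tp_group_name = tp_group.unique_name   # assumed attribute name
        self.tp_world_size = tp_group.world_size    # cached once, not per call
        self.local_rank = tp_group.rank_in_group    # assumed attribute name

    def replacement(self, x, weight, residual, rms_norm_weight):
        return torch.ops._C_ascend.matmul_allreduce_add_rmsnorm(
            x, weight, residual, rms_norm_weight,
            self.tp_group_name, self.tp_world_size, self.local_rank,
            self.eps, True, True)
```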
```python
print("=========torch compile graph=========")
print(graph.graph)
```
These appear to be the leftover debugging print statements mentioned in the summary and should be removed.
```python
logging.info("=========before fusion graph========")
logging.info(graph.graph)
self.begin()
self.matched_count = self.pattern_match_passes.apply(graph)
logging.info("=========after fusion graph========")
logging.info(graph.graph)
logging.warning("Replaced %s patterns", self.matched_count)
```
The logging levels used here are too high. Printing entire graphs with `logging.info` can be excessively verbose for production environments and should be changed to `logging.debug`. Additionally, `logging.warning` should be reserved for potential problems, not for reporting a successful operation like pattern replacement, which should be logged at the info or debug level.
```diff
-logging.info("=========before fusion graph========")
-logging.info(graph.graph)
+logging.debug("=========before fusion graph========")
+logging.debug(graph.graph)
 self.begin()
 self.matched_count = self.pattern_match_passes.apply(graph)
-logging.info("=========after fusion graph========")
-logging.info(graph.graph)
-logging.warning("Replaced %s patterns", self.matched_count)
+logging.debug("=========after fusion graph========")
+logging.debug(graph.graph)
+logging.info("Replaced %s patterns", self.matched_count)
```
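If the graph dumps are kept at debug level, an optional further refinement (my suggestion, not part of the PR) is to guard them with the standard-library `Logger.isEnabledFor`, so a large FX graph is never stringified when DEBUG is off:

```python
import logging

logger = logging.getLogger(__name__)

def log_fx_graph(title: str, graph_module) -> None:
    # Formatting str(graph_module.graph) is costly for large graphs; the guard
    # skips the logging calls entirely unless DEBUG is actually enabled.
    if logger.isEnabledFor(logging.DEBUG):
        logger.debug("=========%s=========", title)
        logger.debug("%s", graph_module.graph)
```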
👋 Hi! Thank you for contributing to the vLLM Ascend project. The following points will speed up your PR merge:
If CI fails, you can run linting and testing checks locally according to Contributing and Testing.
Signed-off-by: wxsIcey <[email protected]>