
Conversation

weiguihua2 (Collaborator) commented Dec 15, 2025

What this PR does / why we need it?

PD disaggregation now supports cross-machine deployment.
We send the primary and secondary node information of node P to node D. When node D pulls the KV data, it retrieves the corresponding primary or secondary node information from the mapping.
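The lookup described above can be sketched as follows. This is an illustrative-only sketch, not the actual vllm-ascend code: the mapping layout, the key format (port as string), and the helper name are all assumptions.

```python
# Hypothetical sketch of the idea above: the prefill (P) instance sends a
# mapping of its nodes' server ports to host info, and the decode (D) side
# resolves the right primary/secondary node before pulling KV data.

# Example mapping as it might travel inside kv_transfer_params (illustrative):
remote_multi_nodes_meta_mapping = {
    "7300": {"host": "10.0.0.1", "role": "primary"},    # primary node of P
    "7301": {"host": "10.0.0.2", "role": "secondary"},  # secondary node of P
}

def get_remote_host_by_port(port: int, mapping: dict) -> str:
    """Resolve which P-side host owns the given port (hypothetical helper)."""
    entry = mapping.get(str(port))
    if entry is None:
        raise KeyError(f"no node info registered for port {port}")
    return entry["host"]

print(get_remote_host_by_port(7301, remote_multi_nodes_meta_mapping))  # 10.0.0.2
```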

Does this PR introduce any user-facing change?

How was this patch tested?

@github-actions

👋 Hi! Thank you for contributing to the vLLM Ascend project. The following points will speed up your PR merge:

  • A PR should do only one thing, smaller PRs enable faster reviews.
  • Every PR should include unit tests and end-to-end tests to ensure it works and is not broken by future PRs.
  • Write a clear commit message and fill out the PR description to help reviewers and future developers understand the change.

If CI fails, you can run the linting and testing checks locally according to Contributing and Testing.


@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request introduces multi-node KV cache transfer by implementing a handshake mechanism. Key changes:

  • KVConnectorHandshakeMetadata and local_ip were added to MooncakeAgentMetadata, and remote_multi_nodes_meta_mapping to ReqMeta, to facilitate cross-node metadata exchange.
  • New methods get_handshake_metadata and set_xfer_handshake_metadata were added to MooncakeConnector and MooncakeConnectorScheduler respectively, along with a multi_nodes_meta_mapping attribute in the scheduler.
  • The MooncakeConnectorWorker now generates and stores its own handshake metadata, including its local IP, and uses a new helper method _get_remote_host_info_by_port to resolve remote host information from the exchanged metadata.
  • The WorkerV1 class was updated to expose this KV connector handshake metadata.

A review comment identified an issue where the default value for remote_multi_nodes_meta_mapping was incorrectly set to the integer 1 instead of an empty dictionary {}, which would lead to an AttributeError.
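For orientation, the shape of the metadata structures the review describes might look roughly like this. Field names and types here are assumptions reconstructed from the review summary, not the real vllm-ascend definitions.

```python
# Illustrative-only sketch of the handshake metadata structures mentioned
# in the review; the actual classes in vllm-ascend may differ.
from dataclasses import dataclass, field

@dataclass
class MooncakeAgentMetadata:
    engine_id: str
    local_ip: str  # newly added so remote peers can reach this node
    kv_caches_base_addr: list = field(default_factory=list)

@dataclass
class ReqMeta:
    remote_host: str
    remote_port: int
    # port -> node info for every node of a (possibly multi-node) P instance;
    # an empty dict is the safe default, per the review comment below
    remote_multi_nodes_meta_mapping: dict = field(default_factory=dict)
```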

remote_port=kv_transfer_params["remote_port"],
remote_pcp_size=kv_transfer_params.get("remote_pcp_size", 1),
remote_dcp_size=kv_transfer_params.get("remote_dcp_size", 1),
remote_multi_nodes_meta_mapping=kv_transfer_params.get("remote_multi_nodes_meta_mapping", 1),

critical

The default value for remote_multi_nodes_meta_mapping is set to 1, which is incorrect for a parameter that is expected to be a dictionary. If the remote_multi_nodes_meta_mapping key is not present in kv_transfer_params, this will cause an AttributeError: 'int' object has no attribute 'get' in _get_remote_host_info_by_port when it tries to access the mapping. The default value should be an empty dictionary {}.

Suggested change
remote_multi_nodes_meta_mapping=kv_transfer_params.get("remote_multi_nodes_meta_mapping", 1),
remote_multi_nodes_meta_mapping=kv_transfer_params.get("remote_multi_nodes_meta_mapping", {}),
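A minimal reproduction of why the integer default breaks: when the key is absent, any downstream code that treats the value as a dict (as _get_remote_host_info_by_port does) raises AttributeError.

```python
# Reproduces the bug flagged above: kv_transfer_params without the key.
kv_transfer_params = {}

# With the buggy default of 1, the value is an int, not a dict.
bad = kv_transfer_params.get("remote_multi_nodes_meta_mapping", 1)
try:
    bad.get("7300")  # what the host-lookup helper would effectively do
except AttributeError as e:
    print(type(e).__name__)  # AttributeError: 'int' object has no attribute 'get'

# With the suggested default of {}, missing entries are handled gracefully.
good = kv_transfer_params.get("remote_multi_nodes_meta_mapping", {})
print(good.get("7300"))  # None
```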

lidenghui1110 (Contributor) commented Dec 15, 2025

Could you please explain in more detail why you need this remote_multi_nodes_meta_mapping?

As I understand it, when a prefill node spans multiple machines, each MooncakeConnectorScheduler will add its own remote_host and remote_engine_id to kv_transfer_params, no matter whether it is a master or slave node, which is set here.

weiguihua2 (Collaborator, Author) commented

> Could you please explain in more detail why you need this remote_multi_nodes_meta_mapping?
>
> As I understand it, when a prefill node spans multiple machines, each MooncakeConnectorScheduler will add its own remote_host and remote_engine_id to kv_transfer_params, no matter whether it is a master or slave node, which is set here.

In DP cross-machine scenarios there are multiple instances, but in TP cross-machine scenarios (such as Ray cross-machine or MP cross-machine) there is only one instance. In that case, the information about the master and slave nodes needs to be sent to the D node.

lidenghui1110 (Contributor) commented

> In DP cross-machine scenarios there are multiple instances, but in TP cross-machine scenarios (such as Ray cross-machine or MP cross-machine) there is only one instance. In that case, the information about the master and slave nodes needs to be sent to the D node.

I got it. But does the problem only exist in the Ray cross-machine scenario with a single DP rank, i.e. pure TP? When using MP cross-machine, each node will have a DPEnginecore with a MooncakeConnectorScheduler; it follows kv_transfer_params, so each node can set its own remote_host.

Could you please confirm whether I am right or wrong?

weiguihua2 (Collaborator, Author) commented

> I got it. But does the problem only exist in the Ray cross-machine scenario with a single DP rank, i.e. pure TP? When using MP cross-machine, each node will have a DPEnginecore with a MooncakeConnectorScheduler; it follows kv_transfer_params, so each node can set its own remote_host.
>
> Could you please confirm whether I am right or wrong?
The community merged a new feature a few weeks ago: cross-machine TP that is not based on Ray. In that mode there is only one DPEnginecore with a MooncakeConnectorScheduler across multiple nodes.
vllm-project/vllm#23691

weiguihua2 (Collaborator, Author) commented

> The community merged a new feature a few weeks ago: cross-machine TP that is not based on Ray. In that mode there is only one DPEnginecore with a MooncakeConnectorScheduler across multiple nodes.
> vllm-project/vllm#23691

For DP cross-machine, each node will have a DPEnginecore with a MooncakeConnectorScheduler that follows kv_transfer_params, so each node can set its own remote_host. The current code is compatible with this scenario.

lidenghui1110 (Contributor) commented

> The community merged a new feature a few weeks ago: cross-machine TP that is not based on Ray. In that mode there is only one DPEnginecore with a MooncakeConnectorScheduler across multiple nodes.
> vllm-project/vllm#23691
>
> For DP cross-machine, each node will have a DPEnginecore with a MooncakeConnectorScheduler that follows kv_transfer_params, so each node can set its own remote_host. The current code is compatible with this scenario.

Got it. Thanks for your explanation.

weiguihua2 added the pd-test (enable pd test for PR), ready-for-test (start test by label for PR), and ready (read for review) labels, and removed the pd-test label, on Dec 15, 2025