[feat]pd disaggregated support cross-machine #5008
base: main
Conversation
👋 Hi! Thank you for contributing to the vLLM Ascend project. The following points will speed up your PR merge:
If CI fails, you can run the linting and testing checks locally according to Contributing and Testing.
Code Review
This pull request introduces multi-node KV cache transfer capabilities by implementing a handshake mechanism. Key changes include adding KVConnectorHandshakeMetadata and local_ip to MooncakeAgentMetadata, and remote_multi_nodes_meta_mapping to ReqMeta to facilitate cross-node metadata exchange. New methods get_handshake_metadata and set_xfer_handshake_metadata were added to MooncakeConnector and MooncakeConnectorScheduler respectively, along with a multi_nodes_meta_mapping attribute in the scheduler. The MooncakeConnectorWorker now generates and stores its own handshake metadata, including its local IP, and uses a new helper method _get_remote_host_info_by_port to resolve remote host information based on the exchanged metadata. The WorkerV1 class was updated to expose this KV connector handshake metadata. A review comment identified an issue where the default value for remote_multi_nodes_meta_mapping was incorrectly set to an integer 1 instead of an empty dictionary {}, which would lead to an AttributeError.
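For orientation, here is a minimal sketch of the metadata shapes described above. The class and field names come from this summary, but the types, extra fields, and mapping layout are assumptions, not the PR's actual definitions.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple


@dataclass
class MooncakeAgentMetadata:
    # Fields named in the summary; everything else here is assumed.
    engine_id: str
    local_ip: str  # new: identifies which machine of the prefill instance owns the blocks


@dataclass
class ReqMeta:
    remote_host: str
    remote_port: int
    # new: per-node handshake info forwarded to the decode side; the exact
    # key/value layout (e.g. port -> (host, port)) is a guess.
    remote_multi_nodes_meta_mapping: Dict[str, Tuple[str, int]] = field(default_factory=dict)
```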
remote_port=kv_transfer_params["remote_port"],
remote_pcp_size=kv_transfer_params.get("remote_pcp_size", 1),
remote_dcp_size=kv_transfer_params.get("remote_dcp_size", 1),
remote_multi_nodes_meta_mapping=kv_transfer_params.get("remote_multi_nodes_meta_mapping", 1),
The default value for remote_multi_nodes_meta_mapping is set to 1, which is incorrect for a parameter that is expected to be a dictionary. If the remote_multi_nodes_meta_mapping key is not present in kv_transfer_params, this will cause an AttributeError: 'int' object has no attribute 'get' in _get_remote_host_info_by_port when it tries to access the mapping. The default value should be an empty dictionary {}.
Suggested change:
- remote_multi_nodes_meta_mapping=kv_transfer_params.get("remote_multi_nodes_meta_mapping", 1),
+ remote_multi_nodes_meta_mapping=kv_transfer_params.get("remote_multi_nodes_meta_mapping", {}),
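A minimal reproduction of the failure mode the comment describes, assuming the mapping is later used as a dict (the port value is hypothetical):

```python
kv_transfer_params = {}  # "remote_multi_nodes_meta_mapping" key absent

# With the int default, the later dict-style lookup raises:
mapping = kv_transfer_params.get("remote_multi_nodes_meta_mapping", 1)
# mapping.get("20002")  # AttributeError: 'int' object has no attribute 'get'

# With an empty-dict default, the lookup simply misses and can fall back cleanly:
mapping = kv_transfer_params.get("remote_multi_nodes_meta_mapping", {})
print(mapping.get("20002"))  # None
```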
Could you please explain further why you need this? As far as I know, when the prefill node spans multiple machines, each MooncakeConnectorScheduler will add its own remote_host and remote_engine_id to kv_transfer_params, regardless of whether it is the master or slave node, which is set here.
For DP cross-machine scenarios there are multiple instances, but for TP cross-machine scenarios (such as Ray cross-machine or MP cross-machine) there is only one instance. In that case, the information of both the master and the slave nodes needs to be sent to the D node, as illustrated below.
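To make that concrete, a purely illustrative kv_transfer_params payload for the single-instance TP cross-machine case might look like the following; every host, port, and the mapping layout here is hypothetical, not taken from the PR.

```python
# Hypothetical example: one P instance spanning two machines (TP cross-machine),
# so its single MooncakeConnectorScheduler must forward both node addresses to D.
kv_transfer_params = {
    "remote_engine_id": "prefill-0",
    "remote_host": "10.0.0.1",  # master node of the P instance
    "remote_port": 20002,
    "remote_multi_nodes_meta_mapping": {
        "20002": ("10.0.0.1", 20002),  # master node
        "20003": ("10.0.0.2", 20003),  # slave node on the second machine
    },
}
```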
I got it. But does the problem only exist in the Ray cross-machine scenario with a single DP rank, i.e. pure TP? When using MP cross-machine, each node will have a DPEnginecore with a MooncakeConnectorScheduler, it will follow kv_transfer_params, and each node can set its own remote_host. Could you please confirm whether I am right or wrong?
For DP cross-machine, each node will have a DPEnginecore with a MooncakeConnectorScheduler, it will follow kv_transfer_params, and each node can set its own remote_host. The current code is compatible with this scenario.
Got it. Thanks for your explanation.
Signed-off-by: weiguihua2 <[email protected]>
What this PR does / why we need it?
PD disaggregation now supports cross-machine deployment.
We send the primary and secondary node information of the P node to the D node. When the D node pulls the KV cache data, it retrieves the corresponding primary or secondary node information from the mapping.
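A rough sketch of that D-side lookup is shown below; the real _get_remote_host_info_by_port added by this PR may use a different signature and mapping layout.

```python
from typing import Dict, Tuple


def _get_remote_host_info_by_port(
    mapping: Dict[str, Tuple[str, int]], remote_port: int, default_host: str
) -> Tuple[str, int]:
    """Resolve which P node (primary or secondary) to pull KV blocks from."""
    entry = mapping.get(str(remote_port))
    if entry is None:
        # No per-node entry: fall back to the host advertised in kv_transfer_params.
        return default_host, remote_port
    return entry
```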
Does this PR introduce any user-facing change?
How was this patch tested?