Due to hardware differences in Ascend devices, running the Checkpoint Engine on Ascend platforms requires specific adaptations.

## Environment

To support features like IPC Buffer and Transfer Engine, the following Ascend software versions are required:

| Software    | Version     |
|-------------|-------------|
| Ascend HDK  | \>=25.3.rc1 |
| cann        | \>=8.3.RC1  | <!-- codespell:ignore -->
| python      | 3.11        |
| torch       | 2.7.1       |
| torch_npu   | 2.7.1       |
| vllm        | 0.11.0      |
| vllm_ascend | 0.11.0rc0   |

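As a quick sanity check before going further, the pinned Python packages from the table can be compared against what is installed. The following is a minimal sketch, not part of the Checkpoint Engine: the helper names `installed_version` and `find_mismatches` are hypothetical, the distribution names passed to `importlib.metadata` are assumed to match the table, and Ascend HDK, CANN, and the Python interpreter itself are not checked.

```python
from importlib import metadata

# Pins copied from the table above. The distribution names are assumptions:
# the names registered with pip may differ (e.g. "torch-npu", "vllm-ascend").
PINNED = {
    "torch": "2.7.1",
    "torch_npu": "2.7.1",
    "vllm": "0.11.0",
    "vllm_ascend": "0.11.0rc0",
}


def installed_version(name):
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def find_mismatches(pinned, lookup=installed_version):
    """Map each package whose version differs from its pin to (expected, found)."""
    mismatches = {}
    for name, expected in pinned.items():
        found = lookup(name)
        # Local build suffixes such as "+cpu" are ignored for the comparison.
        if found is None or found.split("+")[0] != expected:
            mismatches[name] = (expected, found)
    return mismatches


if __name__ == "__main__":
    for name, (expected, found) in find_mismatches(PINNED).items():
        print(f"{name}: expected {expected}, found {found or 'not installed'}")
```

An empty result means every pinned package was found at the expected version.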
## Installation
Install from source:

```shell
pip install -e .
```

The flexible P2P implementation requires the Transfer Engine. On Ascend devices, the Transfer Engine cannot be installed via pip and must be compiled from source.

Reference document: [Ascend Direct Transport documentation](https://github.com/kvcache-ai/Mooncake/blob/main/doc/en/ascend_direct_transport.md)

## Deploy vLLM Service
HCCL uses port 16666 by default, so when running multiple processes on a single device you need to manually assign a port to each process.
Additionally, the underlying HIXL used by the Transfer Engine also defaults to port 16666 during link establishment, and there is currently no interface to change this. Therefore, when deploying the vLLM service, you must manually specify the port for each device via the ranktable file.

**ranktable file example:**
```
{
    "version": "1.0",
    "server_count": "2",
    "server_list": [
        {
            "server_id": "server1",
            "device": [
                {
                    "device_id": "0",
                    "device_ip": "ip1",
                    "device_port": "23333", // Choose an available port other than 16666
                    "rank_id": "0"
                },
                {
                    "device_id": "1",
                    "device_ip": "ip2",
                    "device_port": "23333",
                    "rank_id": "1"
                }...
            ]
        },
        {
            "server_id": "server2",
            "device": [
                {
                    "device_id": "0",
                    "device_ip": "ip8",
                    "device_port": "23333",
                    "rank_id": "8"
                }...
            ]
        }...
    ]
}
```

Set the `RANK_TABLE_FILE` environment variable when starting vLLM.
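Because the ranktable example uses a `//` comment and `...` placeholders, it is not valid JSON as written, so it can be convenient to generate the file. The following is a minimal sketch under stated assumptions: `build_ranktable` and its input shape (server id mapped to a list of device IPs) are hypothetical, `rank_id` values are assumed to be assigned sequentially across servers, and only the fields shown in the example are emitted.

```python
import json

DEFAULT_PORT = 16666  # default used by HCCL and HIXL, per the text above


def build_ranktable(servers, device_port):
    """Build a ranktable dict.

    servers: mapping of server_id -> list of device IPs (hypothetical input shape).
    rank_id values are assigned sequentially across servers.
    """
    if device_port == DEFAULT_PORT:
        raise ValueError(f"device_port must differ from {DEFAULT_PORT}")
    server_list = []
    rank = 0
    for server_id, device_ips in servers.items():
        devices = []
        for device_id, ip in enumerate(device_ips):
            devices.append({
                "device_id": str(device_id),
                "device_ip": ip,
                "device_port": str(device_port),
                "rank_id": str(rank),
            })
            rank += 1
        server_list.append({"server_id": server_id, "device": devices})
    return {
        "version": "1.0",
        "server_count": str(len(server_list)),
        "server_list": server_list,
    }


if __name__ == "__main__":
    table = build_ranktable(
        {"server1": ["ip1", "ip2"], "server2": ["ip8"]},  # placeholder IPs
        device_port=23333,
    )
    print(json.dumps(table, indent=4))
```

The printed JSON can be saved to a file, and `RANK_TABLE_FILE` can then point at that file. Rejecting port 16666 up front surfaces the conflict described above before the service is started.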
1. Set the `ASCEND_RT_VISIBLE_DEVICES` environment variable according to the actual NPUs in use. Failure to do so will cause host quantity validation to fail in P2P mode.
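When the service is launched from a Python script, both environment variables can be prepared in one place. This is a minimal sketch: `build_launch_env`, the `ranktable.json` path, and the device ids are hypothetical, and the launch command itself is deliberately left out. `ASCEND_RT_VISIBLE_DEVICES` is written as a comma-separated list of NPU ids.

```python
import os


def build_launch_env(ranktable_path, device_ids, base_env=None):
    """Return a copy of the environment with both variables set."""
    env = dict(os.environ if base_env is None else base_env)
    env["RANK_TABLE_FILE"] = str(ranktable_path)
    # Comma-separated list of the NPU ids actually in use.
    env["ASCEND_RT_VISIBLE_DEVICES"] = ",".join(str(i) for i in device_ids)
    return env


if __name__ == "__main__":
    env = build_launch_env("ranktable.json", range(8))  # hypothetical path and ids
    print(env["RANK_TABLE_FILE"], env["ASCEND_RT_VISIBLE_DEVICES"])
```

The returned dictionary can be passed as the `env` argument of `subprocess.run` together with whatever command starts vLLM in your setup.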