
Commit bdab3a7 (parent 44d5670)

feat: update README.md with environment variable instructions and vllm == 0.10.2 installation

File tree: 1 file changed (+7, -7 lines)


README.md: 7 additions & 7 deletions
````diff
@@ -75,17 +75,12 @@ Use the flexible P2P implementation, notice this will install `mooncake-transfer
 pip install 'checkpoint-engine[p2p]'
 ```

-If set `NCCL_IB_HCA` env, checkpoint-engine will use it to auto select net devices for different ranks. Available patterns can be found from [NCCL documentation](https://docs.nvidia.com/deeplearning/nccl/user-guide/docs/env.html#id8). If not set, it will read all RDMA devices and try to divide them into each rank.
-
 ## Getting Started

-Prepare an H800 or H20 machine with 8 GPUs with latest vLLM. Be sure to include [/collective_rpc API endpoint](https://github.com/vllm-project/vllm/commit/f7cf5b512ee41f36613deb2471a44de5f304f70d) commit (available in main branch) since checkpoint-engine will use this endpoint to update weights.
+Prepare an H800 or H20 machine with 8 GPUs with vLLM. Be sure to include the [/collective_rpc API endpoint](https://github.com/vllm-project/vllm/commit/f7cf5b512ee41f36613deb2471a44de5f304f70d) commit (available in the main branch), since checkpoint-engine uses this endpoint to update weights. vLLM version `v0.10.2` is fully tested and recommended.

 ```Bash
-cd /opt && git clone https://github.com/vllm-project/vllm && cd vllm
-uv venv --python 3.12 --seed
-source .venv/bin/activate
-VLLM_USE_PRECOMPILED=1 uv pip install --editable .
+uv pip install vllm==0.10.2
 ```

 Install checkpoint-engine
````
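The `/collective_rpc` endpoint referenced above is what checkpoint-engine calls to push updated weights into a running vLLM server. As an illustrative sketch only, a client request to that endpoint might be built as follows; the host/port, the `"method"`/`"args"` payload keys, and the `update_weights` method name are assumptions for illustration, not the documented checkpoint-engine protocol:

```python
import json
import urllib.request

# Hypothetical payload shape; the real RPC spec may differ.
payload = {"method": "update_weights", "args": []}

req = urllib.request.Request(
    "http://localhost:8000/collective_rpc",  # assumed local vLLM server address
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
# urllib.request.urlopen(req)  # uncomment only against a running vLLM server

print(req.get_method(), req.full_url)
```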
````diff
@@ -156,6 +151,11 @@ Other unit tests can also be done with pytest. Only test_update.py requires GPUs
 pytest tests/ -m "not gpu"
 ```

+### Environment Variables
+- `PS_MAX_BUCKET_SIZE_GB`: An integer setting the maximum bucket size (in GB) for checkpoint-engine. Defaults to `8` if not set.
+- `PS_P2P_STORE_RDMA_DEVICES`: Comma-separated RDMA device names for P2P transfer. If not set, checkpoint-engine falls back to `NCCL_IB_HCA` to detect RDMA devices.
+- `NCCL_IB_HCA`: Available patterns can be found in the [NCCL documentation](https://docs.nvidia.com/deeplearning/nccl/user-guide/docs/env.html#id8). If neither is set, all RDMA devices are used and divided evenly among the ranks.
+
 ## SGLang Integration

 Checkpoint Engine provides efficient distributed checkpoint loading for SGLang inference servers, significantly reducing model loading time for large models and multi-node setups.
````
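The environment variables added in this commit can be sketched as a small resolution routine. This is a hypothetical illustration of the precedence the README describes (explicit device list, then `NCCL_IB_HCA`, with an 8 GB bucket default); the actual checkpoint-engine implementation may parse these differently:

```python
import os

# Default of 8 GB comes from the README; the parsing logic is an assumption.
max_bucket_gb = int(os.environ.get("PS_MAX_BUCKET_SIZE_GB", "8"))

# PS_P2P_STORE_RDMA_DEVICES takes precedence: comma-separated device names.
raw = os.environ.get("PS_P2P_STORE_RDMA_DEVICES", "")
rdma_devices = [d for d in raw.split(",") if d]

if not rdma_devices:
    # Fall back to NCCL_IB_HCA (pattern syntax is described in the NCCL docs);
    # if that is also empty, all RDMA devices would be auto-detected instead.
    nccl = os.environ.get("NCCL_IB_HCA", "")
    rdma_devices = [d for d in nccl.split(",") if d]

print(max_bucket_gb, rdma_devices)
```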
