Commit 8ee2355

misc: update README with environment variable instructions and vLLM version specified (#61)
1 parent 44d5670 commit 8ee2355

File tree: 1 file changed, +8 −5 lines

README.md: 8 additions & 5 deletions

@@ -75,17 +75,15 @@ Use the flexible P2P implementation, notice this will install `mooncake-transfer
 pip install 'checkpoint-engine[p2p]'
 ```
 
-If set `NCCL_IB_HCA` env, checkpoint-engine will use it to auto select net devices for different ranks. Available patterns can be found from [NCCL documentation](https://docs.nvidia.com/deeplearning/nccl/user-guide/docs/env.html#id8). If not set, it will read all RDMA devices and try to divide them into each rank.
-
 ## Getting Started
 
-Prepare an H800 or H20 machine with 8 GPUs with latest vLLM. Be sure to include [/collective_rpc API endpoint](https://github.com/vllm-project/vllm/commit/f7cf5b512ee41f36613deb2471a44de5f304f70d) commit (available in main branch) since checkpoint-engine will use this endpoint to update weights.
+Prepare an H800 or H20 machine with 8 GPUs with vLLM. Be sure to include the [/collective_rpc API endpoint](https://github.com/vllm-project/vllm/commit/f7cf5b512ee41f36613deb2471a44de5f304f70d) commit (available in the main branch), since checkpoint-engine uses this endpoint to update weights. vLLM version `v0.10.2` is fully tested and recommended.
 
 ```Bash
-cd /opt && git clone https://github.com/vllm-project/vllm && cd vllm
+mkdir -p /opt/vLLM && cd /opt/vLLM
 uv venv --python 3.12 --seed
 source .venv/bin/activate
-VLLM_USE_PRECOMPILED=1 uv pip install --editable .
+uv pip install vllm==0.10.2
 ```
 
 Install checkpoint-engine
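The hunk above replaces a source build of vLLM with a pin to the tested release `v0.10.2`. As a minimal illustrative sketch (not part of checkpoint-engine or vLLM), a startup check against that pin could look like the following; `parse_version` and `is_recommended` are hypothetical helper names introduced here:

```python
# Hypothetical sketch: warn if the installed vLLM differs from the
# release the README says is fully tested (`v0.10.2`).
def parse_version(v: str) -> tuple:
    # "v0.10.2" -> (0, 10, 2); pre-release suffixes are not handled here.
    return tuple(int(part) for part in v.lstrip("v").split("."))

RECOMMENDED = parse_version("v0.10.2")

def is_recommended(installed: str) -> bool:
    # Exact match against the pinned, fully tested release.
    return parse_version(installed) == RECOMMENDED
```

In practice the installed version string could come from `importlib.metadata.version("vllm")`; the comparison logic is the same.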
@@ -156,6 +154,11 @@ Other unit tests can also be done with pytest. Only test_update.py requires GPUs
 pytest tests/ -m "not gpu"
 ```
 
+### Environment Variables
+- `PS_MAX_BUCKET_SIZE_GB`: An integer setting the maximum bucket size (in GB) for checkpoint-engine. Defaults to 8 GB if not set.
+- `PS_P2P_STORE_RDMA_DEVICES`: Comma-separated RDMA device names for P2P transfer. If not set, checkpoint-engine falls back to `NCCL_IB_HCA` to detect RDMA devices.
+- `NCCL_IB_HCA`: Available patterns can be found in the [NCCL documentation](https://docs.nvidia.com/deeplearning/nccl/user-guide/docs/env.html#id8). If this is also not set, all RDMA devices will be used and divided evenly among the ranks.
+
 ## SGLang Integration
 
 Checkpoint Engine provides efficient distributed checkpoint loading for SGLang inference servers, significantly reducing model loading time for large models and multi-node setups.
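The environment variables added in this commit describe a fallback chain for RDMA device selection: an explicit `PS_P2P_STORE_RDMA_DEVICES` list wins, then `NCCL_IB_HCA`, and otherwise all detected devices are divided evenly among the ranks. A minimal Python sketch of that precedence, assuming hypothetical helper names (`select_rdma_devices`, `max_bucket_size_bytes`) that are not part of the actual checkpoint-engine API:

```python
# Illustrative sketch of the documented env-var precedence; names and the
# even-division strategy are assumptions based on the README text.
def select_rdma_devices(env: dict, all_devices: list, rank: int, world_size: int) -> list:
    explicit = env.get("PS_P2P_STORE_RDMA_DEVICES")
    if explicit:
        # Highest priority: an explicit comma-separated device list.
        return explicit.split(",")
    hca = env.get("NCCL_IB_HCA")
    if hca:
        # Real NCCL_IB_HCA values support richer patterns (prefixes, ports);
        # a plain comma-separated list is shown here for simplicity.
        return hca.split(",")
    # Fallback: divide all detected devices evenly among the ranks.
    per_rank = max(1, len(all_devices) // world_size)
    start = rank * per_rank
    return all_devices[start:start + per_rank]

def max_bucket_size_bytes(env: dict) -> int:
    # PS_MAX_BUCKET_SIZE_GB is an integer number of GB; defaults to 8.
    gb = int(env.get("PS_MAX_BUCKET_SIZE_GB", "8"))
    return gb * (1 << 30)
```

A real implementation would read `os.environ` and enumerate devices from the RDMA stack; the dict parameter here just keeps the precedence logic easy to test in isolation.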
