Commit f7a1016

Add Mooncake store support for vLLM backend
Extends mooncake_master orchestration (previously SGLang-only) to the vLLM backend so vLLM workers can use the in-process MooncakeStore connector for cross-process KV sharing.

What's new
----------

* `VLLMMooncakeKVStoreConfig` schema (`backends/vllm.py`): mirrors the SGLang `mooncake_kv_store:` block, plus a vLLM-specific `store_config:` section. srtslurm renders that section into the `MOONCAKE_CONFIG_PATH` JSON file that vLLM's `MooncakeStoreConnector` reads on startup (`MooncakeStoreConfig.load_from_env()`); the `env:` map is injected on every vLLM worker for the in-process `MC_*` knobs (e.g. `MC_ENABLE_DEST_DEVICE_AFFINITY`, `MC_STORE_CLIENT_METRIC`, `MC_TE_METRIC`).
* `start_mooncake_master` in `cli/do_sweep.py` now fires for vLLM as well as SGLang, writes the per-job store JSON next to the other log artifacts, and stamps the master RPC + HTTP-metadata endpoints on each worker. Shared launch constants moved to a new `backends/mooncake.py` (the SGLang module re-exports them so downstream imports keep working).
* Mooncake admin HTTP server (`/metrics`, `/health`, `/role`, `/ha_status`, `/leader`, `/query_key`) is now wired up explicitly: the master srun passes `--enable_metric_reporting=true --metrics_port=9003` (upstream default in master.cpp), and `start_mooncake_master` waits on the metrics port like it already waits on RPC + HTTP-metadata. The flag toggles a periodic stdout log thread only; `MasterAdminServer` listens unconditionally. The new constant's comment documents that to prevent future reordering.
* Docs (`docs/mooncake-kv-store.md`): adds a vLLM quick start, a vLLM-specific configuration reference (including `store_config`), expands the ownership table for the auto-stamped vars, and documents the new master metrics endpoint.

Validation
----------

* Manual: launched a 1P/1D disagg job; master came up with `enable_metric_reporting=1, metrics_port=9003`; vLLM workers picked up `MOONCAKE_CONFIG_PATH` and registered RDMA segments. `curl :9003/metrics` returned a full Prometheus dump.
* Tests: `make check` (ruff + pytest). New cases in `tests/test_e2e.py` cover the vLLM mooncake env injection and JSON rendering; `tests/test_dry_run.py` cases cover the new config surface in `srtctl dry-run` output.

Signed-off-by: inf-yasong <yasong.wang@inferact.ai>
1 parent 9cd8fe8 commit f7a1016

9 files changed

Lines changed: 686 additions & 46 deletions

docs/mooncake-kv-store.md

Lines changed: 79 additions & 8 deletions
@@ -1,13 +1,15 @@
 # Mooncake KV Store
 
-First-class support for [Mooncake](https://github.com/kvcache-ai/Mooncake) as the KV transfer backend for SGLang prefill-decode disaggregation. When `mooncake_kv_store` is set under an SGLang backend, srtslurm launches and configures the mooncake master automatically and wires up worker env vars so peer-to-peer transfers work across multiple nodes.
+First-class support for [Mooncake](https://github.com/kvcache-ai/Mooncake) as the KV transfer backend for prefill-decode disaggregation. When `mooncake_kv_store` is set under an SGLang or vLLM backend, srtslurm launches and configures the mooncake master automatically and wires up worker env vars so peer-to-peer transfers work across multiple nodes.
 
 ## Table of Contents
 
 - [Overview](#overview)
-- [Quick Start](#quick-start)
+- [Quick Start (SGLang)](#quick-start-sglang)
+- [Quick Start (vLLM)](#quick-start-vllm)
 - [What srtslurm Owns vs What You Set](#what-srtslurm-owns-vs-what-you-set)
 - [Configuration Reference](#configuration-reference)
+- [Master Metrics Endpoint](#master-metrics-endpoint)
 - [Validation](#validation)
 - [Common Configurations](#common-configurations)
 - [RDMA / InfiniBand](#rdma--infiniband)
@@ -30,7 +32,7 @@ Without first-class support, running mooncake with srtslurm meant:
 
 The `mooncake_kv_store` block automates 1–3. You still set the SGLang flags in step 4 because they're SGLang's CLI surface, not srtslurm's — but srtslurm validates that you did.
 
-## Quick Start
+## Quick Start (SGLang)
 
 Minimum config to run mooncake:
 
@@ -63,19 +65,53 @@ backend:
       disaggregation-transfer-backend: mooncake
 ```
 
+## Quick Start (vLLM)
+
+vLLM's `MooncakeStoreConnector` reads its configuration from a JSON file pointed to by `MOONCAKE_CONFIG_PATH` rather than directly from env vars, so the vLLM block takes an extra `store_config:` section that srtslurm renders into that JSON at job start:
+
+```yaml
+backend:
+  type: vllm
+  mooncake_kv_store:
+    env: # injected on every vLLM worker
+      MOONCAKE_PROTOCOL: rdma
+      MC_ENABLE_DEST_DEVICE_AFFINITY: "1"
+    store_config: # → MOONCAKE_CONFIG_PATH JSON
+      metadata_server: "P2PHANDSHAKE"
+      global_segment_size: "50GB"
+      local_buffer_size: "4GB"
+      protocol: "rdma"
+      device_name: "mlx5_0,mlx5_1"
+  vllm_config:
+    prefill:
+      kv-transfer-config: '{"kv_connector":"MooncakeStoreConnector","kv_role":"kv_producer"}'
+    decode:
+      kv-transfer-config: '{"kv_connector":"MooncakeStoreConnector","kv_role":"kv_consumer"}'
+```
+
+srtslurm stamps `MOONCAKE_MASTER`, `MOONCAKE_TE_META_DATA_SERVER`, `MOONCAKE_LOCAL_HOSTNAME`, and `MOONCAKE_CONFIG_PATH` on every worker; you supply the rest. `master_server_address` in `store_config` is also auto-filled from the infra node IP and ignored if set by hand.
+
+The `env:` map is injected on every vLLM worker (not on the standalone `mooncake_master` daemon — the master srun passes no env). Use it for in-process Mooncake C++ knobs like `MC_ENABLE_DEST_DEVICE_AFFINITY`, `MC_STORE_CLIENT_METRIC`, `MC_TE_METRIC`.
+
 ## What srtslurm Owns vs What You Set
 
 | Concern | Owner | Notes |
 | ----------------------------------------------- | --------- | ---------------------------------------------------------------------------------------------------- |
-| Launching `mooncake_master` | srtslurm | Runs on the infra node (same node as etcd/nats; respects `infra.etcd_nats_dedicated_node`). Port 50051. |
+| Launching `mooncake_master` | srtslurm | Runs on the infra node (same node as etcd/nats; respects `infra.etcd_nats_dedicated_node`). RPC `50051`, HTTP metadata `8080`, admin HTTP `9003`. |
 | `MOONCAKE_MASTER` env var on workers | srtslurm | Always computed as `<infra_node_ip>:50051`. User values in `env` are overridden. |
+| `MOONCAKE_TE_META_DATA_SERVER` env var | srtslurm | Always computed as `http://<infra_node_ip>:8080/metadata`. |
 | `MOONCAKE_LOCAL_HOSTNAME` env var | srtslurm | Auto-resolved per-worker via `runtime.network_interface`. User can override in `env` for custom NICs. |
+| `MOONCAKE_CONFIG_PATH` (vLLM only) | srtslurm | Always points to the JSON file srtslurm renders from `store_config:`. Mounted under `/logs` in every worker. |
+| `master_server_address` in `store_config` (vLLM)| srtslurm | Always overridden with `<infra_node_ip>:50051`. User values are ignored. |
 | `MOONCAKE_PROTOCOL`, `MOONCAKE_DEVICE`, etc. | User | Passed through `mooncake_kv_store.env` to all workers. |
-| `disaggregation-transfer-backend: mooncake` | User | Set on `sglang_config.prefill` and `sglang_config.decode`. srtslurm validates this is present. |
-| `disaggregation-ib-device` | User | Set on `sglang_config.prefill` and `sglang_config.decode`. Format: `"mlx5_0,mlx5_1"` or JSON map. |
+| `disaggregation-transfer-backend: mooncake` | User | (SGLang only) Set on `sglang_config.prefill` and `sglang_config.decode`. srtslurm validates this is present. |
+| `disaggregation-ib-device` | User | (SGLang only) Set on `sglang_config.prefill` and `sglang_config.decode`. Format: `"mlx5_0,mlx5_1"` or JSON map. |
+| `kv-transfer-config` | User | (vLLM only) Set on `vllm_config.prefill` and `vllm_config.decode` to wire vLLM's `MooncakeStoreConnector`. |
 
 ## Configuration Reference
 
+### SGLang
+
 ```yaml
 backend:
   type: sglang
@@ -93,10 +129,45 @@ backend:
       SGLANG_DISAGG_STAGING_POOL_SIZE_MB: "4096"
 ```
 
+### vLLM
+
+```yaml
+backend:
+  type: vllm
+  mooncake_kv_store:
+    container: ... # optional, default: job container
+    env: # optional, injected on every vLLM worker
+      MOONCAKE_PROTOCOL: rdma
+      MC_ENABLE_DEST_DEVICE_AFFINITY: "1"
+      MC_STORE_CLIENT_METRIC: "1" # default 1 (enabled)
+      MC_TE_METRIC: "0" # default 0 (disabled)
+    store_config: # optional, rendered into MOONCAKE_CONFIG_PATH JSON
+      metadata_server: "P2PHANDSHAKE" # default "P2PHANDSHAKE"
+      global_segment_size: "4GB" # default "4GB"
+      local_buffer_size: "4GB" # default "4GB"
+      protocol: "rdma" # default "rdma"
+      device_name: "mlx5_0,mlx5_1" # default ""
+```
+
 ### Fields
 
-- **`container`** (`str`, optional): Container image used for the `mooncake_master` srun. Defaults to the job container if unset. Useful when mooncake needs a different runtime than your SGLang container.
-- **`env`** (`dict[str, str]`, optional): Pass-through env vars injected on every prefill and decode worker. Keys map directly to mooncake's environment variable names — see the [SGLang server_args.py](https://github.com/sgl-project/sglang/blob/main/python/sglang/srt/environ.py) and [mooncake_store.py](https://github.com/sgl-project/sglang/blob/main/python/sglang/srt/mem_cache/storage/mooncake_store/mooncake_store.py) for the full list. Setting `MOONCAKE_MASTER` here is a no-op (srtslurm always wins).
+- **`container`** (`str`, optional): Container image used for the `mooncake_master` srun. Defaults to the job container if unset. Useful when mooncake needs a different runtime than your worker container.
+- **`env`** (`dict[str, str]`, optional): Pass-through env vars injected on every prefill and decode worker.
+  - For **SGLang**, keys map directly to mooncake's environment variable names — see the [SGLang server_args.py](https://github.com/sgl-project/sglang/blob/main/python/sglang/srt/environ.py) and [mooncake_store.py](https://github.com/sgl-project/sglang/blob/main/python/sglang/srt/mem_cache/storage/mooncake_store/mooncake_store.py) for the full list.
+  - For **vLLM**, this is for in-process Mooncake C++ knobs (`MC_*`) read by the transfer engine / store client. vLLM's connector itself reads configuration from `MOONCAKE_CONFIG_PATH` (the JSON rendered from `store_config:`), not from these env vars.
+  - Setting `MOONCAKE_MASTER`, `MOONCAKE_TE_META_DATA_SERVER`, or `MOONCAKE_CONFIG_PATH` here is a no-op (srtslurm always wins).
+- **`store_config`** (vLLM only, `dict[str, str]`, optional): Rendered as JSON into the file pointed to by `MOONCAKE_CONFIG_PATH`. Keys map 1:1 to vLLM's `MooncakeStoreConfig` dataclass. `master_server_address` is auto-filled and any user value is ignored.
+
+## Master Metrics Endpoint
+
+The `mooncake_master` admin HTTP server is always exposed on port `9003` on the infra node and starts before workers do (srtslurm waits for it). It serves:
+
+- `GET /metrics` — Prometheus text format (master + transfer-engine counters)
+- `GET /metrics/summary` — human-readable summary
+- `GET /health`, `/role`, `/ha_status`, `/leader`
+- `GET /query_key` — used by Dynamo's KV router shared-cache path
+
+To scrape from outside the cluster, point your collector at `http://<infra_node_ip>:9003/metrics`. The infra node IP is logged at job start.
 
 ## Validation
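
Reviewer note: the new "Master Metrics Endpoint" section says the admin server returns Prometheus text on port `9003`. The sketch below shows one way a script might fetch and read that output. The helper names and the sample metric names are hypothetical (they are not taken from Mooncake), and the parser assumes samples carry no trailing timestamp.

```python
from urllib.request import urlopen

METRICS_PORT = 9003  # admin HTTP port documented in this commit


def parse_prometheus_text(text: str) -> dict[str, float]:
    """Map each sample (name plus any label set) to its value.

    Simplification: assumes "<name>{labels} <value>" with no timestamp.
    """
    samples: dict[str, float] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):  # skip HELP/TYPE comments
            continue
        name, _, value = line.rpartition(" ")
        samples[name] = float(value)
    return samples


def fetch_master_metrics(infra_node_ip: str, timeout: float = 5.0) -> dict[str, float]:
    """Hypothetical helper: GET /metrics from the master and parse it."""
    url = f"http://{infra_node_ip}:{METRICS_PORT}/metrics"
    with urlopen(url, timeout=timeout) as resp:
        return parse_prometheus_text(resp.read().decode("utf-8"))


# Made-up payload for illustration only.
sample = (
    "# HELP hypothetical_total An example counter\n"
    "# TYPE hypothetical_total counter\n"
    "hypothetical_total 3\n"
    'hypothetical_bytes{node="a b"} 1.5e3\n'
)
parsed = parse_prometheus_text(sample)
```

For production scraping a real Prometheus collector is the better fit; this is only meant for quick checks from a login node.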

src/srtctl/backends/mooncake.py

Lines changed: 23 additions & 0 deletions
@@ -0,0 +1,23 @@
+# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
+# SPDX-License-Identifier: Apache-2.0
+
+"""Shared mooncake_master constants used by both SGLang and vLLM backends.
+
+Kept in a dedicated module (rather than re-exported from one backend) so
+neither backend has to import from the other just to reach the port numbers.
+"""
+
+# RPC port the master listens on. Workers reach it via MOONCAKE_MASTER.
+MOONCAKE_MASTER_PORT = 50051
+
+# Port for the master's embedded HTTP metadata server (enabled with
+# --enable_http_metadata_server=true). Workers point MOONCAKE_TE_META_DATA_SERVER
+# at /metadata on this port so no separate metadata service is required.
+MOONCAKE_HTTP_METADATA_PORT = 8080
+
+# Port for the master's admin HTTP server. Matches the upstream default in
+# mooncake-store/src/master.cpp (--metrics_port=9003). Always listens once the
+# master is up — --enable_metric_reporting only toggles a periodic stdout log
+# thread, not the HTTP endpoints. Exposes /metrics (Prometheus text),
+# /metrics/summary, /health, /role, /ha_status, /leader, /query_key.
+MOONCAKE_METRICS_PORT = 9003
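
Reviewer note: the commit message says `start_mooncake_master` now waits on the metrics port in the same way it waits on the RPC and HTTP-metadata ports. That code is not part of the diff shown here, so the following is only a sketch of what such a readiness check could look like. `port_is_open` and `wait_for_ports` are hypothetical names, not functions from srtctl.

```python
import socket
import time

# Values copied from backends/mooncake.py in this commit.
MOONCAKE_MASTER_PORT = 50051
MOONCAKE_HTTP_METADATA_PORT = 8080
MOONCAKE_METRICS_PORT = 9003


def port_is_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def wait_for_ports(host: str, ports: list[int], deadline_s: float = 60.0,
                   poll_s: float = 0.5) -> list[int]:
    """Poll until every port accepts connections or the deadline passes.

    Returns the ports that were still closed (empty list means all are up).
    """
    pending = list(ports)
    end = time.monotonic() + deadline_s
    while pending and time.monotonic() < end:
        pending = [p for p in pending if not port_is_open(host, p)]
        if pending:
            time.sleep(poll_s)
    return pending
```

A caller would pass all three constants, for example `wait_for_ports(infra_ip, [MOONCAKE_MASTER_PORT, MOONCAKE_HTTP_METADATA_PORT, MOONCAKE_METRICS_PORT])`, and fail the job if the returned list is non-empty.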

src/srtctl/backends/sglang.py

Lines changed: 4 additions & 3 deletions
@@ -22,6 +22,10 @@
 from marshmallow import Schema
 from marshmallow_dataclass import dataclass
 
+# Re-exported so existing `from srtctl.backends.sglang import MOONCAKE_*` paths keep working.
+# Canonical home is srtctl.backends.mooncake.
+from srtctl.backends.mooncake import MOONCAKE_HTTP_METADATA_PORT, MOONCAKE_MASTER_PORT  # noqa: F401
+
 if TYPE_CHECKING:
     from srtctl.backends.base import SrunConfig
     from srtctl.core.runtime import RuntimeContext
@@ -30,9 +34,6 @@
 # Type alias for worker modes
 WorkerMode = Literal["prefill", "decode", "agg"]
 
-MOONCAKE_MASTER_PORT = 50051
-MOONCAKE_HTTP_METADATA_PORT = 8080
-
 
 @dataclass(frozen=True)
 class MooncakeKVStoreConfig:

src/srtctl/backends/vllm.py

Lines changed: 123 additions & 5 deletions
@@ -24,14 +24,78 @@
 from marshmallow import Schema
 from marshmallow_dataclass import dataclass
 
+# vLLM reuses the same mooncake_master launch command and port pair as SGLang.
+# Disjoint port pairs can be reintroduced if we ever need to colocate an SGLang
+# and vLLM job on the same infra node.
+from srtctl.backends.mooncake import MOONCAKE_HTTP_METADATA_PORT, MOONCAKE_MASTER_PORT
+
 if TYPE_CHECKING:
     from srtctl.backends.base import SrunConfig
     from srtctl.core.runtime import RuntimeContext
-    from srtctl.core.topology import Endpoint, NodePortAllocator, Process
+    from srtctl.core.topology import Endpoint, Process
 
 # Type alias for worker modes
 WorkerMode = Literal["prefill", "decode", "agg"]
 
+# Filename for the mooncake-store JSON config srtslurm writes to log_dir at job
+# start. log_dir is mounted into every worker at /logs, so workers read the JSON
+# from MOONCAKE_STORE_CONFIG_CONTAINER_PATH.
+MOONCAKE_STORE_CONFIG_FILENAME = "mooncake_store_config.json"
+MOONCAKE_STORE_CONFIG_CONTAINER_PATH = f"/logs/{MOONCAKE_STORE_CONFIG_FILENAME}"
+
+
+@dataclass(frozen=True)
+class VLLMMooncakeKVStoreConfig:
+    """Mooncake KV store config for the vLLM backend.
+
+    When present, srtslurm launches ``mooncake_master`` on the infra node
+    (co-located with etcd/nats) using the shared SGLang launch command and
+    injects on every vLLM worker::
+
+        MOONCAKE_MASTER = <infra_ip>:50051
+        MOONCAKE_TE_META_DATA_SERVER = http://<infra_ip>:8080/metadata
+        MOONCAKE_LOCAL_HOSTNAME = <worker_ip>
+        MOONCAKE_CONFIG_PATH = /logs/mooncake_store_config.json
+
+    The JSON file referenced by ``MOONCAKE_CONFIG_PATH`` is generated by
+    srtslurm at job start (see ``do_sweep.start_mooncake_master``) from
+    ``store_config`` below. vLLM's ``MooncakeStoreConnector`` reads this
+    file via ``MooncakeStoreConfig.load_from_env()``.
+
+    ``env:`` is injected on every vLLM worker (alongside the auto-stamped
+    ``MOONCAKE_*`` vars above), not on the standalone ``mooncake_master``
+    daemon — the master srun command passes no env. Use this for
+    in-process Mooncake C++ libraries linked into the worker:
+    ``MC_*`` knobs read by the transfer engine / store client
+    (e.g. ``MC_ENABLE_DEST_DEVICE_AFFINITY``, ``MC_STORE_CLIENT_METRIC``,
+    ``MC_TE_METRIC``), and any ``MOONCAKE_*`` overrides the connector
+    consults.
+
+    Example YAML::
+
+        backend:
+          type: vllm
+          mooncake_kv_store:
+            container: inferactinc/public:mk-int-20260507 # optional
+            env: # injected on every worker
+              MOONCAKE_PROTOCOL: rdma
+              MOONCAKE_GLOBAL_SEGMENT_SIZE: "4gb"
+              MOONCAKE_DEVICE: mlx5_0
+            store_config: # MooncakeStoreConfig JSON keys
+              metadata_server: "P2PHANDSHAKE"
+              global_segment_size: "100GB"
+              local_buffer_size: "4GB"
+              protocol: "rdma"
+              device_name: ""
+              # master_server_address: srtslurm auto-fills from infra IP
+    """
+
+    container: str | None = None
+    env: dict[str, str] = field(default_factory=dict)
+    store_config: dict[str, str] | None = None
+
+    Schema: ClassVar[builtins.type[Schema]] = Schema
+
 
 @dataclass(frozen=True)
 class VLLMServerConfig:
@@ -91,6 +155,11 @@ class VLLMProtocol:
     # dynamo 1.0.0+: translated to --kv-transfer-config (--connector was removed).
     connector: str | None = "nixl"
 
+    # Mooncake KV store — when set, srtslurm launches mooncake_master on the
+    # infra node and auto-injects MOONCAKE_MASTER / MOONCAKE_TE_META_DATA_SERVER
+    # / MOONCAKE_LOCAL_HOSTNAME on every vLLM worker.
+    mooncake_kv_store: VLLMMooncakeKVStoreConfig | None = None
+
     Schema: ClassVar[builtins.type[Schema]] = Schema
 
     # =========================================================================
@@ -132,14 +201,65 @@ def get_process_environment(self, process: Process) -> dict[str, str]:
         vLLM with dynamo requires unique ports for each worker:
         - DYN_VLLM_KV_EVENT_PORT: ZMQ port for KV events publishing
         - VLLM_NIXL_SIDE_CHANNEL_PORT: Port for NIXL side channel transfers
+        - VLLM_NIXL_SIDE_CHANNEL_HOST: Routable IP for NIXL side channel (not 0.0.0.0/localhost)
         """
+        from srtctl.core.slurm import get_hostname_ip
+
         env: dict[str, str] = {}
         if process.kv_events_port is not None:
             env["DYN_VLLM_KV_EVENT_PORT"] = str(process.kv_events_port)
         if process.nixl_port is not None:
             env["VLLM_NIXL_SIDE_CHANNEL_PORT"] = str(process.nixl_port)
+            env["VLLM_NIXL_SIDE_CHANNEL_HOST"] = get_hostname_ip(process.node)
         return env
 
+    def get_mooncake_worker_env(self, infra_node_ip: str, local_hostname: str) -> dict[str, str]:
+        """Get mooncake env vars to inject on a specific vLLM worker.
+
+        Returns ``{}`` when ``mooncake_kv_store`` is unset. Otherwise:
+
+        - ``MOONCAKE_MASTER`` and ``MOONCAKE_TE_META_DATA_SERVER`` are always
+          stamped by srtslurm (the user can't know the infra IP at config time).
+        - ``MOONCAKE_LOCAL_HOSTNAME`` defaults to the worker's resolved IP for
+          multi-node peer transfers, but a value in ``mooncake_kv_store.env``
+          wins (use this to pin to a specific RDMA NIC IP).
+        - ``MOONCAKE_CONFIG_PATH`` points to the JSON file srtslurm writes at
+          job start (mounted into the container at ``/logs``). vLLM's
+          ``MooncakeStoreConnector`` requires this — it does not read the
+          ``MOONCAKE_*`` env vars directly.
+        """
+        if self.mooncake_kv_store is None:
+            return {}
+        return {
+            "MOONCAKE_LOCAL_HOSTNAME": local_hostname,
+            **self.mooncake_kv_store.env,
+            "MOONCAKE_MASTER": f"{infra_node_ip}:{MOONCAKE_MASTER_PORT}",
+            "MOONCAKE_TE_META_DATA_SERVER": (f"http://{infra_node_ip}:{MOONCAKE_HTTP_METADATA_PORT}/metadata"),
+            "MOONCAKE_CONFIG_PATH": MOONCAKE_STORE_CONFIG_CONTAINER_PATH,
+        }
+
+    def build_mooncake_store_config(self, infra_node_ip: str) -> dict[str, str]:
+        """Build the JSON payload for vLLM's ``MooncakeStoreConfig.load_from_env()``.
+
+        Keys map 1:1 to vLLM's ``MooncakeStoreConfig`` dataclass. Values come
+        from ``mooncake_kv_store.store_config`` when set; missing keys fall back
+        to defaults. ``master_server_address`` is always auto-filled from the
+        infra node IP (any user-provided value is overridden — the user can't
+        know the infra IP at config time).
+        """
+        user_cfg: dict[str, str] = {}
+        if self.mooncake_kv_store is not None and self.mooncake_kv_store.store_config:
+            user_cfg = dict(self.mooncake_kv_store.store_config)
+
+        return {
+            "metadata_server": user_cfg.get("metadata_server", "P2PHANDSHAKE"),
+            "master_server_address": f"{infra_node_ip}:{MOONCAKE_MASTER_PORT}",
+            "global_segment_size": user_cfg.get("global_segment_size", "4GB"),
+            "local_buffer_size": user_cfg.get("local_buffer_size", "4GB"),
+            "protocol": user_cfg.get("protocol", "rdma"),
+            "device_name": user_cfg.get("device_name", ""),
+        }
+
     def get_served_model_name(self, default: str) -> str:
         """Get served model name from vLLM config, or return default."""
         if self.vllm_config:
@@ -193,7 +313,6 @@ def endpoints_to_processes(
         self,
         endpoints: list[Endpoint],
         base_sys_port: int = 8081,
-        port_allocator: NodePortAllocator | None = None,
     ) -> list[Process]:
         """Convert endpoints to processes.
 
@@ -207,13 +326,12 @@ def endpoints_to_processes(
 
         if not has_dp_mode:
             # Standard TP mode: one process per node
-            return endpoints_to_processes(endpoints, base_sys_port=base_sys_port, port_allocator=port_allocator)
+            return endpoints_to_processes(endpoints, base_sys_port=base_sys_port)
 
         # DP+EP mode: one process per GPU
         processes: list[Process] = []
         current_sys_port = base_sys_port
-        if port_allocator is None:
-            port_allocator = NodePortAllocator()
+        port_allocator = NodePortAllocator()
 
         for endpoint in endpoints:
             if not self._is_dp_mode(endpoint.mode):
