Merged
8 changes: 3 additions & 5 deletions docs/user_guide/install.md
@@ -82,12 +82,10 @@ For Linux+GPU devices, Parallax provides a docker environment for quick setup. C

Run a docker container as below. Please note that generally the argument ```--gpus all``` is necessary for the docker to run on GPUs.
```sh
- # For Blackwell
- docker run -it --gpus all --network host gradientservice/parallax:latest-blackwell bash
- # For Ampere/Hopper
- docker run -it --gpus all --network host gradientservice/parallax:latest-hopper bash
+ # For Blackwell/Ampere/Hopper
+ docker run -it --gpus all --network host gradientservice/parallax:latest bash
  # For DGX Spark
- docker run -it --gpus all --network host gradientservice/parallax:spark-spark bash
+ docker run -it --gpus all --network host gradientservice/parallax:latest-spark bash
```
The container starts under parallax workspace and you should be able to run parallax directly.

4 changes: 3 additions & 1 deletion src/parallax/launch.py
@@ -136,7 +136,9 @@
)
args.start_layer = gradient_server.block_start_index
args.end_layer = gradient_server.block_end_index
- args.model_path = gradient_server.model_name
+ # Only read model_name from scheduler if model_path is not set, so we can use local path as model_path
+ if args.model_path is None:
+     args.model_path = gradient_server.model_name
args.tp_size = gradient_server.tp_size

logger.debug(
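
The fallback introduced in this hunk can be illustrated in isolation. The following is a minimal sketch, not the code in `launch.py`: the helper `resolve_model_path`, the `--model-path` flag spelling, and the `scheduler_info` stand-in for `gradient_server` are all hypothetical names chosen for the example. It only assumes, as the diff suggests, that an unset `model_path` is `None`.

```python
import argparse
from types import SimpleNamespace


def resolve_model_path(args, scheduler_info):
    """Keep a user-supplied model_path; otherwise fall back to the scheduler's model_name."""
    if args.model_path is None:
        args.model_path = scheduler_info.model_name
    return args.model_path


# Hypothetical CLI: the flag name and default are assumptions for this sketch.
parser = argparse.ArgumentParser()
parser.add_argument("--model-path", default=None)

# Hypothetical stand-in for the object returned by the scheduler.
scheduler_info = SimpleNamespace(model_name="org/model-from-scheduler")

local = parser.parse_args(["--model-path", "/data/models/local-checkpoint"])
remote = parser.parse_args([])

print(resolve_model_path(local, scheduler_info))   # /data/models/local-checkpoint
print(resolve_model_path(remote, scheduler_info))  # org/model-from-scheduler
```

Before this change the scheduler's `model_name` was assigned unconditionally, so a local path passed on the command line would have been overwritten.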
2 changes: 1 addition & 1 deletion src/scheduling/request_routing.py
@@ -430,7 +430,7 @@ def find_optimal_path(self, nodes: List[Node], num_layers: int) -> Tuple[List[st
prev = node
self._rr_cursor += 1
attempts += 1
- if viable:
+ if viable and total_latency != float("inf"):
      return candidate_ids, total_latency
# Attempt a one-shot repair if the selected pipeline is not viable
repaired = self._attempt_repair_pipeline(candidate_ids, nodes, num_layers)
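
The effect of the extra guard can be sketched outside the router. This is an illustrative stand-alone function, not the body of `find_optimal_path`: `accept_candidate` and its `hop_latencies` parameter are hypothetical, and the sketch assumes that an unreachable hop is represented by a latency of `float("inf")`, which is what the new condition implies.

```python
from typing import List, Optional, Tuple


def accept_candidate(
    candidate_ids: List[str],
    hop_latencies: List[float],
    viable: bool,
) -> Optional[Tuple[List[str], float]]:
    """Return the pipeline only if it is viable AND its total latency is not infinite."""
    total_latency = sum(hop_latencies)  # any inf hop makes the sum inf
    if viable and total_latency != float("inf"):
        return candidate_ids, total_latency
    return None  # caller would fall through to a repair attempt


print(accept_candidate(["a", "b"], [1.0, 2.5], viable=True))            # (['a', 'b'], 3.5)
print(accept_candidate(["a", "b"], [1.0, float("inf")], viable=True))   # None
print(accept_candidate(["a", "b"], [1.0, 2.5], viable=False))           # None
```

Note that a comparison against `float("inf")` does not reject NaN; if NaN latencies are possible, `math.isfinite(total_latency)` would be a stricter check. Whether that case can occur in the router is not visible from this hunk.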