Commit de348e8

More improvments to launcher's doc
Signed-off-by: Jun Duan <jun.duan.phd@outlook.com>
1 parent a96f882 commit de348e8

2 files changed: +9 -13 lines changed

docs/launcher.md

Lines changed: 5 additions & 10 deletions
--- a/docs/launcher.md
+++ b/docs/launcher.md
@@ -579,15 +579,11 @@ You can set environment variables for each instance, useful for:
 
 ### Launcher Configuration
 
-#### Command-Line Parameters
+#### Command-Line Parameters and Env Vars
 
-```bash
-python launcher.py [OPTIONS]
-```
-
-**Parameters:**
-- `--mock-gpus`: Enable mock GPU mode for CPU-only environments (local dev, CI/CD, Kind clusters). Creates mock GPUs (GPU-0, GPU-1, etc.) and bypasses nvidia-ml-py.
-- `--mock-gpu-count <int>`: Number of mock GPUs to create (default: 8). Only used with `--mock-gpus` when ConfigMap discovery is unavailable.
+**Command-Line Parameters:**
+- `--mock-gpus`: Enable mock GPU mode for CPU-only environments (local dev, CI/CD, Kind clusters). Bypasses nvidia-ml-py. Creates mock GPUs either based on a 'gpu-map' ConfigMap, or by naive enumerating (GPU-0, GPU-1, etc.).
+- `--mock-gpu-count <int>`: Number of mock GPUs to create (default: 8). Only used with `--mock-gpus` but ConfigMap discovery is unavailable, thus falling back to naive enumerating of mock GPUs.
 - `--host <string>`: Bind address (default: `0.0.0.0`)
 - `--port <int>`: API port (default: `8001`)
 - `--log-level <string>`: Logging level - `critical`, `error`, `warning`, `info`, `debug` (default: `info`)
@@ -597,12 +593,11 @@ python launcher.py [OPTIONS]
 - `NAMESPACE`: Kubernetes namespace for ConfigMap lookup. Required when using ConfigMap-based GPU discovery in mock mode.
 
 **Examples:**
-
 ```bash
 # Local development (no GPUs)
 python launcher.py --mock-gpus --mock-gpu-count 2 --log-level debug
 
-# Production (real GPUs, Kubernetes injects NODE_NAME)
+# Production (real GPUs)
 python launcher.py --port 8001 --log-level info
 
 # Using uvicorn directly
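
For reviewers who want to check the documented defaults at a glance, here is a minimal sketch of how the options in the diff above could be parsed. It is not the launcher's actual code: `build_parser` and `resolve_settings` are hypothetical names, and reading `NODE_NAME` from the environment alongside `NAMESPACE` is an assumption based on the comment this commit removes.

```python
import argparse
import os
from typing import Dict, Mapping, Optional, Sequence


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the flags and defaults listed in the docs."""
    parser = argparse.ArgumentParser(prog="launcher.py")
    parser.add_argument("--mock-gpus", action="store_true",
                        help="Enable mock GPU mode for CPU-only environments")
    parser.add_argument("--mock-gpu-count", type=int, default=8,
                        help="Mock GPUs to create when no ConfigMap is available")
    parser.add_argument("--host", default="0.0.0.0", help="Bind address")
    parser.add_argument("--port", type=int, default=8001, help="API port")
    parser.add_argument("--log-level", default="info",
                        choices=["critical", "error", "warning", "info", "debug"])
    return parser


def resolve_settings(argv: Sequence[str],
                     environ: Optional[Mapping[str, str]] = None) -> Dict[str, object]:
    """Merge parsed flags with the environment variables (assumed names)."""
    environ = os.environ if environ is None else environ
    settings = dict(vars(build_parser().parse_args(argv)))
    # NAMESPACE is documented; NODE_NAME is an assumption for this sketch.
    settings["namespace"] = environ.get("NAMESPACE")
    settings["node_name"] = environ.get("NODE_NAME")
    return settings


if __name__ == "__main__":
    print(resolve_settings(["--mock-gpus", "--mock-gpu-count", "2"]))
```

Passing `environ` explicitly keeps the sketch testable without touching the real process environment.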

inference_server/launcher/gputranslator.py

Lines changed: 4 additions & 3 deletions
--- a/inference_server/launcher/gputranslator.py
+++ b/inference_server/launcher/gputranslator.py
@@ -43,9 +43,9 @@ def __init__(
 Args:
 mock_gpus: If True, skip pynvml and use mock mode for testing
 node_name: Kubernetes node name for ConfigMap-based mock GPU discovery.
-Required when mock_gpus=True.
+Required when mock_gpus=True and using ConfigMap-based mock.
 namespace: Kubernetes namespace for ConfigMap-based mock GPU discovery.
-Required when mock_gpus=True.
+Required when mock_gpus=True and using ConfigMap-based mock.
 mock_gpu_count: Number of mock GPUs to create when in mock mode and
 ConfigMap-based mock is not available (default: 8).
 """
@@ -136,7 +136,8 @@ def _populate_mapping(self):
 """
 Creates mapping and reverse_mapping for the GPU Translator.
 Priority order:
-1. ConfigMap 'gpu-map' based mock if mock mode enabled and node_name available
+1. ConfigMap 'gpu-map' based mock if mock mode enabled and
+both node_name and namespace are available
 2. Naive mock with GPU-0, GPU-1, etc. if mock mode is enabled
 3. Real GPUs via pynvml
 """
