Commit 002a1f8

More improvements to launcher's doc
Signed-off-by: Jun Duan <jun.duan.phd@outlook.com>
1 parent 4e0165c commit 002a1f8

2 files changed: 9 additions, 13 deletions

docs/launcher.md

Lines changed: 5 additions & 10 deletions
````diff
@@ -577,15 +577,11 @@ You can set environment variables for each instance, useful for:
 
 ### Launcher Configuration
 
-#### Command-Line Parameters
+#### Command-Line Parameters and Env Vars
 
-```bash
-python launcher.py [OPTIONS]
-```
-
-**Parameters:**
-- `--mock-gpus`: Enable mock GPU mode for CPU-only environments (local dev, CI/CD, Kind clusters). Creates mock GPUs (GPU-0, GPU-1, etc.) and bypasses nvidia-ml-py.
-- `--mock-gpu-count <int>`: Number of mock GPUs to create (default: 8). Only used with `--mock-gpus` when ConfigMap discovery is unavailable.
+**Command-Line Parameters:**
+- `--mock-gpus`: Enable mock GPU mode for CPU-only environments (local dev, CI/CD, Kind clusters). Bypasses nvidia-ml-py. Creates mock GPUs either based on a 'gpu-map' ConfigMap, or by naive enumerating (GPU-0, GPU-1, etc.).
+- `--mock-gpu-count <int>`: Number of mock GPUs to create (default: 8). Only used with `--mock-gpus` but ConfigMap discovery is unavailable, thus falling back to naive enumerating of mock GPUs.
 - `--host <string>`: Bind address (default: `0.0.0.0`)
 - `--port <int>`: API port (default: `8001`)
 - `--log-level <string>`: Logging level - `critical`, `error`, `warning`, `info`, `debug` (default: `info`)
````
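
To make the documented flags and defaults concrete, here is a minimal `argparse` sketch. It is not the launcher's actual parser: `build_parser` is a hypothetical name, and only the flag names, defaults, and log-level choices are taken from the documentation in this hunk.

```python
import argparse

# Log levels as listed in the documentation above.
LOG_LEVELS = ["critical", "error", "warning", "info", "debug"]


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the documented flags (not the real launcher code)."""
    parser = argparse.ArgumentParser(prog="launcher.py")
    # Mock mode is off unless the flag is present.
    parser.add_argument("--mock-gpus", action="store_true",
                        help="Enable mock GPU mode for CPU-only environments")
    # Per the docs, only meaningful together with --mock-gpus when the
    # ConfigMap-based discovery is unavailable.
    parser.add_argument("--mock-gpu-count", type=int, default=8,
                        help="Number of mock GPUs to create")
    parser.add_argument("--host", type=str, default="0.0.0.0", help="Bind address")
    parser.add_argument("--port", type=int, default=8001, help="API port")
    parser.add_argument("--log-level", choices=LOG_LEVELS, default="info",
                        help="Logging level")
    return parser


if __name__ == "__main__":
    # Same flags as the "Local development" example in the docs.
    args = build_parser().parse_args(
        ["--mock-gpus", "--mock-gpu-count", "2", "--log-level", "debug"]
    )
    print(args)
```

Using `choices` for `--log-level` is a design assumption of this sketch; it rejects unknown levels at parse time rather than at server start.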
````diff
@@ -595,12 +591,11 @@ python launcher.py [OPTIONS]
 - `NAMESPACE`: Kubernetes namespace for ConfigMap lookup. Required when using ConfigMap-based GPU discovery in mock mode.
 
 **Examples:**
-
 ```bash
 # Local development (no GPUs)
 python launcher.py --mock-gpus --mock-gpu-count 2 --log-level debug
 
-# Production (real GPUs, Kubernetes injects NODE_NAME)
+# Production (real GPUs, Kubernetes injects NODE_NAME and NAMESPACE via Downward API for ConfigMap-based GPU discovery)
 python launcher.py --port 8001 --log-level info
 
 # Using uvicorn directly
````

inference_server/launcher/gputranslator.py

Lines changed: 4 additions & 3 deletions
```diff
@@ -43,9 +43,9 @@ def __init__(
 Args:
 mock_gpus: If True, skip pynvml and use mock mode for testing
 node_name: Kubernetes node name for ConfigMap-based mock GPU discovery.
-Required when mock_gpus=True.
+Required when mock_gpus=True and using ConfigMap-based mock.
 namespace: Kubernetes namespace for ConfigMap-based mock GPU discovery.
-Required when mock_gpus=True.
+Required when mock_gpus=True and using ConfigMap-based mock.
 mock_gpu_count: Number of mock GPUs to create when in mock mode and
 ConfigMap-based mock is not available (default: 8).
 """
```
```diff
@@ -136,7 +136,8 @@ def _populate_mapping(self):
 """
 Creates mapping and reverse_mapping for the GPU Translator.
 Priority order:
-1. ConfigMap 'gpu-map' based mock if mock mode enabled and node_name available
+1. ConfigMap 'gpu-map' based mock if mock mode enabled and
+both node_name and namespace are available
 2. Naive mock with GPU-0, GPU-1, etc. if mock mode is enabled
 3. Real GPUs via pynvml
 """
```

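The priority order in the updated `_populate_mapping` docstring can be sketched as a small selection function. This is a sketch under assumptions, not the code in `gputranslator.py`: `choose_gpu_source`, `naive_mock_ids`, and the returned labels are hypothetical names. The sketch only checks whether `node_name` and `namespace` are present; it does not model a ConfigMap lookup that fails at runtime.

```python
from typing import List, Optional


def choose_gpu_source(mock_gpus: bool,
                      node_name: Optional[str],
                      namespace: Optional[str]) -> str:
    """Return a label for the GPU source, following the documented priority order."""
    # 1. ConfigMap-based mock needs mock mode plus both node_name and namespace.
    if mock_gpus and node_name and namespace:
        return "configmap-mock"
    # 2. Naive mock (GPU-0, GPU-1, ...) whenever mock mode is enabled.
    if mock_gpus:
        return "naive-mock"
    # 3. Otherwise real GPUs are discovered via pynvml.
    return "pynvml"


def naive_mock_ids(count: int = 8) -> List[str]:
    """Enumerate mock GPU IDs in the GPU-<index> form named in the docstring."""
    return [f"GPU-{i}" for i in range(count)]


if __name__ == "__main__":
    print(choose_gpu_source(True, "node-a", "default"))
    print(choose_gpu_source(True, "node-a", None))
    print(choose_gpu_source(False, None, None))
    print(naive_mock_ids(2))
```

The second call illustrates the behavior this commit documents: with mock mode on but no namespace, the ConfigMap path is skipped and naive enumeration is used.
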