Commit d698c39
[rqd/proto] Add robust GPU support with cross-platform discovery and per-device tracking
Implement Phase 1 of comprehensive GPU support enhancement:
- Issue: [GPU/Proto/RQD/Cuebot/RESTGateway/CueAdmin/CueGUI] OpenCue GPU Support - Comprehensive Audit and Implementation Plan - #2035
Protobuf schema extensions:
- Add `GpuDevice` message to `host.proto` with fields for vendor, model, memory, PCI bus, driver version, and CUDA/Metal version
- Add `GpuUsage` message for per-device utilization tracking (utilization percentage, memory used)
- Extend `Host` and `NestedHost` messages with `gpu_devices` repeated field
- Extend `RenderHost` with `gpu_devices` for detailed GPU inventory reporting
- Extend `RunningFrameInfo` with `gpu_usage` for per-frame GPU metrics
- Add GPU constraint fields to `Layer` for scheduler filtering: `gpu_vendor`, `gpu_models_allowed`, `min_gpu_memory_bytes`
- Add `gpu_usage` to `Frame` and `UpdatedFrame` messages for accounting
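
To illustrate how the new `Layer` constraint fields could drive scheduler filtering, here is a minimal Python sketch. It is not the Cuebot implementation. The `GpuDeviceInfo` and `LayerGpuConstraints` classes, the `memory_bytes` attribute, and both helper functions are hypothetical stand-ins for the proto messages, and the sketch assumes that an empty string, empty list, or zero means "no constraint".

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class GpuDeviceInfo:
    """Hypothetical stand-in for the GpuDevice proto message."""
    vendor: str
    model: str
    memory_bytes: int


@dataclass
class LayerGpuConstraints:
    """Mirrors the three constraint fields named in the commit message."""
    gpu_vendor: str = ""
    gpu_models_allowed: List[str] = field(default_factory=list)
    min_gpu_memory_bytes: int = 0


def device_matches(device: GpuDeviceInfo, constraints: LayerGpuConstraints) -> bool:
    """Return True if the device satisfies every constraint that is set.

    Assumption: unset values (empty or zero) do not restrict anything.
    """
    if constraints.gpu_vendor and device.vendor.lower() != constraints.gpu_vendor.lower():
        return False
    if constraints.gpu_models_allowed and device.model not in constraints.gpu_models_allowed:
        return False
    return device.memory_bytes >= constraints.min_gpu_memory_bytes


def eligible_devices(devices: List[GpuDeviceInfo],
                     constraints: LayerGpuConstraints) -> List[GpuDeviceInfo]:
    """Filter a host's device list down to the devices a layer may use."""
    return [d for d in devices if device_matches(d, constraints)]
```

With an unconstrained `LayerGpuConstraints()`, every device passes, which is the behavior a layer that sets none of the new fields would presumably expect.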
RQD GPU discovery implementation:
- Implement `GpuDiscovery` abstract base class for pluggable GPU backends
- Implement `NvidiaGpuDiscovery` with `NVML` (`pynvml`) support and `nvidia-smi` fallback for detailed NVIDIA GPU metadata collection
- Implement `AppleMetalGpuDiscovery` for macOS Apple Silicon GPU detection via `system_profiler` JSON parsing
- Update the `Machine` class with platform-specific GPU discovery initialization (Linux: NVIDIA, Darwin: Apple Metal, Windows: NVIDIA)
- Populate `gpu_devices` in `RenderHost` on all platforms (Linux, macOS, Windows)
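
The pluggable-backend design above could look roughly like the following Python sketch. It is not the code in this commit. The `discover()` method name, the `NvidiaSmiDiscovery` class, the dictionary keys, and `select_backend()` are illustrative assumptions. It covers only the `nvidia-smi` fallback path and leaves out the NVML and Apple Metal backends.

```python
import abc
import platform
import subprocess
from typing import Dict, List, Optional

# Standard nvidia-smi query flags; the field order here defines the CSV columns.
NVIDIA_SMI_QUERY = [
    "nvidia-smi",
    "--query-gpu=index,name,memory.total,driver_version,pci.bus_id",
    "--format=csv,noheader,nounits",
]


class GpuDiscovery(abc.ABC):
    """Base class for GPU backends (method name is an assumption)."""

    @abc.abstractmethod
    def discover(self) -> List[Dict[str, str]]:
        """Return one dict of metadata per detected GPU."""


def parse_nvidia_smi_csv(output: str) -> List[Dict[str, str]]:
    """Parse the CSV produced by NVIDIA_SMI_QUERY into per-device dicts."""
    devices = []
    for line in output.strip().splitlines():
        parts = [p.strip() for p in line.split(",")]
        if len(parts) != 5:
            continue  # skip malformed rows rather than failing discovery
        index, model, memory_mib, driver, bus_id = parts
        devices.append({
            "index": index,
            "vendor": "NVIDIA",
            "model": model,
            "memory_total_mib": memory_mib,  # nounits reports MiB
            "driver_version": driver,
            "pci_bus_id": bus_id,
        })
    return devices


class NvidiaSmiDiscovery(GpuDiscovery):
    """Discovers NVIDIA GPUs by shelling out to nvidia-smi."""

    def discover(self) -> List[Dict[str, str]]:
        try:
            result = subprocess.run(NVIDIA_SMI_QUERY, capture_output=True,
                                    text=True, timeout=10, check=False)
        except (OSError, subprocess.SubprocessError):
            return []  # tool missing or hung: report no GPUs
        if result.returncode != 0:
            return []
        return parse_nvidia_smi_csv(result.stdout)


def select_backend(system: Optional[str] = None) -> Optional[GpuDiscovery]:
    """Pick a backend per platform; the Apple Metal backend is omitted here."""
    system = system or platform.system()
    if system in ("Linux", "Windows"):
        return NvidiaSmiDiscovery()
    return None
```

Keeping the parser separate from the subprocess call makes it possible to test against captured `nvidia-smi` output on machines without a GPU.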
GPU isolation and monitoring:
- Set `CUDA_VISIBLE_DEVICES` and `NVIDIA_VISIBLE_DEVICES` environment variables in `rqcore.py` for proper GPU isolation in launched frames
- Collect per-device GPU utilization in `__updateGpuAndLlu()` using new `getGpuUtilization()` method
- Add `gpuUsage` list to `RunningFrame` class for tracking per-frame GPU metrics
- Extend `runningFrameInfo()` to include `gpu_usage` in `RunningFrameInfo` proto
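
As a rough illustration of the isolation step, the sketch below builds a frame's environment with both variables set to the same device list. The function name `build_gpu_env` and its parameters are hypothetical. The actual logic lives in `rqcore.py` and may differ.

```python
import os
from typing import Dict, Iterable, Mapping, Optional


def build_gpu_env(device_ids: Iterable[int],
                  base_env: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Return a copy of the environment restricted to the given GPU indices.

    Hypothetical helper: both variables receive the same comma-separated list.
    """
    env = dict(os.environ if base_env is None else base_env)
    visible = ",".join(str(i) for i in device_ids)
    env["CUDA_VISIBLE_DEVICES"] = visible
    env["NVIDIA_VISIBLE_DEVICES"] = visible
    return env
```

The returned dictionary can be passed as the `env` argument of `subprocess.Popen` when launching the frame, which leaves the parent process environment untouched.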
Dependencies:
- Add `pynvml>=11.5.0` to `rqd/pyproject.toml` for `NVML` GPU querying
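
For context, the kind of per-device query that `pynvml` enables might look like the sketch below. It is not the body of `getGpuUtilization()`. The function name `sample_gpu_usage` and the dictionary keys are assumptions, and the sketch simply returns an empty list when `pynvml` or the NVIDIA driver is unavailable.

```python
from typing import Dict, List


def sample_gpu_usage() -> List[Dict[str, int]]:
    """Return utilization and memory use per NVIDIA GPU, or [] if unavailable."""
    try:
        import pynvml  # imported lazily so hosts without it still work
    except ImportError:
        return []
    try:
        pynvml.nvmlInit()
    except pynvml.NVMLError:
        return []  # no driver or NVML library on this host
    try:
        samples = []
        for index in range(pynvml.nvmlDeviceGetCount()):
            handle = pynvml.nvmlDeviceGetHandleByIndex(index)
            util = pynvml.nvmlDeviceGetUtilizationRates(handle)
            mem = pynvml.nvmlDeviceGetMemoryInfo(handle)
            samples.append({
                "index": index,
                "util_percent": util.gpu,       # GPU utilization, percent
                "memory_used_bytes": mem.used,  # device memory in use, bytes
            })
        return samples
    except pynvml.NVMLError:
        return []  # a query failed mid-scan; report nothing rather than partial data
    finally:
        pynvml.nvmlShutdown()
```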
All changes maintain backward compatibility via optional/repeated proto fields.
Legacy `num_gpus` and `gpu_memory` fields preserved for existing clients.
Parent commit: 95d93c2
7 files changed (+263, -3 lines) under `proto/src` and `rqd`.