Commit 2b630e8
feat(runtime): add TensorRT-RTX native CUDA graph strategy to C++ runtime
Wire cuda_graph_strategy into the C++ runtime and make the execute_engine
CUDA graph path TensorRT-RTX-aware. Fills in the apply_cuda_graph_strategy
stub and adds coexistence handling for outer whole-graph capture.
What
- apply_cuda_graph_strategy() now calls IRuntimeConfig::setCudaGraphStrategy
with either kDISABLED (default) or kWHOLE_GRAPH_CAPTURE. On RTX this
hands capture/replay off to the TRT-RTX runtime, avoiding the lazy-kernel
and dynamic-shape hazards of wrapping enqueueV3 in at::cuda::CUDAGraph.
- is_monolithic_capturable(stream) returns whether an engine can safely
be captured by an outer torch.cuda.CUDAGraph: RTX builds check
IExecutionContext::isStreamCapturable and require a non-lazy kernel
strategy; non-RTX builds always return true.
- disable_rtx_native_cudagraphs() is a one-shot switch that turns off
the engine's internal capture and recreates the execution context so
that outer stream captures contain the kernel launches directly.
- execute_engine.cpp now computes effective_cudagraphs. On RTX, if a
cuda_graph_strategy is set or SUBGRAPH cudagraphs is enabled, it
bypasses the manual at::cuda::CUDAGraph path (the TRT-RTX runtime
handles that inside enqueueV3). It also polls cudaStreamIsCapturing
on the engine stream and, if an outer capture is already running,
invokes disable_rtx_native_cudagraphs() so the outer capture proceeds
without collision.
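
The decision flow described in the list above can be sketched in plain
Python. This is a toy model under stated assumptions, not the C++
implementation: EngineModel, parse_strategy and the function
effective_cudagraphs are hypothetical names, the manual path is assumed
to be considered only when SUBGRAPH cudagraphs is requested, and the
order of the outer-capture check relative to the bypass check is a guess.

```python
from dataclasses import dataclass
from enum import Enum


class CudaGraphStrategy(Enum):
    # The two values named in this commit; string spellings follow the tests.
    DISABLED = "disabled"
    WHOLE_GRAPH_CAPTURE = "whole_graph_capture"


def parse_strategy(name: str) -> CudaGraphStrategy:
    """Hypothetical helper: map a name to the enum, rejecting unknown names."""
    try:
        return CudaGraphStrategy(name)
    except ValueError:
        raise ValueError(f"unknown cuda_graph_strategy: {name!r}") from None


@dataclass
class EngineModel:
    """Toy stand-in for engine state; not the real engine class."""
    is_rtx: bool
    strategy: CudaGraphStrategy = CudaGraphStrategy.DISABLED
    native_capture_disabled: bool = False
    context_recreations: int = 0

    def disable_rtx_native_cudagraphs(self) -> None:
        # One-shot: only the first call flips the switch and
        # "recreates" the execution context (modelled as a counter).
        if self.native_capture_disabled:
            return
        self.native_capture_disabled = True
        self.context_recreations += 1


def effective_cudagraphs(
    engine: EngineModel,
    subgraph_cudagraphs: bool,
    outer_capture_active: bool,
) -> bool:
    """Return True when the manual capture wrapper would be used."""
    # Outer capture already running on the engine stream: step aside.
    # (In the real runtime this is detected via cudaStreamIsCapturing.)
    if engine.is_rtx and outer_capture_active:
        engine.disable_rtx_native_cudagraphs()

    # On RTX the runtime owns capture whenever a strategy is set or
    # SUBGRAPH cudagraphs is enabled, so the manual path is bypassed.
    rtx_owns_capture = engine.is_rtx and (
        engine.strategy is not CudaGraphStrategy.DISABLED or subgraph_cudagraphs
    )
    return subgraph_cudagraphs and not rtx_owns_capture
```

In this model a non-RTX engine with SUBGRAPH cudagraphs enabled keeps the
manual wrapper, an RTX engine never does, and repeated calls under an
active outer capture recreate the context only once.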
Why
- On TRT-RTX, the manual at::cuda::CUDAGraph wrapper around enqueueV3
can freeze fallback kernels into the captured graph (under kLAZY
specialisation they would otherwise be swapped out later), and it fails
outright when the engine needs runtime allocation, data-dependent
shapes (DDS), control flow, or weight streaming.
- Letting the TRT-RTX runtime own capture fixes both problems, and the
outer-capture detection keeps the feature compatible with the
existing CudaGraphsTorchTensorRTModule whole-graph wrapper without
requiring it to know anything about RTX internals.
Tests
- tests/py/dynamo/runtime/test_000_cuda_graph_strategy.py validates the
default setting, both strategies {disabled, whole_graph_capture}
through the C++ runtime, the RTX-native override when
set_cudagraphs_mode(True) is combined with a strategy, repeated
inference correctness, and ValueError rejection of unknown strategy
names.
1 parent 481455f commit 2b630e8
4 files changed
Lines changed: 188 additions & 7 deletions
File tree
- core/runtime
- tests/py/dynamo/runtime