Pull requests: NVIDIA/nvidia-resiliency-ext

Pull requests list

feat: add non-retryable exception pattern matching
#212 opened Oct 28, 2025 by hexinw-nvidia
[FR_attribution] Fix issues with stale entries (label: ci-approved, "Approved to run CI")
#210 opened Oct 27, 2025 by sbak5
Added GPU memory logger. (label: ci-approved)
#206 opened Oct 21, 2025 by hexinw-nvidia
CAS profiling
#188 opened Sep 21, 2025 by hexinw-nvidia (Draft)
Auto restart (label: ci-approved)
#139 opened Aug 6, 2025 by hexinw-nvidia (Draft)
Add example for multimodal models (label: ci-approved)
#131 opened Jul 25, 2025 by Ava-A4098
Added in-process wrapper restart latency
#118 opened Jul 13, 2025 by namitdhameja
Test UT. (label: ci-approved)
#79 opened May 17, 2025 by hexinw-nvidia (Draft)