Commit 6d649c0
fix: catalyst poll — set driver_override + drivers_probe in poll loop
After fire-and-poll clears the driver symlink (~2s), the device's
driver_override is not set to vfio-pci (it is still "nvsov" from
deferred_insmod). vfio-pci won't auto-claim the device without the
correct override and a probe.
Catalyst warm_swap now polls with guarded override+probe writes (5s
timeout each, retry on failure). On GV100: override set in 50ms,
probe sent in 50ms, vfio-pci bound within 5s. Total warm_swap: 7s.
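
The poll loop described above can be sketched roughly as follows. This is a minimal illustration, not the code from this commit: the function names, the `sysfs_root` parameter (kept configurable so the loop can be run against a fake tree), and the reading of "guarded" as "run the write in a worker thread and stop waiting after a timeout" are all assumptions. Only the sysfs files `driver_override` and `drivers_probe` and the `driver` symlink are taken from the Linux PCI sysfs interface.

```python
import os
import threading
import time
from pathlib import Path
from typing import Optional


def guarded_write(path: Path, value: str, timeout_s: float = 5.0) -> bool:
    """Write value to path in a worker thread and stop waiting after timeout_s.

    Returns True only if the write finished without error in time. A write
    that is stuck in the kernel keeps running in its daemon thread.
    """
    result = {"ok": False}

    def worker() -> None:
        try:
            with open(path, "w") as f:
                f.write(value)
            result["ok"] = True
        except OSError:
            pass  # the poll loop retries on its next pass

    t = threading.Thread(target=worker, daemon=True)
    t.start()
    t.join(timeout_s)
    return result["ok"] and not t.is_alive()


def bound_driver(dev: Path) -> Optional[str]:
    """Return the name of the driver bound to the device, or None if unbound."""
    link = dev / "driver"
    if not link.is_symlink():
        return None
    return os.path.basename(os.readlink(link))


def poll_for_driver(sysfs_root, bdf: str, target: str = "vfio-pci",
                    timeout_s: float = 5.0, interval_s: float = 0.05) -> bool:
    """Re-assert driver_override and drivers_probe until target is bound."""
    dev = Path(sysfs_root) / "bus" / "pci" / "devices" / bdf
    probe = Path(sysfs_root) / "bus" / "pci" / "drivers_probe"
    deadline = time.monotonic() + timeout_s
    while True:
        if bound_driver(dev) == target:
            return True
        if time.monotonic() >= deadline:
            return False
        # Only ask for a probe once the override write has gone through.
        if guarded_write(dev / "driver_override", target):
            guarded_write(probe, bdf)
        time.sleep(interval_s)
```

On a real system this would be called with `sysfs_root="/sys"` and the device's PCI address, for example `poll_for_driver("/sys", "0000:01:00.0")` (the address here is a placeholder), and requires root.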
Full pipeline timings on fresh boot:
  insmod nvsov:      400ms
  catalyst capture:  instant (20 registers)
  fire-and-poll:     2s (driver symlink cleared)
  poll for vfio:     5s (override+probe+rebind)
  BAR0 snapshot:     ~1s (173K registers, 18 domains)
  rmmod nvsov:       100ms
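
As a quick check on how the stage timings add up, the sketch below sums them. It assumes "instant" counts as 0 ms and "~1s" as 1000 ms, and it assumes that warm_swap covers the fire-and-poll and poll-for-vfio stages, which the message does not state explicitly; the names `STAGES_MS`, `WARM_SWAP_STAGES`, and `total_ms` are illustrative only.

```python
# Stage durations in milliseconds, copied from the timing list.
# Assumptions: "instant" is counted as 0 ms and "~1s" as 1000 ms.
STAGES_MS = {
    "insmod nvsov": 400,
    "catalyst capture": 0,
    "fire-and-poll": 2000,
    "poll for vfio": 5000,
    "BAR0 snapshot": 1000,
    "rmmod nvsov": 100,
}

# Assumed grouping: the stages that make up warm_swap.
WARM_SWAP_STAGES = ("fire-and-poll", "poll for vfio")


def total_ms(names=None) -> int:
    """Sum the listed stages, or every stage when names is None."""
    keys = STAGES_MS.keys() if names is None else names
    return sum(STAGES_MS[k] for k in keys)


print(total_ms(WARM_SWAP_STAGES))  # 7000 ms, matching "Total warm_swap: 7s"
print(total_ms())                  # 8500 ms for the full pipeline
```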
Known remaining gap: post-swap steps (sibling rebind, tier classify,
BAR0 capture) can block for 5-7 min while the nvidia RM teardown child
is still running in the background in the kernel. The resulting PCI lock
contention delays sysfs operations. The BAR0 capture itself is fast once
it starts.
Co-authored-by: Cursor <cursoragent@cursor.com>

1 parent e129c3e commit 6d649c0
1 file changed
Lines changed: 35 additions & 8 deletions
@@ -1064,30 +1064,57 @@ (diff body not preserved)