Commit 1fd4356
reclaimspacejob: requeue with delay when node client is not found

When a ReclaimSpaceJob targets a PVC whose application pod has been
deleted, the VolumeAttachment may still show the volume as attached
while the node's CSI addons sidecar connection is no longer available
in the connection pool. This causes nodeReclaimSpace to fail with
"node client not found".

Previously, this error was returned directly to controller-runtime,
which requeues with a fast rate limiter (~5ms exponential backoff).
All 6 retries would exhaust in ~315ms, far too quickly for the
VolumeAttachment to be updated, causing the job to fail immediately.

Requeue with a 30-second interval for this specific error, giving
sufficient time for the VolumeAttachment to be cleaned up. On
subsequent reconciles, getTargetDetails will no longer find the
stale VolumeAttachment and the job can proceed with controller-only
reclaim space.
Signed-off-by: Madhu Rajanna <madhupr007@gmail.com>

1 parent fae2bcc commit 1fd4356
1 file changed
Lines changed: 17 additions & 1 deletion
Diff (contents not captured): additions at new lines 62-66, 69-72, and 171-177; one line replaced at old line 448 (new line 464).