fix: mark NodeScanJob as failed when referenced Node is missing#1235
fix: mark NodeScanJob as failed when referenced Node is missing#1235fabriziosestito wants to merge 5 commits into
Conversation
…en the referenced Node is missing Signed-off-by: Fabrizio Sestito <fabrizio.sestito@suse.com>
…deleted mid-flight Signed-off-by: Fabrizio Sestito <fabrizio.sestito@suse.com>
…OMHandler Signed-off-by: Fabrizio Sestito <fabrizio.sestito@suse.com>
…ureHandler Signed-off-by: Fabrizio Sestito <fabrizio.sestito@suse.com>
16e61f7 to
32ec2f1
Compare
There was a problem hiding this comment.
Pull request overview
This PR updates node-scanning behavior so NodeScanJobs are failed (with a dedicated NodeNotFound reason) when their referenced Node is missing, instead of deleting the job, and improves worker-side handling when jobs are deleted mid-processing to avoid unnecessary retries/follow-up work.
Changes:
- Mark
NodeScanJobas Failed withReasonNodeScanJobNodeNotFoundwhen the referenced Node does not exist (and consolidate precondition checks). - Make node SBOM generation/scanning handlers stop cleanly when the
NodeScanJobdisappears mid-flight; skip publishing follow-up scan messages if the job no longer exists. - Rename/standardize SBOM scan handler naming and add/extend tests around the new behaviors.
Reviewed changes
Copilot reviewed 9 out of 9 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
| internal/handlers/scan_sbom.go | Renames the image SBOM scan handler type to ScanSBOMHandler. |
| internal/handlers/nodescanjob_failure_test.go | Adds tests for marking NodeScanJob failed via the failure handler. |
| internal/handlers/node_scan_sbom.go | Treats NodeScanJob deletion during status updates as a non-fatal stop condition. |
| internal/handlers/node_scan_sbom_test.go | Adds tests for node SBOM scanning flows (including stop-processing scenarios). |
| internal/handlers/generate_node_sbom.go | Stops cleanly on deleted jobs and checks job existence before publishing follow-up scan messages. |
| internal/controller/nodescanjob_controller.go | Moves node existence checks into validation and marks jobs failed with NodeNotFound instead of deleting. |
| internal/controller/nodescanjob_controller_test.go | Updates controller tests to assert failure-with-reason instead of deletion for missing nodes. |
| cmd/worker/main.go | Updates worker wiring to use the renamed node scan SBOM handler constructor. |
| api/v1alpha1/nodescanjob_types.go | Adds ReasonNodeScanJobNodeNotFound constant. |
Comments suppressed due to low confidence (1)
internal/controller/nodescanjob_controller.go:230
SetupWithManagerno longer watchescorev1.Nodeevents. That means if a NodeScanJob is pending and its referenced Node is deleted later, the controller may never be triggered again, so the job won’t get marked Failed withNodeNotFound(the new validation only runs during reconciliation). Consider re-adding a Node watch (enqueueing NodeScanJobs byspec.nodeName) or introducing a periodic requeue for pending jobs so node deletions are observed.
err := ctrl.NewControllerManagedBy(mgr).
For(&v1alpha1.NodeScanJob{}).
WithOptions(controller.Options{
MaxConcurrentReconciles: maxConcurrentReconciles,
}).
Complete(r)
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Signed-off-by: Fabrizio Sestito <fabrizio.sestito@suse.com>
32ec2f1 to
223ae55
Compare
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## main #1235 +/- ##
==========================================
- Coverage 53.44% 51.69% -1.75%
==========================================
Files 61 77 +16
Lines 5340 6474 +1134
==========================================
+ Hits 2854 3347 +493
- Misses 2088 2662 +574
- Partials 398 465 +67
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Harness. 🚀 New features to boost your workflow:
|
Description
NodeScanJobno longer exists, the job is now marked as failed with aNodeNotFoundreason instead of being silently deleted. This matches how the controller already handles other missing dependencies (e.g. a missing registry for a regular scan job).