fix: mark NodeScanJob as failed when referenced Node is missing #1235

Open
fabriziosestito wants to merge 5 commits into kubewarden:main from fabriziosestito:fix/node-notfound-logic

Conversation

fabriziosestito commented Jun 16, 2026

Collaborator

Description

  • When the node referenced by a NodeScanJob no longer exists, the job is now marked as failed with a NodeNotFound reason instead of being silently deleted. This matches how the controller already handles other missing dependencies (e.g. a missing registry for a regular scan job).
  • Moved the node existence check into the same validation function that already verifies the configuration and node selectors, so all preconditions are checked in one place.
  • The node scan workers now stop cleanly if the job they are working on gets deleted mid-flight. Previously a deletion during processing could surface as a hard error and cause the message to be retried needlessly.
  • Before publishing the follow-up scan message, the SBOM generator now checks that the job still exists, so we don't kick off a vulnerability scan for work that has already been cancelled.
  • Added missing tests.
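
The first two points above can be illustrated with a minimal, stdlib-only sketch. It is not the controller code from this PR: the `cluster` and `job` types, the `validatePreconditions` and `reconcile` functions, and the `ConfigurationMissing` reason are hypothetical stand-ins; only the `NodeNotFound` reason name comes from the description. The idea shown is that a single validation function returns a typed error carrying the reason, and the caller marks the job as failed rather than deleting it.

```go
package main

import (
	"errors"
	"fmt"
)

const (
	// reasonNodeNotFound mirrors the NodeNotFound reason named in this PR.
	reasonNodeNotFound = "NodeNotFound"
	// reasonConfigMissing is a hypothetical placeholder for another precondition.
	reasonConfigMissing = "ConfigurationMissing"
)

// preconditionError carries the failure reason to record on the job status.
type preconditionError struct {
	Reason  string
	Message string
}

func (e *preconditionError) Error() string {
	return e.Reason + ": " + e.Message
}

// cluster is a hypothetical in-memory stand-in for the API server.
type cluster struct {
	hasConfig bool
	nodes     map[string]bool
}

// job is a hypothetical, trimmed-down NodeScanJob.
type job struct {
	NodeName string
	Phase    string
	Reason   string
}

// validatePreconditions checks every precondition in one place and
// returns the first one that does not hold.
func validatePreconditions(c cluster, j *job) error {
	if !c.hasConfig {
		return &preconditionError{reasonConfigMissing, "scan configuration not found"}
	}
	if !c.nodes[j.NodeName] {
		return &preconditionError{reasonNodeNotFound, fmt.Sprintf("node %q not found", j.NodeName)}
	}
	return nil
}

// reconcile marks the job as failed with the reason from the failed
// precondition instead of deleting it.
func reconcile(c cluster, j *job) {
	var pe *preconditionError
	if err := validatePreconditions(c, j); errors.As(err, &pe) {
		j.Phase, j.Reason = "Failed", pe.Reason
		return
	}
	j.Phase = "Scheduled"
}

func main() {
	c := cluster{hasConfig: true, nodes: map[string]bool{"worker-1": true}}
	j := &job{NodeName: "worker-2"} // this node does not exist in c
	reconcile(c, j)
	fmt.Println(j.Phase, j.Reason)
}
```

Keeping the reason on the error value means the caller does not need to know which precondition failed in order to record it on the job status.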

…en the referenced Node is missing

Signed-off-by: Fabrizio Sestito <fabrizio.sestito@suse.com>
…deleted mid-flight

Signed-off-by: Fabrizio Sestito <fabrizio.sestito@suse.com>
…OMHandler

Signed-off-by: Fabrizio Sestito <fabrizio.sestito@suse.com>
Copilot AI review requested due to automatic review settings June 16, 2026 08:05
fabriziosestito requested a review from a team as a code owner June 16, 2026 08:05
github-project-automation bot moved this to Pending Review in SBOMscanner Jun 16, 2026
…ureHandler

Signed-off-by: Fabrizio Sestito <fabrizio.sestito@suse.com>
fabriziosestito force-pushed the fix/node-notfound-logic branch from 16e61f7 to 32ec2f1 June 16, 2026 08:08

Copilot AI left a comment

Contributor

Pull request overview

This PR updates node-scanning behavior so that a NodeScanJob whose referenced Node is missing is marked as failed (with a dedicated NodeNotFound reason) instead of being deleted. It also improves worker-side handling of jobs that are deleted mid-processing, avoiding unnecessary retries and follow-up work.

Changes:

  • Mark NodeScanJob as Failed with ReasonNodeScanJobNodeNotFound when the referenced Node does not exist (and consolidate precondition checks).
  • Make node SBOM generation/scanning handlers stop cleanly when the NodeScanJob disappears mid-flight; skip publishing follow-up scan messages if the job no longer exists.
  • Rename/standardize SBOM scan handler naming and add/extend tests around the new behaviors.
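
The "stop cleanly" behavior in the second point can be sketched as follows. This is a stdlib-only illustration under stated assumptions, not the handler code from this PR: `jobStore`, `publisher`, `handleGenerateSBOM`, and `errNotFound` are hypothetical names, and `errNotFound` stands in for whatever not-found check the real handlers use (with the Kubernetes client libraries that would typically be `apierrors.IsNotFound`). The point is that a not-found result is treated as "stop, nothing left to do" and returns `nil`, so the message is neither retried nor followed by a scan message.

```go
package main

import (
	"errors"
	"fmt"
)

// errNotFound is a hypothetical sentinel standing in for an API "not found" error.
var errNotFound = errors.New("nodescanjob not found")

// jobStore is a hypothetical in-memory lookup of existing job names.
type jobStore map[string]bool

func (s jobStore) get(name string) error {
	if !s[name] {
		return fmt.Errorf("get %q: %w", name, errNotFound)
	}
	return nil
}

// publisher records the follow-up scan messages that would be sent.
type publisher struct {
	published []string
}

// handleGenerateSBOM re-checks that the job still exists before publishing
// the follow-up scan message. A deleted job is not an error: returning nil
// acknowledges the message, so it is not retried.
func handleGenerateSBOM(store jobStore, pub *publisher, jobName string) error {
	// ... SBOM generation would happen here ...
	if err := store.get(jobName); err != nil {
		if errors.Is(err, errNotFound) {
			return nil // job deleted mid-flight: stop cleanly
		}
		return err // any other failure is still retried
	}
	pub.published = append(pub.published, jobName)
	return nil
}

func main() {
	store := jobStore{"scan-a": true}
	pub := &publisher{}
	for _, name := range []string{"scan-a", "scan-deleted"} {
		if err := handleGenerateSBOM(store, pub, name); err != nil {
			fmt.Println("retry:", err)
		}
	}
	fmt.Println("published:", pub.published)
}
```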

Reviewed changes

Copilot reviewed 9 out of 9 changed files in this pull request and generated 2 comments.

Show a summary per file
| File | Description |
| --- | --- |
| internal/handlers/scan_sbom.go | Renames the image SBOM scan handler type to ScanSBOMHandler. |
| internal/handlers/nodescanjob_failure_test.go | Adds tests for marking NodeScanJob failed via the failure handler. |
| internal/handlers/node_scan_sbom.go | Treats NodeScanJob deletion during status updates as a non-fatal stop condition. |
| internal/handlers/node_scan_sbom_test.go | Adds tests for node SBOM scanning flows (including stop-processing scenarios). |
| internal/handlers/generate_node_sbom.go | Stops cleanly on deleted jobs and checks job existence before publishing follow-up scan messages. |
| internal/controller/nodescanjob_controller.go | Moves node existence checks into validation and marks jobs failed with NodeNotFound instead of deleting. |
| internal/controller/nodescanjob_controller_test.go | Updates controller tests to assert failure-with-reason instead of deletion for missing nodes. |
| cmd/worker/main.go | Updates worker wiring to use the renamed node scan SBOM handler constructor. |
| api/v1alpha1/nodescanjob_types.go | Adds ReasonNodeScanJobNodeNotFound constant. |
Comments suppressed due to low confidence (1)

internal/controller/nodescanjob_controller.go:230

  • SetupWithManager no longer watches corev1.Node events. That means if a NodeScanJob is pending and its referenced Node is deleted later, the controller may never be triggered again, so the job won’t get marked Failed with NodeNotFound (the new validation only runs during reconciliation). Consider re-adding a Node watch (enqueueing NodeScanJobs by spec.nodeName) or introducing a periodic requeue for pending jobs so node deletions are observed.
	err := ctrl.NewControllerManagedBy(mgr).
		For(&v1alpha1.NodeScanJob{}).
		WithOptions(controller.Options{
			MaxConcurrentReconciles: maxConcurrentReconciles,
		}).
		Complete(r)
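
The mapping step behind the suggested Node watch could look roughly like the sketch below. It is a stdlib-only illustration, not code from this PR: `nodeScanJob`, `request`, and `requestsForNode` are hypothetical names. In a controller-runtime setup, a function of this shape would usually be wrapped with `handler.EnqueueRequestsFromMapFunc` and registered through a `Watches` call on the builder; the exact wiring is an assumption and is left out here.

```go
package main

import "fmt"

// nodeScanJob is a hypothetical, trimmed-down NodeScanJob.
type nodeScanJob struct {
	Namespace string
	Name      string
	NodeName  string // corresponds to spec.nodeName
}

// request is a hypothetical stand-in for a reconcile request.
type request struct {
	Namespace string
	Name      string
}

// requestsForNode returns one reconcile request per job that references
// the given node, so a Node event re-triggers validation for those jobs.
func requestsForNode(jobs []nodeScanJob, nodeName string) []request {
	var reqs []request
	for _, j := range jobs {
		if j.NodeName == nodeName {
			reqs = append(reqs, request{Namespace: j.Namespace, Name: j.Name})
		}
	}
	return reqs
}

func main() {
	jobs := []nodeScanJob{
		{Namespace: "default", Name: "scan-a", NodeName: "worker-1"},
		{Namespace: "default", Name: "scan-b", NodeName: "worker-2"},
		{Namespace: "team-x", Name: "scan-c", NodeName: "worker-1"},
	}
	// Simulate a delete event for worker-1.
	for _, r := range requestsForNode(jobs, "worker-1") {
		fmt.Printf("enqueue %s/%s\n", r.Namespace, r.Name)
	}
}
```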

Comment thread internal/handlers/node_scan_sbom_test.go Outdated
Comment thread internal/handlers/node_scan_sbom_test.go Outdated
Signed-off-by: Fabrizio Sestito <fabrizio.sestito@suse.com>
fabriziosestito force-pushed the fix/node-notfound-logic branch from 32ec2f1 to 223ae55 June 16, 2026 08:13
fabriziosestito self-assigned this Jun 16, 2026
fabriziosestito added the bug (Something isn't working) label Jun 16, 2026
fabriziosestito added this to the v0.12.0 milestone Jun 16, 2026
codecov bot commented Jun 16, 2026


Codecov Report

❌ Patch coverage is 44.82759% with 16 lines in your changes missing coverage. Please review.
✅ Project coverage is 51.69%. Comparing base (5adb1bd) to head (223ae55).
⚠️ Report is 36 commits behind head on main.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| internal/handlers/generate_node_sbom.go | 0.00% | 7 Missing and 1 partial ⚠️ |
| internal/handlers/node_scan_sbom.go | 0.00% | 6 Missing ⚠️ |
| cmd/worker/main.go | 0.00% | 1 Missing ⚠️ |
| internal/controller/nodescanjob_controller.go | 90.90% | 1 Missing ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1235      +/-   ##
==========================================
- Coverage   53.44%   51.69%   -1.75%     
==========================================
  Files          61       77      +16     
  Lines        5340     6474    +1134     
==========================================
+ Hits         2854     3347     +493     
- Misses       2088     2662     +574     
- Partials      398      465      +67     
| Flag | Coverage Δ |
| --- | --- |
| unit-tests | 51.69% <44.82%> (-1.75%) ⬇️ |


Projects

Status: Pending Review

2 participants