refactor: single data-integrity check for kueue workloads by andyatmiami · Pull Request #47 · opendatahub-io/odh-cli

andyatmiami · 2026-03-12T20:31:00Z

Description

Replace 6 per-resource-type kueue label checks (Notebook, ISVC,
LLMISVC, RayCluster, RayJob, PyTorchJob) with a single cluster-wide
DataIntegrityCheck that validates three invariants:

Every workload in a kueue-managed namespace has the queue-name label
Every workload with the queue-name label is in a kueue-managed namespace
Within each top-level CR's ownership tree, all resources agree on the
queue-name label value

Key design decisions:

Top-down traversal from monitored CRs (Notebook, InferenceService,
LLMInferenceService, RayCluster, RayJob, PyTorchJob) through an
ownership graph of intermediate types (Deployment, StatefulSet,
ReplicaSet, DaemonSet, Job, CronJob, Pod)
Ownership graph built once per namespace and reused across all CRs
Pre-filter to only process kueue-relevant namespaces (union of
kueue-managed namespaces and namespaces with kueue-labeled workloads)
Single KueueConsistency condition per run with first-violation-wins
per CR to avoid redundant diagnostics

Introduces ImpactProhibited as the highest severity level ("prohibited"),
representing conditions where an upgrade MUST NOT proceed:

Kueue DataIntegrityCheck emits prohibited impact on label violations
Kueue OperatorInstalledCheck escalated to prohibited when Kueue
managementState is Managed — there is no known, reliable upgrade
path from embedded Kueue; migration to the Red Hat build of Kueue
operator is required before upgrading regardless of whether RHBoK
is currently installed. The previous behavior distinguished between
Managed+operator-present (blocking) and Managed+operator-absent (pass);
both cases now emit prohibited unconditionally

Prohibited severity support across the output layer:

SeverityLevelProhibited filter level for --severity flag
Prohibited banner (box-drawn border) rendered above the summary
table when any prohibiteconditions are detected
Double exclamation mark (‼) status icon for prohibited conditions
PROHIBITED verdict in output (replaces FAIL for this severity)
Summary line now includes Prohibited count
Exit error message distinguishes prohibited from blocking

Refactors NotebookVerboseFormatter into EnhancedVerboseFormatter
in pkg/lint/check/:

Handles mixed resource types by deriving per-object CRD FQN from
TypeMeta via per-object AnnotationObjectCRDName, with result-level
annotation preference for single-kind results
Exports CRDFullyQualifiedName and DeriveCRDFQNFromTypeMeta helpers
Supports optional per-object context via AnnotationObjectContext,
rendered as a sub-bullet beneath each impacted object reference
All notebook checks updated to embed check.EnhancedVerboseFormatter
(drop-in replacement, no behavioral change)

Adds resource type definitions for StatefulSet, ReplicaSet, DaemonSet,
Job, and CronJob. Updates ToPartialObjectMetadata helper to preserve
UID and OwnerReferences need for ownership graph construction.

How Has This Been Tested?

Kueue Unmanaged with "inconsistent" Kueue state in environment

Create a 2.25.3 RHOAI instance
Create sample data that includes all the "invariant checks" we want to guard against
- https://gitlab.cee.redhat.com/astonebe/rhoai-helper-cli/-/blob/main/pkg/recipe/recipes/2.25/rhbok/full-e2e-setup.recipe?ref_type=heads
- ℹ️ invariant 3 (owner tree with inconsistent queue-name label) can be tricky to reproduce in a manner that doesn't fail in invariant 1 or invariant 2 - and as such isn't included in the above sample data - but was manually crafted on the cluster
➜ odh-cli/ git:(feat/agressive-kueue-gating) $ ./bin/kubectl-odh lint --target-version 3.3 --verbose

Look for prohibited severity in output

TOP LEVEL BANNER

SUMMARY TABLE

VERBOSE OUTPUT TABLE

RESULT OUTPUT

Kueue Managed

Leverage existing BU cluster ( the "big" / "bad" one I use frequently for regression testing)
- https://console-openshift-console.apps.prod.rhoai.rh-aiservices-bu.com
➜ odh-cli/ git:(feat/agressive-kueue-gating) $ ./bin/kubectl-odh lint --target-version 3.3 --verbose --debug

Look for prohibited severity in output

SUMMARY TABLE

Kueue Removed (and/or not present)

Create a 2.25.3 RHOAI instance *without Kueue
Create simple kueue-less sample data
- https://gitlab.cee.redhat.com/astonebe/rhoai-helper-cli/-/blob/main/pkg/recipe/recipes/2.25/rhoai/full-e2e-setup.recipe?ref_type=heads
➜ odh-cli/ git:(feat/agressive-kueue-gating) $ ./bin/kubectl-odh lint --target-version 3.3 --verbose

SUMMARY TABLE

VERBOSE TABLE

RESULT OUTPUT

Merge criteria:

The commits are squashed in a cohesive manner and have meaningful messages.
Testing instructions have been added in the PR body (for PRs involving changes that are not immediately obvious).
The developer has manually tested the changes and verified that the changes work

Summary by CodeRabbit

New Features
- Added a Prohibited impact level that blocks upgrades and surfaces a prominent "Prohibited Violations" banner
- Enhanced diagnostic output with CRD-qualified object names, per-namespace grouping, and object context
- Consolidated Kueue validations into a unified cluster-wide data-integrity check
Bug Fixes
- Improved impact computation and validation messaging (includes Prohibited alongside Blocking/Advisory)
- Broadened workload resource coverage (StatefulSet/ReplicaSet/DaemonSet/Job/CronJob)

coderabbitai · 2026-03-12T20:31:15Z

📝 Walkthrough

Walkthrough

This PR adds a Prohibited impact level and object-level annotation keys to the result package; implements an EnhancedVerboseFormatter and replaces multiple notebook-specific formatters; consolidates several Kueue per-resource label checks into a single cluster-wide DataIntegrityCheck with ownership-graph traversal and three invariants; removes many per-resource Kueue check files and tests; adds several Kubernetes ResourceType definitions; enriches PartialObjectMetadata with UID and OwnerReferences; and updates severity handling, output formatting, and registration to surface Prohibited findings.

Estimated code review effort

🎯 5 (Critical) | ⏱️ ~120 minutes

Issues

Fragile CRD pluralization heuristic (CWE-20: Improper Input Validation)
File: pkg/lint/check/verbose_formatter.go, function DeriveCRDFQNFromTypeMeta
Risk: irregular plural forms produce incorrect CRD FQNs and silent lookup failures. Action: replace heuristic with API-backed discovery (list CRDs or query server for resource info) or validate derived plural against the server before use.
NULL pointer dereference risk (CWE-476: NULL Pointer Dereference)
File: pkg/lint/checks/workloads/kueue/graph.go, function newGraphNode
Risk: accessing item.OwnerReferences without guarding against nil PartialObjectMetadata or missing ObjectMeta fields. Action: add nil checks and defensive handling for missing metadata; fail gracefully with contextual error.
RBAC/visibility gap in cluster-wide enumeration (CWE-732/CWE-276: Incorrect Permission Assignment / Least Privilege Violation)
File: pkg/lint/checks/workloads/kueue/check.go, method Validate (listWorkloadsInNamespace)
Risk: results may be incomplete under limited permissions, producing false negatives. Action: detect/listing errors that indicate permission denial and surface a blocking condition or annotate results explaining partial visibility; document required permissions.
Unsanitized label comparisons (CWE-20: Improper Input Validation)
File: pkg/lint/checks/workloads/kueue/invariants.go, function checkOwnerTreeConsistency
Risk: raw label values compared without normalization or verification, causing spurious mismatch violations or enabling attacker-controlled labels to influence checks. Action: canonicalize label values, validate allowed characters, and where relevant verify referenced queues exist (or demote to advisory if verification is not possible).
Behavioral regression in Managed-state handling (not strictly security but breaking behavioral change)
File: pkg/lint/checks/components/kueue/kueue_operator_installed.go, function validateManaged
Risk: previous logic allowed Managed+RedHat build; new code marks all Managed as Prohibited. Action: confirm intent with stakeholders; if unintended, restore conditional behavior or document the stricter policy and adjust callers/tests accordingly.
Insufficient error context when listing workloads (CWE-703: Incorrect Handling of Exceptional Conditions)
File: pkg/lint/checks/workloads/kueue/check.go, function listWorkloadsInNamespace
Risk: non-CRD errors are returned without namespace/resource context, hindering debugging. Action: wrap returned errors with namespace and resource type context (fmt.Errorf or errors.Wrap) before returning.

Only actionable findings are listed. Adjustments above focus on crash-safety, accurate diagnostics under RBAC, and robustness of CRD/resource resolution.

🚥 Pre-merge checks | ✅ 2

✅ Passed checks (2 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title clearly summarizes the primary refactoring: replacing six per-resource kueue checks with a single data-integrity check.
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

📝 Coding Plan

Generate coding plan for human review comments

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

andyatmiami · 2026-03-17T21:21:30Z

/hold

I want to include some screenshots from my own testing - but also want to take this PR out of Draft to signify its ready for review.

coderabbitai

Actionable comments posted: 1

🧹 Nitpick comments (5)

pkg/lint/command_options.go (1)

319-332: Default case returns true, which is maximally permissive.

If a future SeverityLevel value is introduced but not handled here, the fallback silently passes everything. Consider returning false to fail closed, or explicitly panic/log on unexpected values.

Suggested change

 func meetsMinSeverity(impact result.Impact, minLevel SeverityLevel) bool {
 	switch minLevel {
 	case SeverityLevelProhibited:
 		return impact == result.ImpactProhibited
 	case SeverityLevelCritical:
 		return impact == result.ImpactProhibited || impact == result.ImpactBlocking
 	case SeverityLevelWarning:
 		return impact == result.ImpactProhibited || impact == result.ImpactBlocking || impact == result.ImpactAdvisory
 	case SeverityLevelInfo:
 		return true
 	}
 
-	return true
+	// Unknown severity level — fail closed to avoid silently skipping findings.
+	return false
 }

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@pkg/lint/command_options.go` around lines 319 - 332, The meetsMinSeverity
function currently falls through to return true for any unhandled SeverityLevel,
which is unsafe; update the switch in meetsMinSeverity to handle all known
SeverityLevel cases and change the default/fallback to return false
(fail-closed) or explicitly log/panic on an unexpected SeverityLevel value so
unknown levels do not silently pass; reference the SeverityLevel enum and
result.Impact constants (e.g., SeverityLevelProhibited, SeverityLevelCritical,
SeverityLevelWarning, SeverityLevelInfo) when updating the switch to ensure
exhaustive handling.

pkg/lint/checks/workloads/kueue/check.go (4)

306-317: buildCRDFQNLookup is rebuilt on every Validate call.

monitoredWorkloadTypes is package-level and static. Consider computing the lookup once at init or lazily with sync.Once to avoid repeated map construction.

Example with sync.Once

var (
	crdfqnLookupOnce sync.Once
	crdfqnLookup     map[string]string
)

func getCRDFQNLookup() map[string]string {
	crdfqnLookupOnce.Do(func() {
		crdfqnLookup = make(map[string]string, len(monitoredWorkloadTypes))
		for _, rt := range monitoredWorkloadTypes {
			key := rt.APIVersion() + "/" + rt.Kind
			crdfqnLookup[key] = rt.CRDFQN()
		}
	})
	return crdfqnLookup
}

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@pkg/lint/checks/workloads/kueue/check.go` around lines 306 - 317,
buildCRDFQNLookup currently reconstructs the map on every Validate call even
though monitoredWorkloadTypes is static; change this to compute the map once
(either at package init or lazily) by introducing package-level variables (e.g.,
crdfqnLookup and crdfqnLookupOnce) and a getter like getCRDFQNLookup that uses
sync.Once to populate crdfqnLookup from monitoredWorkloadTypes, then update
callers (e.g., Validate) to call getCRDFQNLookup instead of buildCRDFQNLookup or
replace buildCRDFQNLookup with this lazy initializer so the map is not rebuilt
repeatedly.

157-176: First-violation-wins per CR drops invariant 3 checks when invariants 1 or 2 fail.

This is documented behavior, but it means a single CR can have a namespace mismatch and an ownership-tree inconsistency, yet only the former surfaces. If full diagnostics are ever desired, consider accumulating all violations before continue.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@pkg/lint/checks/workloads/kueue/check.go` around lines 157 - 176, Current
loop uses early continues after appending a violation from
checkNamespaceToWorkload or checkWorkloadToNamespace, which prevents
checkOwnerTreeConsistency from running for the same CR; change the loop so all
three checks run for each cr and each non-nil result is appended to violations
(i.e., call checkNamespaceToWorkload(cr, kueueNamespaces),
checkWorkloadToNamespace(cr, kueueNamespaces) and checkOwnerTreeConsistency(cr,
graph) without using continue after the first append) so multiple violations per
CR are collected.

229-264: uniqueResources keeps the first violation's message per resource.

If a resource later violates invariant 3 after invariant 1, only the first message is retained. Given the earlier first-violation-wins logic in checkNamespace, this is consistent—but document the coupling to avoid future drift.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@pkg/lint/checks/workloads/kueue/check.go` around lines 229 - 264, The
uniqueResources function currently deduplicates violations by resource identity
and keeps the first violation's Message for each resource; explicitly document
this coupling to checkNamespace so future changes don't change semantics. Update
the comment/docstring for uniqueResources (and optionally add a short note on
checkNamespace) to state that when multiple violations exist for the same
resource the first occurrence wins and its Message is retained, referencing the
types and symbols involved (uniqueResources, checkNamespace, violation,
impactedResource) so maintainers understand the intended behavior and won't
unknowingly reorder or filter violations.

186-217: Slice all grows via repeated append without pre-allocation.

Minor allocation churn. If monitoredWorkloadTypes and per-type counts are bounded, this is acceptable. Otherwise, consider collecting counts first or using a pooled buffer.

Optional pre-allocation sketch

 func listWorkloadsInNamespace(
 	ctx context.Context,
 	r client.Reader,
 	namespace string,
 ) ([]*metav1.PartialObjectMetadata, error) {
-	var all []*metav1.PartialObjectMetadata
+	// Estimate initial capacity; exact counts unknown until iteration.
+	all := make([]*metav1.PartialObjectMetadata, 0, 64)
 
 	for _, rt := range monitoredWorkloadTypes {

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@pkg/lint/checks/workloads/kueue/check.go` around lines 186 - 217, The slice
variable all in the loop grows via repeated append which causes extra
allocations; before appending each batch of items returned by r.ListMetadata
(the items variable inside the monitoredWorkloadTypes loop), ensure capacity is
pre-reserved by checking if cap(all) < len(all)+len(items) and, if so, allocate
a new slice with the larger capacity and copy the existing elements into it so
subsequent appends don't reallocate; adjust this logic around the loop that
constructs metav1.PartialObjectMetadata objects (the obj creation block) so you
pre-expand all's capacity once per batch rather than letting append repeatedly
grow it.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@pkg/lint/checks/components/kueue/kueue_operator_installed.go`:
- Around line 84-99: Validate() currently returns early when olm.FindOperator()
errors, which prevents emitting the prohibited condition from validateManaged
for managementState=Managed; update Validate() (in OperatorInstalledCheck) to
ensure validateManaged(req, ...) is invoked even if olm.FindOperator() fails (or
call validateManaged before returning on that error), so the prohibited
condition for managementState Managed is always set; reference validateManaged,
Validate, OperatorInstalledCheck, and olm.FindOperator to locate and adjust the
control flow accordingly.

---

Nitpick comments:
In `@pkg/lint/checks/workloads/kueue/check.go`:
- Around line 306-317: buildCRDFQNLookup currently reconstructs the map on every
Validate call even though monitoredWorkloadTypes is static; change this to
compute the map once (either at package init or lazily) by introducing
package-level variables (e.g., crdfqnLookup and crdfqnLookupOnce) and a getter
like getCRDFQNLookup that uses sync.Once to populate crdfqnLookup from
monitoredWorkloadTypes, then update callers (e.g., Validate) to call
getCRDFQNLookup instead of buildCRDFQNLookup or replace buildCRDFQNLookup with
this lazy initializer so the map is not rebuilt repeatedly.
- Around line 157-176: Current loop uses early continues after appending a
violation from checkNamespaceToWorkload or checkWorkloadToNamespace, which
prevents checkOwnerTreeConsistency from running for the same CR; change the loop
so all three checks run for each cr and each non-nil result is appended to
violations (i.e., call checkNamespaceToWorkload(cr, kueueNamespaces),
checkWorkloadToNamespace(cr, kueueNamespaces) and checkOwnerTreeConsistency(cr,
graph) without using continue after the first append) so multiple violations per
CR are collected.
- Around line 229-264: The uniqueResources function currently deduplicates
violations by resource identity and keeps the first violation's Message for each
resource; explicitly document this coupling to checkNamespace so future changes
don't change semantics. Update the comment/docstring for uniqueResources (and
optionally add a short note on checkNamespace) to state that when multiple
violations exist for the same resource the first occurrence wins and its Message
is retained, referencing the types and symbols involved (uniqueResources,
checkNamespace, violation, impactedResource) so maintainers understand the
intended behavior and won't unknowingly reorder or filter violations.
- Around line 186-217: The slice variable all in the loop grows via repeated
append which causes extra allocations; before appending each batch of items
returned by r.ListMetadata (the items variable inside the monitoredWorkloadTypes
loop), ensure capacity is pre-reserved by checking if cap(all) <
len(all)+len(items) and, if so, allocate a new slice with the larger capacity
and copy the existing elements into it so subsequent appends don't reallocate;
adjust this logic around the loop that constructs metav1.PartialObjectMetadata
objects (the obj creation block) so you pre-expand all's capacity once per batch
rather than letting append repeatedly grow it.

In `@pkg/lint/command_options.go`:
- Around line 319-332: The meetsMinSeverity function currently falls through to
return true for any unhandled SeverityLevel, which is unsafe; update the switch
in meetsMinSeverity to handle all known SeverityLevel cases and change the
default/fallback to return false (fail-closed) or explicitly log/panic on an
unexpected SeverityLevel value so unknown levels do not silently pass; reference
the SeverityLevel enum and result.Impact constants (e.g.,
SeverityLevelProhibited, SeverityLevelCritical, SeverityLevelWarning,
SeverityLevelInfo) when updating the switch to ensure exhaustive handling.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Central YAML (base), Organization UI (inherited)

Review profile: CHILL

Plan: Pro

Run ID: 778554f0-3c23-4f6f-a463-486dc880370f

📥 Commits

Reviewing files that changed from the base of the PR and between 7d65e7f and 5934ee6.

📒 Files selected for processing (39)

pkg/lint/check/result/result.go
pkg/lint/check/result/result_test.go
pkg/lint/check/verbose_formatter.go
pkg/lint/check/verbose_formatter_test.go
pkg/lint/checks/components/kueue/kueue_operator_installed.go
pkg/lint/checks/components/kueue/kueue_operator_installed_test.go
pkg/lint/checks/workloads/kueue/check.go
pkg/lint/checks/workloads/kueue/check_test.go
pkg/lint/checks/workloads/kueue/graph.go
pkg/lint/checks/workloads/kueue/invariants.go
pkg/lint/checks/workloads/kueue/kueue_labels_isvc.go
pkg/lint/checks/workloads/kueue/kueue_labels_isvc_test.go
pkg/lint/checks/workloads/kueue/kueue_labels_llmisvc.go
pkg/lint/checks/workloads/kueue/kueue_labels_llmisvc_test.go
pkg/lint/checks/workloads/kueue/kueue_labels_notebook.go
pkg/lint/checks/workloads/kueue/kueue_labels_notebook_test.go
pkg/lint/checks/workloads/kueue/kueue_labels_pytorchjob.go
pkg/lint/checks/workloads/kueue/kueue_labels_pytorchjob_test.go
pkg/lint/checks/workloads/kueue/kueue_labels_raycluster.go
pkg/lint/checks/workloads/kueue/kueue_labels_raycluster_test.go
pkg/lint/checks/workloads/kueue/kueue_labels_rayjob.go
pkg/lint/checks/workloads/kueue/kueue_labels_rayjob_test.go
pkg/lint/checks/workloads/kueue/support.go
pkg/lint/checks/workloads/notebook/accelerator_migration.go
pkg/lint/checks/workloads/notebook/connection_integrity.go
pkg/lint/checks/workloads/notebook/container_name.go
pkg/lint/checks/workloads/notebook/hardware_profile_integrity.go
pkg/lint/checks/workloads/notebook/hardwareprofile_migration.go
pkg/lint/checks/workloads/notebook/impacted.go
pkg/lint/checks/workloads/notebook/running_workloads.go
pkg/lint/checks/workloads/notebook/verbose_formatter.go
pkg/lint/command.go
pkg/lint/command_options.go
pkg/lint/command_options_test.go
pkg/lint/constants.go
pkg/lint/output_table.go
pkg/lint/output_table_test.go
pkg/resources/types.go
pkg/util/kube/resources.go

💤 Files with no reviewable changes (13)

pkg/lint/checks/workloads/notebook/verbose_formatter.go
pkg/lint/checks/workloads/kueue/kueue_labels_raycluster.go
pkg/lint/checks/workloads/kueue/kueue_labels_notebook.go
pkg/lint/checks/workloads/kueue/kueue_labels_pytorchjob.go
pkg/lint/checks/workloads/kueue/kueue_labels_rayjob_test.go
pkg/lint/checks/workloads/kueue/kueue_labels_isvc_test.go
pkg/lint/checks/workloads/kueue/kueue_labels_pytorchjob_test.go
pkg/lint/checks/workloads/kueue/kueue_labels_llmisvc_test.go
pkg/lint/checks/workloads/kueue/kueue_labels_notebook_test.go
pkg/lint/checks/workloads/kueue/kueue_labels_rayjob.go
pkg/lint/checks/workloads/kueue/kueue_labels_isvc.go
pkg/lint/checks/workloads/kueue/kueue_labels_raycluster_test.go
pkg/lint/checks/workloads/kueue/kueue_labels_llmisvc.go

Replace 6 per-resource-type kueue label checks (Notebook, ISVC, LLMISVC, RayCluster, RayJob, PyTorchJob) with a single cluster-wide DataIntegrityCheck that validates three invariants: 1. Every workload in a kueue-managed namespace has the queue-name label 2. Every workload with the queue-name label is in a kueue-managed namespace 3. Within each top-level CR's ownership tree, all resources agree on the queue-name label value Key design decisions: - Top-down traversal from monitored CRs (Notebook, InferenceService, LLMInferenceService, RayCluster, RayJob, PyTorchJob) through an ownership graph of intermediate types (Deployment, StatefulSet, ReplicaSet, DaemonSet, Job, CronJob, Pod) - Ownership graph built once per namespace and reused across all CRs - Pre-filter to only process kueue-relevant namespaces (union of kueue-managed namespaces and namespaces with kueue-labeled workloads) - Single KueueConsistency condition per run with first-violation-wins per CR to avoid redundant diagnostics Introduces ImpactProhibited as the highest severity level ("prohibited"), representing conditions where an upgrade MUST NOT proceed: - Kueue DataIntegrityCheck emits prohibited impact on label violations - Kueue OperatorInstalledCheck escalated to prohibited when Kueue managementState is Managed — there is no known, reliable upgrade path from embedded Kueue; migration to the Red Hat build of Kueue operator is required before upgrading regardless of whether RHBoK is currently installed. The previous behavior distinguished between Managed+operator-present (blocking) and Managed+operator-absent (pass); both cases now emit prohibited unconditionally Prohibited severity support across the output layer: - SeverityLevelProhibited filter level for --severity flag - Prohibited banner (box-drawn border) rendered above the summary table when any prohibiteconditions are detected - Double exclamation mark (‼) status icon for prohibited conditions - PROHIBITED verdict in output (replaces FAIL for this severity) - Summary line now includes Prohibited count - Exit error message distinguishes prohibited from blocking Refactors NotebookVerboseFormatter into EnhancedVerboseFormatter in pkg/lint/check/: - Handles mixed resource types by deriving per-object CRD FQN from TypeMeta via per-object AnnotationObjectCRDName, with result-level annotation preference for single-kind results - Exports CRDFullyQualifiedName and DeriveCRDFQNFromTypeMeta helpers - Supports optional per-object context via AnnotationObjectContext, rendered as a sub-bullet beneath each impacted object reference - All notebook checks updated to embed check.EnhancedVerboseFormatter (drop-in replacement, no behavioral change) Adds resource type definitions for StatefulSet, ReplicaSet, DaemonSet, Job, and CronJob. Updates ToPartialObjectMetadata helper to preserve UID and OwnerReferences need for ownership graph construction. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> Signed-off-by: Andy Stoneberg <astonebe@redhat.com>

andyatmiami force-pushed the feat/agressive-kueue-gating branch 3 times, most recently from 5d1f0da to 4f3f291 Compare March 13, 2026 20:21

valdar requested a review from lburgazzoli March 14, 2026 08:03

andyatmiami force-pushed the feat/agressive-kueue-gating branch from 53359bb to 5934ee6 Compare March 17, 2026 21:20

andyatmiami marked this pull request as ready for review March 17, 2026 21:20

coderabbitai bot reviewed Mar 17, 2026

View reviewed changes

Comment thread pkg/lint/checks/components/kueue/kueue_operator_installed.go

andyatmiami force-pushed the feat/agressive-kueue-gating branch from 5934ee6 to e421b6e Compare March 17, 2026 21:37

lburgazzoli approved these changes Mar 18, 2026

View reviewed changes

lburgazzoli merged commit 95906e8 into opendatahub-io:main Mar 18, 2026
5 checks passed

andyatmiami mentioned this pull request Mar 18, 2026

feat: add aggressive kueue gating and enhanced verbose output red-hat-data-services/odh-cli#184

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

refactor: single data-integrity check for kueue workloads#47

refactor: single data-integrity check for kueue workloads#47
lburgazzoli merged 1 commit intoopendatahub-io:mainfrom
andyatmiami:feat/agressive-kueue-gating

andyatmiami commented Mar 12, 2026 •

edited

Loading

Uh oh!

coderabbitai bot commented Mar 12, 2026 •

edited

Loading

Walkthrough

Estimated code review effort

Issues

Uh oh!

andyatmiami commented Mar 17, 2026

Uh oh!

coderabbitai bot left a comment

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

andyatmiami commented Mar 12, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Description

How Has This Been Tested?

Kueue Unmanaged with "inconsistent" Kueue state in environment

Kueue Managed

Kueue Removed (and/or not present)

Merge criteria:

Summary by CodeRabbit

Uh oh!

coderabbitai bot commented Mar 12, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Estimated code review effort

Issues

Uh oh!

andyatmiami commented Mar 17, 2026

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

andyatmiami commented Mar 12, 2026 •

edited

Loading

coderabbitai bot commented Mar 12, 2026 •

edited

Loading