
AGENT-1366: Report InternalReleaseImageController errors in IRI status#5803

Open
bfournie wants to merge 1 commit into openshift:main from bfournie:iri-controller

Conversation

@bfournie
Contributor

Implemented status condition reporting for the InternalReleaseImageController to report controller errors/issues in the InternalReleaseImage (IRI) status.

- What I did

- How to verify it

- Description for the changelog

@openshift-ci-robot
Contributor

openshift-ci-robot commented Mar 24, 2026

@bfournie: This pull request references AGENT-1366 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.22.0" version, but no target version was set.

Details

In response to this:

Implemented status condition reporting for the InternalReleaseImageController to report controller errors/issues in the InternalReleaseImage (IRI) status.

- What I did

- How to verify it

- Description for the changelog

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@openshift-ci-robot openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label Mar 24, 2026
@coderabbitai

coderabbitai bot commented Mar 24, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.


Walkthrough

The controller's sync now captures a named syncErr and always attempts to update the InternalReleaseImage status Degraded condition on both error and success paths via a new conflict-retry helper that writes the status subresource.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Controller Implementation: `pkg/controller/internalreleaseimage/internalreleaseimage_controller.go` | Introduced a named syncErr return, added a deferred status-update path that sets the Degraded condition based on syncErr, wrapped error returns with context, and added updateInternalReleaseImageStatus which refetches the resource, sets the Degraded condition (SyncError or AsExpected), and updates status with conflict retries. |
| Tests: `pkg/controller/internalreleaseimage/internalreleaseimage_controller_test.go` | Extended TestInternalReleaseImageCreate to assert a successful sync sets Degraded=False with Reason="AsExpected" and added TestInternalReleaseImageStatusOnError to assert Degraded=True with Reason="SyncError" and appropriate messages when prerequisites (ControllerConfig or Secret) are missing; treats API NotFound as nil for assertions. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes


Comment @coderabbitai help to get the list of available commands and usage tips.

@openshift-ci openshift-ci bot requested review from rwsu and zaneb March 24, 2026 22:48
@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Mar 24, 2026

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@pkg/controller/internalreleaseimage/internalreleaseimage_controller.go`:
- Line 340: The error construction assigning to syncErr uses fmt.Errorf("could
not get ControllerConfig %w", err) and is missing the colon before the wrapped
error; update that call to include the colon so it matches the other message
(e.g., change to fmt.Errorf("could not get ControllerConfig: %w", err)) to keep
formatting consistent for the ControllerConfig error path.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 2303d8c9-1670-4b55-8275-8b77035f94ff

📥 Commits

Reviewing files that changed from the base of the PR and between f86723e and 3e22ff1.

📒 Files selected for processing (2)
  • pkg/controller/internalreleaseimage/internalreleaseimage_controller.go
  • pkg/controller/internalreleaseimage/internalreleaseimage_controller_test.go

@bfournie
Contributor Author

/cc @andfasano

@openshift-ci openshift-ci bot requested a review from andfasano March 25, 2026 00:21
@bfournie
Contributor Author

/jira refresh

@openshift-ci-robot
Contributor

openshift-ci-robot commented Mar 25, 2026

@bfournie: This pull request references AGENT-1366 which is a valid jira issue.

Details

In response to this:

/jira refresh

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.


@coderabbitai coderabbitai bot left a comment


Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
pkg/controller/internalreleaseimage/internalreleaseimage_controller.go (1)

340-394: ⚠️ Potential issue | 🟠 Major

Handle status-update failures in error paths instead of dropping them silently

In Lines 341, 348, 359, 366, 374, 380, 385, and 393, updateInternalReleaseImageStatus(...) errors are ignored. If those calls fail, the controller returns syncErr with no signal that status reporting also failed, which undermines this PR’s core behavior.

Suggested refactor
 func (ctrl *Controller) syncInternalReleaseImage(key string) error {
@@
-	// Track sync error to update status at the end
-	var syncErr error
+	reportSyncError := func(syncErr error) error {
+		if statusErr := ctrl.updateInternalReleaseImageStatus(iri, syncErr); statusErr != nil {
+			klog.Warningf("Failed to update InternalReleaseImage status after sync error: %v (sync error: %v)", statusErr, syncErr)
+		}
+		return syncErr
+	}
@@
 	cconfig, err := ctrl.ccLister.Get(ctrlcommon.ControllerConfigName)
 	if err != nil {
-		syncErr = fmt.Errorf("could not get ControllerConfig: %w", err)
-		ctrl.updateInternalReleaseImageStatus(iri, syncErr)
-		return syncErr
+		return reportSyncError(fmt.Errorf("could not get ControllerConfig: %w", err))
 	}
@@
 	iriSecret, err := ctrl.secretLister.Secrets(ctrlcommon.MCONamespace).Get(ctrlcommon.InternalReleaseImageTLSSecretName)
 	if err != nil {
-		syncErr = fmt.Errorf("could not get Secret %s: %w", ctrlcommon.InternalReleaseImageTLSSecretName, err)
-		ctrl.updateInternalReleaseImageStatus(iri, syncErr)
-		return syncErr
+		return reportSyncError(fmt.Errorf("could not get Secret %s: %w", ctrlcommon.InternalReleaseImageTLSSecretName, err))
 	}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/controller/internalreleaseimage/internalreleaseimage_controller.go`
around lines 340 - 394, The status-update calls like
ctrl.updateInternalReleaseImageStatus(...) (and the
initializeInternalReleaseImageStatus/ addFinalizerToInternalReleaseImage paths)
currently ignore their returned error; change each to capture the statusErr and,
if non-nil, combine it with the original syncErr (or wrap both) and return the
combined error so callers know both the sync failure and the status-update
failure occurred. Concretely, after setting syncErr (e.g., syncErr =
fmt.Errorf("...: %w", err)), call statusErr :=
ctrl.updateInternalReleaseImageStatus(iri, syncErr); if statusErr != nil {
return fmt.Errorf("%v; status update failed: %w", syncErr, statusErr) } (apply
this pattern to every place that calls ctrl.updateInternalReleaseImageStatus,
plus the calls to ctrl.initializeInternalReleaseImageStatus and
ctrl.addFinalizerToInternalReleaseImage so their errors are checked and
propagated).

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 78822c2d-cb45-4f29-b47e-2d08c78d3c53

📥 Commits

Reviewing files that changed from the base of the PR and between 3e22ff1 and a7f10da.

📒 Files selected for processing (2)
  • pkg/controller/internalreleaseimage/internalreleaseimage_controller.go
  • pkg/controller/internalreleaseimage/internalreleaseimage_controller_test.go
🚧 Files skipped from review as they are similar to previous changes (1)
  • pkg/controller/internalreleaseimage/internalreleaseimage_controller_test.go

@bfournie
Contributor Author

/test unit

@bfournie
Contributor Author

/retest-required


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@pkg/controller/internalreleaseimage/internalreleaseimage_controller.go`:
- Around line 413-415: The status-update failures are currently only logged (at
the success path and where statusErr is set) which hides transient write errors;
change the reconcile code to return the error instead of klog.Warningf so
failures bubble up and trigger requeue, and update
updateInternalReleaseImageStatus to avoid reading from the cached ctrl.iriLister
inside its retry loop—use the API server via ctrl.client.Get to fetch the latest
InternalReleaseImage prior to each UpdateStatus attempt to avoid stale
resourceVersion conflicts; also propagate errors from the earlier statusErr site
in the same manner.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 46960fb6-ce53-4623-9be0-e3000343918b

📥 Commits

Reviewing files that changed from the base of the PR and between a7f10da and 66a0fed.

📒 Files selected for processing (2)
  • pkg/controller/internalreleaseimage/internalreleaseimage_controller.go
  • pkg/controller/internalreleaseimage/internalreleaseimage_controller_test.go
🚧 Files skipped from review as they are similar to previous changes (1)
  • pkg/controller/internalreleaseimage/internalreleaseimage_controller_test.go

@bfournie
Contributor Author

/retest-required

Implemented status condition reporting for the InternalReleaseImageController
to report controller errors/issues in the InternalReleaseImage (IRI) status.
@bfournie
Contributor Author

/retest

@andfasano
Contributor

/test e2e-agent-compact-ipv4-iso-no-registry

@openshift-ci
Contributor

openshift-ci bot commented Mar 26, 2026

@bfournie: all tests passed!

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@andfasano
Contributor

/lgtm

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Mar 26, 2026
@openshift-ci
Contributor

openshift-ci bot commented Mar 26, 2026

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: andfasano, bfournie

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

