Surface CollectorDaemonSetExists status condition on OperatorConfig#1854
Surface CollectorDaemonSetExists status condition on OperatorConfig#1854AnkanMisra wants to merge 6 commits intoGoogleCloudPlatform:mainfrom
Conversation
|
Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA). View this failed invocation of the CLA check for more information. For the most up to date status, view the checks section at the bottom of the pull request. |
Re-added GenerationChangedPredicate to DaemonSet watcher to avoid noise from status updates (while still catching deletions). Added GenerationChangedPredicate to OperatorConfig watcher to prevent feedback loops from status updates.
Address CodeRabbit feedback: initialize condition status to Unknown instead of True, so that on unexpected errors (network issues, RBAC), the condition reflects uncertainty rather than falsely indicating the DaemonSet exists.
|
/gemini review |
There was a problem hiding this comment.
Code Review
This pull request introduces a status subresource for the OperatorConfig custom resource, allowing the operator to report its operational state, including conditions like the existence of the collector DaemonSet. The changes involve updating CRD definitions, API documentation, Go types (OperatorConfigStatus, MonitoringConditionType), and the reconciler logic to manage and update the OperatorConfig's status. The review comments highlight an inaccuracy in the CRD and manifest definitions, where the description for the conditions array incorrectly refers to PodMonitoring instead of OperatorConfig.
| description: Represents the latest available observations of a podmonitor's | ||
| current state. |
There was a problem hiding this comment.
The description for the conditions array appears to be copied from PodMonitoring and is not entirely accurate for OperatorConfig. It currently states "Represents the latest available observations of a podmonitor's current state." This should be updated to reflect that these conditions apply to the OperatorConfig itself, or be more generalized.
description: Represents the latest available observations of an OperatorConfig's
current state.| description: Represents the latest available observations of a podmonitor's current state. | ||
| items: |
There was a problem hiding this comment.
|
@AnkanMisra Thank you for this PR. However, we cannot review and approve PRs without the CLA signed. Gemini Code Assist also flagged a couple changes needed. |
|
@bernot-dev |
Closes #1830
Summary
Adds a
CollectorDaemonSetExistsstatus condition toOperatorConfigto surface when the collector DaemonSet is missing.Problem
When the collector DaemonSet (
gmp-system/collector) is deleted, the operator logs a warning but returns success without surfacing any status condition. This leaves metrics collection silently broken with no way for users to detect the outage via the API.Solution
CollectorDaemonSetExistscondition toOperatorConfig.StatusTruewhen the DaemonSet exists,Falsewith reasonDaemonSetMissingwhen not foundUnknownto handle unexpected API errors correctlyThis only checks DaemonSet existence, not pod readiness, and does not auto recreate deleted DaemonSets