ServerMaintenance: optionally turn on Locator LED#948
Conversation
Closes #421 Signed-off-by: Stefan Hipfel <stefan.hipfel@sap.com>
Signed-off-by: Stefan Hipfel <stefan.hipfel@sap.com>
|
Warning Review limit reached
More reviews will be available in 38 minutes and 50 seconds. Learn how PR review limits work. Your organization has run out of usage credits. Purchase more credits in the billing tab to continue. ⌛ How to resolve this issue?After more reviews become available, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans include higher PR review limits than trial, open-source, and free plans. In all cases, reviews become available again over time. During sustained high-volume PR review activity, CodeRabbit may temporarily slow when the next review becomes available. Please see our Fair Usage Limits Policy for further information. ℹ️ Review info⚙️ Run configurationConfiguration used: Organization UI Review profile: CHILL Plan: Pro Run ID: ⛔ Files ignored due to path filters (1)
📒 Files selected for processing (2)
📝 WalkthroughWalkthroughThis PR adds locator LED control to server maintenance operations. It introduces a new ChangesLocator LED Feature
Sequence DiagramsequenceDiagram
participant User
participant ServerMaintenance Reconciler
participant Server Reconciler
participant BMC Client
participant Redfish System
User->>ServerMaintenance Reconciler: Create ServerMaintenance with LocatorLED
ServerMaintenance Reconciler->>Server Reconciler: Patch Server.Spec.IndicatorLED to LitIndicatorLED
Server Reconciler->>Server Reconciler: ensureIndicatorLED compares desired vs current
Server Reconciler->>BMC Client: SetIndicatorLED(systemURI, LitIndicatorLED)
BMC Client->>Redfish System: Update system.IndicatorLED
Redfish System-->>BMC Client: System updated
BMC Client-->>Server Reconciler: Success
User->>ServerMaintenance Reconciler: Delete ServerMaintenance
ServerMaintenance Reconciler->>Server Reconciler: Patch Server.Spec.IndicatorLED to OffIndicatorLED
Server Reconciler->>BMC Client: SetIndicatorLED(systemURI, OffIndicatorLED)
BMC Client->>Redfish System: Update system.IndicatorLED
Redfish System-->>BMC Client: System updated
BMC Client-->>Server Reconciler: Success
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes 🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
internal/controller/servermaintenance_controller.go (1)
38-40:⚠️ Potential issue | 🟠 Major | ⚡ Quick winAdd RBAC markers for Server resource.
The
ServerMaintenanceReconcilerpatches and updatesServerobjects (lines 254, 275, 311, 337, 385, etc.), but RBAC markers for themetal.ironcore.devserversresource are missing. Without these permissions, the controller ServiceAccount won't be authorized to modify servers. As per coding guidelines, include RBAC markers specifying all required permissions.🔐 Add missing RBAC markers
// +kubebuilder:rbac:groups=metal.ironcore.dev,resources=servermaintenances,verbs=get;list;watch;create;update;patch;delete // +kubebuilder:rbac:groups=metal.ironcore.dev,resources=servermaintenances/status,verbs=get;update;patch // +kubebuilder:rbac:groups=metal.ironcore.dev,resources=servermaintenances/finalizers,verbs=update +// +kubebuilder:rbac:groups=metal.ironcore.dev,resources=servers,verbs=get;list;watch;update;patch +// +kubebuilder:rbac:groups=metal.ironcore.dev,resources=serverclaims,verbs=get;list;patch +// +kubebuilder:rbac:groups=metal.ironcore.dev,resources=serverbootconfigurations,verbs=get;create;delete;patchNote: Also add markers for
serverclaimsandserverbootconfigurationswhich this controller modifies.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@internal/controller/servermaintenance_controller.go` around lines 38 - 40, Add RBAC markers granting this controller permissions to manage the Server, ServerClaim, and ServerBootConfiguration resources: update the top of servermaintenance_controller.go to include +kubebuilder:rbac lines for group=metal.ironcore.dev resources=servers,servers/status,servers/finalizers and likewise for serverclaims and serverbootconfigurations, specifying the verbs the reconciler uses (e.g., get;list;watch;create;update;patch;delete for resources, and get;update;patch for /status, update for /finalizers) so ServerMaintenanceReconciler can patch/update those objects at runtime.Source: Coding guidelines
internal/controller/server_controller.go (1)
583-612:⚠️ Potential issue | 🟠 MajorCall
ensureIndicatorLEDduringServerStateMaintenanceto keep BMC locator LED in sync
internal/controller/server_controller.goonly callsr.ensureIndicatorLED(...)fromhandleAvailableStateandhandleReservedState;handleMaintenanceStatedoes not, soserver.Spec.IndicatorLEDcan change without the BMC LED being reasserted during Maintenance. Add anensureIndicatorLEDcall inhandleMaintenanceState(keep it alongside the other per-reconcile BMC operations).🔧 Proposed fix
func (r *ServerReconciler) handleMaintenanceState(ctx context.Context, bmcClient bmc.BMC, server *metalv1alpha1.Server) (bool, error) { log := ctrl.LoggerFrom(ctx) if server.Spec.ServerMaintenanceRef == nil { log.V(1).Info("Server is in Maintenance state, but no ServerMaintenanceRef is set, transitioning back to previous state") // update system info in case the server was changed during Maintenance state (hardwere changes, biosVersion etc.) if err := r.updateServerStatusFromSystemInfo(ctx, bmcClient, server); err != nil { return false, fmt.Errorf("failed to update server status system info: %w", err) } if err := bmcClient.ClearBootOverride(ctx, server.Spec.SystemURI); err != nil { return false, fmt.Errorf("failed to clear boot override on maintenance exit: %w", err) } if server.Spec.ServerClaimRef == nil { return r.patchServerState(ctx, server, metalv1alpha1.ServerStateInitial) } return r.patchServerState(ctx, server, metalv1alpha1.ServerStateReserved) } // Re-assert the persistent network-boot override on every reconcile while in Maintenance. // This protects against reboots not driven by metal-operator (e.g. a vendor BIOS // upgrade task rebooting the system itself) falling through to disk and starting // the production OS while the host is still being worked on. if err := bmcClient.SetBootOverride(ctx, server.Spec.SystemURI, true); err != nil { return false, fmt.Errorf("failed to set persistent network boot for maintenance: %w", err) } if err := r.ensureServerPowerState(ctx, bmcClient, server); err != nil { return false, fmt.Errorf("failed to ensure server power state: %w", err) } + + if err := r.ensureIndicatorLED(ctx, bmcClient, server); err != nil { + return false, fmt.Errorf("failed to ensure server indicator led: %w", err) + } log.V(1).Info("Reconciled maintenance state") return false, nil }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@internal/controller/server_controller.go` around lines 583 - 612, handleMaintenanceState is missing a call to r.ensureIndicatorLED, so changes to server.Spec.IndicatorLED are not applied while in maintenance; add a call to r.ensureIndicatorLED(ctx, bmcClient, server) in handleMaintenanceState alongside the existing per-reconcile BMC ops (e.g., after SetBootOverride and before/after ensureServerPowerState), and handle its returned error like the others (wrap and return an error on failure) so the BMC locator LED is kept in sync during ServerStateMaintenance.
🧹 Nitpick comments (4)
api/v1alpha1/servermaintenance_types.go (1)
48-51: ⚡ Quick winClarify the LED cleanup behavior in the documentation.
The comment "When maintenance ends, the locator LED is turned off" could be interpreted as "the server's LED is always turned off" or "the LED set by this field is turned off." The implementation only clears the LED when
LocatorLEDis set (not empty). Consider rephrasing for clarity.📝 Suggested documentation improvement
// LocatorLED specifies the desired state of the server's locator LED during maintenance. -// When maintenance ends, the locator LED is turned off. +// If set, the locator LED is cleared (turned off) when maintenance ends. // +optional LocatorLED IndicatorLED `json:"locatorLED,omitempty"`🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@api/v1alpha1/servermaintenance_types.go` around lines 48 - 51, Update the comment for LocatorLED (type IndicatorLED) to clarify that only the LED state explicitly set by the LocatorLED field is cleared at the end of maintenance; specifically, rephrase "When maintenance ends, the locator LED is turned off" to something like "When maintenance ends, the LED state specified by LocatorLED is cleared/turned off — if LocatorLED is unset/empty no change is made." This change should be made next to the LocatorLED field declaration to accurately reflect the behavior implemented for LocatorLED.internal/controller/servermaintenance_controller.go (1)
254-258: ⚡ Quick winUpdate log message to reflect LED state changes.
The method
setAndPatchServerStatenow patches both power and (optionally) LED state, but the log message at line 257 only mentions power state. Update the message to reflect that LED might also be set.💬 Suggested log message improvement
if err := r.setAndPatchServerState(ctx, server, maintenance); err != nil { return ctrl.Result{}, err } - log.V(1).Info("Patched server power state", "Server", server.Name, "Power", maintenance.Spec.ServerPower) + log.V(1).Info("Patched server state", "Server", server.Name, "Power", maintenance.Spec.ServerPower, "IndicatorLED", maintenance.Spec.LocatorLED)🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@internal/controller/servermaintenance_controller.go` around lines 254 - 258, The log message after calling setAndPatchServerState should mention both power and LED changes; update the log.V(1).Info call (the one that currently logs "Patched server power state") to include the LED state as well by referencing maintenance.Spec.ServerPower and the LED field (e.g., maintenance.Spec.ServerLED or the exact LED field name used in the Maintenance type) along with server.Name so the log reflects that setAndPatchServerState may have patched power and/or LED state.internal/controller/server_controller.go (1)
1167-1177: ⚡ Quick winConsider adding structured logging for LED synchronization.
Similar to the BMC layer, logging LED state changes at the controller level would aid troubleshooting when onsite technicians report LED issues. This is particularly valuable for ServerMaintenance scenarios where the LED is used to locate servers requiring physical intervention.
📝 Suggested logging addition
func (r *ServerReconciler) ensureIndicatorLED(ctx context.Context, bmcClient bmc.BMC, server *metalv1alpha1.Server) error { + log := ctrl.LoggerFrom(ctx) if server.Spec.IndicatorLED == "" { return nil } desired := schemas.IndicatorLED(server.Spec.IndicatorLED) //nolint:staticcheck current := schemas.IndicatorLED(server.Status.IndicatorLED) //nolint:staticcheck if desired == current { return nil } + log.V(1).Info("Synchronizing indicator LED", "desired", desired, "current", current) return bmcClient.SetIndicatorLED(ctx, server.Spec.SystemURI, desired) }As per coding guidelines, use structured logging with key-value pairs following Kubernetes conventions.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@internal/controller/server_controller.go` around lines 1167 - 1177, Add structured logging in ensureIndicatorLED: log when IndicatorLED is empty (debug), when desired != current (Info) with key-values server name, namespace, systemURI, desired and current, and after calling bmcClient.SetIndicatorLED log success or failure; on error include the returned error in the log. Use the reconciler logger (r.Log.WithValues(...).Info/Error(...)) and reference the function ensureIndicatorLED, variables server, desired, current, and the call to bmcClient.SetIndicatorLED to locate where to insert these logs.Source: Coding guidelines
bmc/redfish.go (1)
205-213: ⚡ Quick winConsider adding structured logging for LED state changes.
The locator LED is specifically meant to help onsite technicians locate physical hardware. Logging when the LED state is changed would help troubleshoot cases where technicians report the LED is not lit as expected.
📝 Suggested logging addition
func (r *RedfishBaseBMC) SetIndicatorLED(ctx context.Context, systemURI string, state schemas.IndicatorLED) error { //nolint:staticcheck + log := ctrl.LoggerFrom(ctx) system, err := r.getSystemFromUri(ctx, systemURI) if err != nil { return fmt.Errorf("failed to get system: %w", err) } + log.V(1).Info("Setting indicator LED", "SystemURI", systemURI, "State", state) system.IndicatorLED = state //nolint:staticcheck return system.Update() }As per coding guidelines, follow Kubernetes logging conventions: use structured logging with key-value pairs, capitalize the message, and use active voice.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@bmc/redfish.go` around lines 205 - 213, Add structured, Kubernetes-style logging to SetIndicatorLED: after retrieving the system via getSystemFromUri and before calling system.Update(), log the LED change with a capitalized, active-voice message and key-value pairs (e.g., "Changing Indicator LED", system ID/URI, previous state, desired state). Use the existing receiver logger on RedfishBaseBMC (e.g., r.log or the project's logger instance) and ensure the log is emitted on both success path and on error returns (include the error when system.Update() fails) so callers can correlate failures to the attempted state change.Source: Coding guidelines
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Outside diff comments:
In `@internal/controller/server_controller.go`:
- Around line 583-612: handleMaintenanceState is missing a call to
r.ensureIndicatorLED, so changes to server.Spec.IndicatorLED are not applied
while in maintenance; add a call to r.ensureIndicatorLED(ctx, bmcClient, server)
in handleMaintenanceState alongside the existing per-reconcile BMC ops (e.g.,
after SetBootOverride and before/after ensureServerPowerState), and handle its
returned error like the others (wrap and return an error on failure) so the BMC
locator LED is kept in sync during ServerStateMaintenance.
In `@internal/controller/servermaintenance_controller.go`:
- Around line 38-40: Add RBAC markers granting this controller permissions to
manage the Server, ServerClaim, and ServerBootConfiguration resources: update
the top of servermaintenance_controller.go to include +kubebuilder:rbac lines
for group=metal.ironcore.dev resources=servers,servers/status,servers/finalizers
and likewise for serverclaims and serverbootconfigurations, specifying the verbs
the reconciler uses (e.g., get;list;watch;create;update;patch;delete for
resources, and get;update;patch for /status, update for /finalizers) so
ServerMaintenanceReconciler can patch/update those objects at runtime.
---
Nitpick comments:
In `@api/v1alpha1/servermaintenance_types.go`:
- Around line 48-51: Update the comment for LocatorLED (type IndicatorLED) to
clarify that only the LED state explicitly set by the LocatorLED field is
cleared at the end of maintenance; specifically, rephrase "When maintenance
ends, the locator LED is turned off" to something like "When maintenance ends,
the LED state specified by LocatorLED is cleared/turned off — if LocatorLED is
unset/empty no change is made." This change should be made next to the
LocatorLED field declaration to accurately reflect the behavior implemented for
LocatorLED.
In `@bmc/redfish.go`:
- Around line 205-213: Add structured, Kubernetes-style logging to
SetIndicatorLED: after retrieving the system via getSystemFromUri and before
calling system.Update(), log the LED change with a capitalized, active-voice
message and key-value pairs (e.g., "Changing Indicator LED", system ID/URI,
previous state, desired state). Use the existing receiver logger on
RedfishBaseBMC (e.g., r.log or the project's logger instance) and ensure the log
is emitted on both success path and on error returns (include the error when
system.Update() fails) so callers can correlate failures to the attempted state
change.
In `@internal/controller/server_controller.go`:
- Around line 1167-1177: Add structured logging in ensureIndicatorLED: log when
IndicatorLED is empty (debug), when desired != current (Info) with key-values
server name, namespace, systemURI, desired and current, and after calling
bmcClient.SetIndicatorLED log success or failure; on error include the returned
error in the log. Use the reconciler logger
(r.Log.WithValues(...).Info/Error(...)) and reference the function
ensureIndicatorLED, variables server, desired, current, and the call to
bmcClient.SetIndicatorLED to locate where to insert these logs.
In `@internal/controller/servermaintenance_controller.go`:
- Around line 254-258: The log message after calling setAndPatchServerState
should mention both power and LED changes; update the log.V(1).Info call (the
one that currently logs "Patched server power state") to include the LED state
as well by referencing maintenance.Spec.ServerPower and the LED field (e.g.,
maintenance.Spec.ServerLED or the exact LED field name used in the Maintenance
type) along with server.Name so the log reflects that setAndPatchServerState may
have patched power and/or LED state.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: e078b18a-4b99-4144-afac-1cc0cd1c8a09
📒 Files selected for processing (9)
api/v1alpha1/applyconfiguration/api/v1alpha1/servermaintenancespec.goapi/v1alpha1/applyconfiguration/internal/internal.goapi/v1alpha1/servermaintenance_types.gobmc/bmc.gobmc/redfish.goconfig/crd/bases/metal.ironcore.dev_servermaintenances.yamlinternal/controller/server_controller.gointernal/controller/servermaintenance_controller.gointernal/controller/servermaintenance_controller_test.go
Signed-off-by: Stefan Hipfel <stefan.hipfel@sap.com>
Closes #421
Summary by CodeRabbit
Release Notes