|
| 1 | +# Proposal: Kubernetes Events during reconciliation |
| 2 | + |
| 3 | +## Summary |
| 4 | + |
| 5 | +Ironic Standalone Operator should emit Kubernetes Events on the `Ironic` |
| 6 | +custom resource to make meaningful state transitions and actionable |
| 7 | +failures visible to cluster users (`kubectl describe ironic ...`), without |
| 8 | +flooding the cluster with repetitive messages. |
| 9 | + |
| 10 | +This document proposes a small, stable set of **Event reasons**, their **types** |
| 11 | +(Normal/Warning), and the **conditions under which they are recorded**. |
| 12 | + |
| 13 | +## Goals |
| 14 | + |
| 15 | +- Surface **actionable** failures (e.g. invalid version, missing |
| 16 | + referenced Secret/ConfigMap). |
| 17 | +- Surface **meaningful state transitions** (e.g. ready, version change, |
| 18 | + operator-created credentials). |
| 19 | +- Avoid event spam: **no** per-reconcile "started/finished" events and **no** |
| 20 | + transient operational errors (those belong in logs and |
| 21 | + `status.conditions`). |
| 22 | +- Keep the contract stable so downstream docs/automation can rely on reasons. |
| 23 | + |
| 24 | +## Non-goals |
| 25 | + |
| 26 | +- Emitting events on every child object create/update (too noisy). |
| 27 | +- Emitting events for every internal step in `pkg/ironic` (implementation |
| 28 | + detail). |
| 29 | + |
| 30 | +## Background |
| 31 | + |
| 32 | +The controller uses a `record.EventRecorder` and already requests RBAC for |
| 33 | +`events` (`create`, `patch`). Events should be attached to the `Ironic` CR |
| 34 | +because it is the primary user-facing object. |
| 35 | + |
| 36 | +## Proposed event contract |
| 37 | + |
| 38 | +### General rules (anti-spam) |
| 39 | + |
| 40 | +- Do **not** emit an event at the start or successful end of each reconcile |
| 41 | + loop (`Reconciling` / `Reconciled` are intentionally omitted). |
| 42 | +- Do **not** emit Warning events for **transient** errors (e.g. API conflicts, |
| 43 | + temporary network issues, ensure failures that will be retried); use logs |
| 44 | + and conditions instead. |
| 45 | +- Emit Normal events for **durable, user-visible changes** (e.g. version |
| 46 | + change request, new API credentials Secret created). |
| 47 | +- Emit events on **Ready condition transitions** (`False → True` / `True → |
| 48 | + False`) with messages aligned to `status.conditions`, not on every |
| 49 | + intermediate "in progress" status string. |
| 50 | +- Do **not** emit separate "in progress" or “error without Ready flip” events; |
| 51 | + represent not-ready and error states through **one** path: the Ready |
| 52 | + condition and its message (see `IronicReady` / `IronicNotReady` below). |
| 53 | +- Prefer Warning events only for **actionable** or **configuration** |
| 54 | + problems. |
| 55 | +- Use stable `reason` strings; avoid embedding variable data in `reason`. |
| 56 | +- Include variable detail in the message (e.g. namespaced name, versions). |
| 57 | + |
| 58 | +### Event reasons |
| 59 | + |
| 60 | +| Reason | Type | When | Message (template) | |
| 61 | +|---|---|---|---| |
| 62 | +| `VersionError` | Warning | Invalid/unsupported version configuration (user action required) | `Failed to process version: <error>` | |
| 63 | +| `DowngradeRejected` | Warning | Downgrade blocked due to external DB constraint | `Downgrade from <installed> to <requested> is not supported with external database` | |
| 64 | +| `VersionChange` | Normal | Requested version changed (upgrade/downgrade requested) | `Version change requested from <installed> to <requested>` | |
| 65 | +| `APISecretCreated` | Normal | Operator generated new API credentials Secret | `Created new API credentials secret: <name>` | |
| 66 | +| `SecretNotFound` | Warning | User referenced a Secret that does not exist | `secret <ns>/<name> not found` | |
| 67 | +| `ConfigMapNotFound` | Warning | User referenced a ConfigMap that does not exist | `configmap <ns>/<name> not found` | |
| 68 | +| `IronicReady` | Normal | Ready condition transitions to `True` | `Ironic deployment is now ready` | |
| 69 | +| `IronicNotReady` | Warning | Ready condition transitions to `False` (includes configuration errors and dependency failures surfaced there) | `<message from Ready condition>` | |
| 70 | + |
| 71 | +Notes: |
| 72 | + |
| 73 | +- `SecretNotFound` and `ConfigMapNotFound` are user-actionable and should stay |
| 74 | + aligned with setting failure reasons in Conditions. |
| 75 | +- Any “error” or “not ready” situation that does not flip Ready in the same |
| 76 | + reconcile should still be reflected in Conditions first; when Ready becomes |
| 77 | + `False`, use **`IronicNotReady`** with the same message users see in |
| 78 | + `status.conditions` (no separate `IronicError` reason). |
| 79 | + |
| 80 | +## Deletion, finalizers, and cleanup |
| 81 | + |
| 82 | +Debugging deletion, finalizers, and teardown is an **operator** concern. |
| 83 | +Visibility for that belongs in **controller logs**, not in user-facing Events |
| 84 | +on the `Ironic` CR. No additional Event reasons are proposed for cleanup |
| 85 | +flows. |
| 86 | + |
| 87 | +## Implementation plan (high level) |
| 88 | + |
| 89 | +- Ensure `IronicReconciler` has an `EventRecorder` and it is initialized in |
| 90 | + `cmd/main.go` (controller-runtime manager recorder). |
| 91 | +- Record events only for the reasons in the table above, following the general |
| 92 | + rules. |
| 93 | +- Keep emission in the controller layer; do not sprinkle events throughout |
| 94 | + lower-level helpers unless necessary. |
| 95 | +- Add/verify RBAC permissions for `events` are present in generated RBAC. |
| 96 | + |
| 97 | +## Testing plan |
| 98 | + |
| 99 | +- Add unit tests for event emission in the controller using a fake recorder: |
| 100 | + - Ready transition emits `IronicReady` once. |
| 101 | + - Missing Secret/ConfigMap emits `SecretNotFound` / `ConfigMapNotFound`. |
| 102 | + - Invalid version emits `VersionError` (or equivalent user-error path). |
| 103 | +- Assert that **transient** reconcile errors do **not** emit Warning Events |
| 104 | + (only logs/conditions). |
| 105 | +- Validate via e2e/functional tests (optional, if already covering these |
| 106 | + flows): |
| 107 | + - Deploy an `Ironic` CR and assert a `IronicReady` event appears when Ready |
| 108 | + becomes `True`. |
| 109 | + |
| 110 | +## Compatibility and upgrade considerations |
| 111 | + |
| 112 | +- Event reasons form a user-visible contract; changes should be backwards |
| 113 | + compatible when possible. |
| 114 | +- If renaming reasons is necessary, keep old reasons for at least one release |
| 115 | + cycle (or document clearly in release notes). |
0 commit comments