Skip to content

Commit 79a8274

Browse files
Merge pull request #672 from vishalanarase/ir-op/reconciliation-events
📖 docs: add proposal for ironic-standalone-operator reconciliation events
2 parents a49a4b2 + 9cae546 commit 79a8274

2 files changed

Lines changed: 116 additions & 0 deletions

File tree

.cspell-config.json

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -82,6 +82,7 @@
8282
"F5",
8383
"filesystem",
8484
"finalizer",
85+
"finalizers",
8586
"FIPS",
8687
"foo",
8788
"GB",
Lines changed: 115 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,115 @@
1+
# Proposal: Kubernetes Events during reconciliation
2+
3+
## Summary
4+
5+
Ironic Standalone Operator should emit Kubernetes Events on the `Ironic`
6+
custom resource to make meaningful state transitions and actionable
7+
failures visible to cluster users (`kubectl describe ironic ...`), without
8+
flooding the cluster with repetitive messages.
9+
10+
This document proposes a small, stable set of **Event reasons**, their **types**
11+
(Normal/Warning), and the **conditions under which they are recorded**.
12+
13+
## Goals
14+
15+
- Surface **actionable** failures (e.g. invalid version, missing
16+
referenced Secret/ConfigMap).
17+
- Surface **meaningful state transitions** (e.g. ready, version change,
18+
operator-created credentials).
19+
- Avoid event spam: **no** per-reconcile "started/finished" events and **no**
20+
transient operational errors (those belong in logs and
21+
`status.conditions`).
22+
- Keep the contract stable so downstream docs/automation can rely on reasons.
23+
24+
## Non-goals
25+
26+
- Emitting events on every child object create/update (too noisy).
27+
- Emitting events for every internal step in `pkg/ironic` (implementation
28+
detail).
29+
30+
## Background
31+
32+
The controller uses a `record.EventRecorder` and already requests RBAC for
33+
`events` (`create`, `patch`). Events should be attached to the `Ironic` CR
34+
because it is the primary user-facing object.
35+
36+
## Proposed event contract
37+
38+
### General rules (anti-spam)
39+
40+
- Do **not** emit an event at the start or successful end of each reconcile
41+
loop (`Reconciling` / `Reconciled` are intentionally omitted).
42+
- Do **not** emit Warning events for **transient** errors (e.g. API conflicts,
43+
temporary network issues, ensure failures that will be retried); use logs
44+
and conditions instead.
45+
- Emit Normal events for **durable, user-visible changes** (e.g. version
46+
change request, new API credentials Secret created).
47+
- Emit events on **Ready condition transitions** (`False → True` / `True →
48+
False`) with messages aligned to `status.conditions`, not on every
49+
intermediate "in progress" status string.
50+
- Do **not** emit separate "in progress" or “error without Ready flip” events;
51+
represent not-ready and error states through **one** path: the Ready
52+
condition and its message (see `IronicReady` / `IronicNotReady` below).
53+
- Prefer Warning events only for **actionable** or **configuration**
54+
problems.
55+
- Use stable `reason` strings; avoid embedding variable data in `reason`.
56+
- Include variable detail in the message (e.g. namespaced name, versions).
57+
58+
### Event reasons
59+
60+
| Reason | Type | When | Message (template) |
61+
|---|---|---|---|
62+
| `VersionError` | Warning | Invalid/unsupported version configuration (user action required) | `Failed to process version: <error>` |
63+
| `DowngradeRejected` | Warning | Downgrade blocked due to external DB constraint | `Downgrade from <installed> to <requested> is not supported with external database` |
64+
| `VersionChange` | Normal | Requested version changed (upgrade/downgrade requested) | `Version change requested from <installed> to <requested>` |
65+
| `APISecretCreated` | Normal | Operator generated new API credentials Secret | `Created new API credentials secret: <name>` |
66+
| `SecretNotFound` | Warning | User referenced a Secret that does not exist | `secret <ns>/<name> not found` |
67+
| `ConfigMapNotFound` | Warning | User referenced a ConfigMap that does not exist | `configmap <ns>/<name> not found` |
68+
| `IronicReady` | Normal | Ready condition transitions to `True` | `Ironic deployment is now ready` |
69+
| `IronicNotReady` | Warning | Ready condition transitions to `False` (includes configuration errors and dependency failures surfaced there) | `<message from Ready condition>` |
70+
71+
Notes:
72+
73+
- `SecretNotFound` and `ConfigMapNotFound` are user-actionable and should stay
74+
aligned with setting failure reasons in Conditions.
75+
- Any “error” or “not ready” situation that does not flip Ready in the same
76+
reconcile should still be reflected in Conditions first; when Ready becomes
77+
`False`, use **`IronicNotReady`** with the same message users see in
78+
`status.conditions` (no separate `IronicError` reason).
79+
80+
## Deletion, finalizers, and cleanup
81+
82+
Debugging deletion, finalizers, and teardown is an **operator** concern.
83+
Visibility for that belongs in **controller logs**, not in user-facing Events
84+
on the `Ironic` CR. No additional Event reasons are proposed for cleanup
85+
flows.
86+
87+
## Implementation plan (high level)
88+
89+
- Ensure `IronicReconciler` has an `EventRecorder` and it is initialized in
90+
`cmd/main.go` (controller-runtime manager recorder).
91+
- Record events only for the reasons in the table above, following the general
92+
rules.
93+
- Keep emission in the controller layer; do not sprinkle events throughout
94+
lower-level helpers unless necessary.
95+
- Add/verify RBAC permissions for `events` are present in generated RBAC.
96+
97+
## Testing plan
98+
99+
- Add unit tests for event emission in the controller using a fake recorder:
100+
- Ready transition emits `IronicReady` once.
101+
- Missing Secret/ConfigMap emits `SecretNotFound` / `ConfigMapNotFound`.
102+
- Invalid version emits `VersionError` (or equivalent user-error path).
103+
- Assert that **transient** reconcile errors do **not** emit Warning Events
104+
(only logs/conditions).
105+
- Validate via e2e/functional tests (optional, if already covering these
106+
flows):
107+
- Deploy an `Ironic` CR and assert a `IronicReady` event appears when Ready
108+
becomes `True`.
109+
110+
## Compatibility and upgrade considerations
111+
112+
- Event reasons form a user-visible contract; changes should be backwards
113+
compatible when possible.
114+
- If renaming reasons is necessary, keep old reasons for at least one release
115+
cycle (or document clearly in release notes).

0 commit comments

Comments
 (0)