fix: guard against nil RootKeySecretRef in TUF init job#1733
fix: guard against nil RootKeySecretRef in TUF init job#1733
Conversation
During an upgrade, the RootKeySecretRef field may be nil if the resource was created by an older operator version. This causes a nil pointer dereference panic in EnsureTufInitJob. Return a clear error instead. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
PR TypeBug fix, Tests Description
|
| Relevant files | |||
|---|---|---|---|
| Bug fix |
| ||
| Tests |
|
Code Review by Qodo
|
| if instance.Spec.RootKeySecretRef == nil { | ||
| return fmt.Errorf("rootKeySecretRef is not set") | ||
| } |
There was a problem hiding this comment.
1. Non-terminal missing rootkey 🐞 Bug ☼ Reliability
When RootKeySecretRef is nil, EnsureTufInitJob returns a plain error, and initJobAction wraps/returns it as a non-terminal error so the controller won’t set ReadyCondition=False and will keep retrying job creation. This undermines the goal of a “clear error state” during upgrades because the failure may only appear in logs/reconcile errors, not in status conditions.
Agent Prompt
### Issue description
`EnsureTufInitJob` returns a non-terminal error when `spec.rootKeySecretRef` is missing. The `tuf-init` action propagates this as a non-terminal reconcile error, so the operator may keep retrying without surfacing a clear `Ready` failure condition.
### Issue Context
This occurs on upgrade paths where older stored CRs can have `spec.rootKeySecretRef` absent.
### Fix Focus Areas
- internal/controller/tuf/actions/tuf_init_job.go[96-122]
- internal/controller/tuf/utils/tuf_init_job.go[24-32]
### Implementation notes
- Add a precondition check in `initJobAction.ensureInitJob`:
- if `instance.Spec.RootKeySecretRef == nil` (and ideally also `Name == ""`), return `i.Error(ctx, reconcile.TerminalError(fmt.Errorf("...")), instance)` (or pass an explicit failure condition).
- Optionally keep the util-level guard as defense-in-depth, but ensure the action surfaces the condition clearly.
ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools
Code Review by Qodo
|
Add TestMigrateJob_NilRootKeySecretRef to verify the migration action handles the case where TUF signing keys are stored externally and RootKeySecretRef is nil. Asserts that: - a terminal error is returned with a clear message - the deployment is not touched (non-destructive) - no migration job is created Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
CI Feedback 🧐A test triggered by this PR failed. Here is an AI-generated analysis of the failure:
|
| func EnsureTufInitJob(instance *rhtasv1alpha1.Tuf, sa string, labels map[string]string, oidcIssuers []string) func(*batchv1.Job) error { | ||
| return func(job *batchv1.Job) error { | ||
| if instance.Spec.RootKeySecretRef == nil { | ||
| return fmt.Errorf("rootKeySecretRef is not set") |
There was a problem hiding this comment.
Throw a Terminal error - this is a configuration error and we should move to Failure status.
Summary
RootKeySecretRefmay be nil if the field wasn't present in the original CRD.EnsureTufInitJobdereferences it without a nil check, causing a panic.rootKeySecretRef is not set) instead of panicking.--export-keysargument).Supersedes #1732
Related: SECURESIGN-4125
Test plan
TestEnsureTufInitJob_NilRootKeySecretRef— verifies nilRootKeySecretRefproduces a clean error, no panicTestEnsureTufInitJob_WithRootKeySecretRef— verifies normal operation and correct job args🤖 Generated with Claude Code