/kind bug
ROSARoleConfig controller panic: nil pointer dereference in reconcileAccountRoles()
The ROSARoleConfig controller intermittently crashes with a nil pointer dereference panic during the first reconciliation of new ROSARoleConfig resources. The crash occurs in the reconcileAccountRoles() function when attempting to access r.Runtime.AWSClient.
Impact:
- Some ROSARoleConfig resources successfully create their AWS IAM roles, taking between 14s and 2m37s
- Others fail and remain stuck for 15+ minutes before timeout
- The crash is non-deterministic, making ROSA HCP cluster provisioning unreliable
Full Panic Stack Trace:
E0205 09:57:27.912082 1 signal_unix.go:925] "Observed a panic"
controller="rosaroleconfig"
controllerGroup="infrastructure.cluster.x-k8s.io"
controllerKind="ROSARoleConfig"
ROSARoleConfig="ns-rosa-hcp/abc-ui-rosa-hcp-test-roles"
namespace="ns-rosa-hcp"
name="abc-ui-rosa-hcp-test-roles"
reconcileID="3872e30c-f6c7-404d-89f0-4d9ad903821d"
panic="runtime error: invalid memory address or nil pointer dereference"
panicGoValue="\"invalid memory address or nil pointer dereference\""
stacktrace=<
panic({0x5357620?, 0x8a61240?})
/usr/lib/golang/src/runtime/panic.go:792 +0x132
Controller Error:
E0205 09:57:27.912147 1 controller.go:347] "Reconciler error"
err="panic: runtime error: invalid memory address or nil pointer dereference [recovered]"
controller="rosaroleconfig"
controllerGroup="infrastructure.cluster.x-k8s.io"
controllerKind="ROSARoleConfig"
ROSARoleConfig="ns-rosa-hcp/abc-ui-rosa-hcp-test-roles"
namespace="ns-rosa-hcp"
name="abc-ui-rosa-hcp-test-roles"
reconcileID="3872e30c-f6c7-404d-89f0-4d9ad903821d"
Additional Context
The crash is sometimes recoverable:
Successful ROSARoleConfigs (recovered from crash or avoided it):
- test-rosa-hcp-roles: 14 seconds
- new-rosa-hcp-test-roles: 2 minutes 37 seconds
Failed ROSARoleConfigs (stuck in crash loop):
- tst-rosa-hcp-roles: 15+ minutes before manual deletion
- abc-ui-rosa-hcp-test-roles: Crashed on first reconcile