Skip to content

fix: use safe prometheus.Register instead of MustRegister in main#522

Merged
tekenstam merged 3 commits intomasterfrom
fix/prometheus-safe-register
Mar 23, 2026
Merged

fix: use safe prometheus.Register instead of MustRegister in main#522
tekenstam merged 3 commits intomasterfrom
fix/prometheus-safe-register

Conversation

@tekenstam
Copy link
Member

Summary

  • Replaces prometheus.MustRegister(cacheCollector, controllerCollector) in main.go with individual prometheus.Register(...) calls that handle AlreadyRegisteredError gracefully

Why

prometheus.MustRegister panics if the same collector is registered more than once in the same process. While main() is only called once in normal operation, this pattern is fragile:

  • If any future test exercises a code path that calls main() or triggers metric registration, it will panic on the second run
  • It's inconsistent with best practice for testable Go code

Found during a cross-repo audit after fixing a similar issue in lifecycle-manager (where prometheus.MustRegister in a method called directly from tests was causing EOF errors in Test_Metrics).

Change

// Before
prometheus.MustRegister(cacheCollector, controllerCollector)

// After
if err := prometheus.Register(cacheCollector); err != nil {
    if _, ok := err.(prometheus.AlreadyRegisteredError); !ok {
        setupLog.Error(err, "failed to register cache metrics collector")
        os.Exit(1)
    }
}
if err := prometheus.Register(controllerCollector); err != nil {
    if _, ok := err.(prometheus.AlreadyRegisteredError); !ok {
        setupLog.Error(err, "failed to register controller metrics collector")
        os.Exit(1)
    }
}

No behavior change in production — the first registration always succeeds and the AlreadyRegisteredError branch is never taken. Any other registration error still causes a fatal exit.

Test plan

  • go build ./... — compiles cleanly
  • go test $(go list ./... | grep -v test-bdd) — all unit tests pass

🤖 Generated with Claude Code

prometheus.MustRegister panics if the same collector is registered more
than once in the same process (e.g. in tests, or if main were ever
invoked multiple times). Replace with prometheus.Register and silently
accept AlreadyRegisteredError while still fatally exiting on any other
registration failure.

No behavior change in normal production operation where registration
only happens once.

Signed-off-by: Todd Ekenstam <todd_ekenstam@intuit.com>
@tekenstam tekenstam requested review from a team as code owners March 22, 2026 17:22
@codecov-commenter
Copy link

codecov-commenter commented Mar 22, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 46.89%. Comparing base (eada4e6) to head (26bcf6a).

Additional details and impacted files
@@           Coverage Diff           @@
##           master     #522   +/-   ##
=======================================
  Coverage   46.89%   46.89%           
=======================================
  Files          40       40           
  Lines        5879     5879           
=======================================
  Hits         2757     2757           
  Misses       2964     2964           
  Partials      158      158           

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@kevdowney
Copy link
Contributor

The lint check is failing due to it requires go 1.26, we might want to bump that down.

Todd Ekenstam added 2 commits March 22, 2026 20:58
golangci-lint v2.1.1 is compiled with Go 1.24 and panics when
analyzing files that require Go 1.26. Pinning to 1.24 matches
go.mod and avoids the incompatibility with `stable` resolving to 1.26.1.

Signed-off-by: Todd Ekenstam <todd_ekenstam@intuit.com>
The entire codebase uses aws-sdk-go v1 which is now deprecated upstream.
Suppress the staticcheck SA1019 warnings until the migration tracked in
#523 is complete.

Signed-off-by: Todd Ekenstam <todd_ekenstam@intuit.com>
@tekenstam tekenstam merged commit c583fa5 into master Mar 23, 2026
7 checks passed
@tekenstam tekenstam deleted the fix/prometheus-safe-register branch March 23, 2026 13:06
tekenstam pushed a commit that referenced this pull request Mar 23, 2026
Resolves conflict in .github/workflows/golangci-lint.yml by keeping
GO_VERSION: stable and GOLANGCI_LINT_VERSION: v2.11.4 from PR branch
(required for Go 1.26 support). Includes safe prometheus.Register
fix from master (PR #522) and SA1019 suppression rule.

Signed-off-by: Tomas Ekenstam <tekenstam@gmail.com>
Signed-off-by: Todd Ekenstam <todd_ekenstam@intuit.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants