metrics, metrics/prometheus: fix histogram count and sum to be cumulative #34658

Open
knQzx wants to merge 1 commit into ethereum:master from knQzx:fix-prometheus-counter-monotonicity

knQzx commented Apr 4, 2026

Summary

  • _count and _sum for prometheus histograms backed by ResettingSample and ResettingTimer reset on every scrape instead of increasing monotonically
  • this broke rate(), increase(), and average latency calculations in prometheus/grafana, which assume these series only decrease when the process restarts
  • added cumulative tracking across snapshots so the exported counters only go up, as the prometheus spec requires
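
To illustrate why per-scrape resets skew the results, here is a minimal sketch of counter-reset handling. It assumes a simplified model of increase() in which any drop between two scrapes is read as a reset and only the post-reset value is added; real Prometheus also extrapolates over the query window, which is ignored here. The names increaseWithResets and cumulative are hypothetical helpers, not part of go-ethereum or Prometheus.

```go
package main

import "fmt"

// increaseWithResets is a simplified, hypothetical model of how increase()
// reads a counter series: a drop between consecutive scrapes is treated as a
// counter reset, so only the value seen after the drop is added.
func increaseWithResets(scrapes []float64) float64 {
	total := 0.0
	for i := 1; i < len(scrapes); i++ {
		prev, cur := scrapes[i-1], scrapes[i]
		if cur < prev {
			// Looks like a restart from zero: count only the new value.
			total += cur
		} else {
			total += cur - prev
		}
	}
	return total
}

// cumulative turns per-window counts into running totals.
func cumulative(windows []float64) []float64 {
	out := make([]float64, len(windows))
	sum := 0.0
	for i, w := range windows {
		sum += w
		out[i] = sum
	}
	return out
}

func main() {
	// Events observed in three consecutive scrape intervals.
	perWindow := []float64{5, 3, 7}

	// Exporting the per-window count as _count: 10 events happened after
	// the first scrape, but only 7 are recovered.
	fmt.Println(increaseWithResets(perWindow)) // prints 7

	// Exporting the running total as _count recovers all 10.
	fmt.Println(increaseWithResets(cumulative(perWindow))) // prints 10
}
```

In this model the loss appears whenever a window's count is not lower than the previous one, because the reset goes undetected and only the difference is added.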

What changed

  • resetting_timer.go: added totalCount/totalSum fields that accumulate across snapshots, exposed via TotalCount()/TotalSum()
  • resetting_sample.go: added cumulative count/sum tracking with mutex, snapshot now returns cumulative values
  • prometheus/collector.go: switched to TotalCount() for the prometheus _count metric
  • added 3 tests verifying cumulative behavior across multiple snapshots and scrapes

Test plan

  • all existing metrics tests pass
  • 3 new tests verify cumulative counters across snapshots
  • per-window Count() for non-prometheus consumers (influxdb, exp) unchanged

fixes #34628

resetting timers and resetting samples now track cumulative count and
sum across snapshots so that prometheus _count and _sum metrics
monotonically increase as required by the prometheus spec. the per-window
values and percentiles still reset normally.
knQzx changed the title from "fix prometheus histogram count and sum to be cumulative" to "metrics, metrics/prometheus: fix histogram count and sum to be cumulative" on Apr 4, 2026

Development

Successfully merging this pull request may close these issues.

Prometheus metrics violate counter monotonicity convention: _count and _sum reset on every scrape due to ResettingSample
