pkg/services/otelhealth: fix uptime_seconds metric type from Gauge to Counter#1847
✅ API Diff Results - No breaking changes
Pull request overview
Aligns the OpenTelemetry health-check uptime metric with the existing Prometheus implementation by reporting total accumulated uptime rather than repeatedly recording the fixed polling interval.
Changes:
- Switch `uptime_seconds` from an OTel `Float64Gauge` to `Float64Counter`.
- Update reporting from `Record()` to `Add()` so uptime accumulates over time.
- Clarify the metric description to indicate it is measured in seconds.
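
To illustrate why the instrument type matters, here is a minimal standard-library sketch of the two behaviors. It does not use the OpenTelemetry API: `gauge`, `counter`, and `simulate` are hypothetical stand-ins, and the 15-second polling interval is an assumed value, not one taken from this PR.

```go
package main

import "fmt"

// gauge is a toy stand-in for a last-value instrument:
// each Record overwrites whatever was recorded before.
type gauge struct{ value float64 }

func (g *gauge) Record(v float64) { g.value = v }

// counter is a toy stand-in for a monotonic sum instrument:
// each Add accumulates on top of the running total.
type counter struct{ total float64 }

func (c *counter) Add(v float64) {
	if v < 0 {
		return // a monotonic counter ignores negative increments
	}
	c.total += v
}

// simulate reports the polling interval to both instruments on every
// tick and returns the value each one would expose afterwards.
func simulate(ticks int, intervalSeconds float64) (gaugeValue, counterValue float64) {
	var g gauge
	var c counter
	for i := 0; i < ticks; i++ {
		g.Record(intervalSeconds)
		c.Add(intervalSeconds)
	}
	return g.value, c.total
}

func main() {
	// Four ticks at an assumed 15-second interval.
	g, c := simulate(4, 15)
	fmt.Printf("gauge=%v counter=%v\n", g, c) // prints "gauge=15 counter=60"
}
```

Under these assumptions the gauge always shows the fixed interval, while the counter shows total elapsed uptime, which is the behavior this PR aims for.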
  	return services.HealthCheckerConfig{}, err
  }
- uptimeSeconds, err := meter.Float64Gauge("uptime_seconds", metric.WithDescription("Uptime of the service"))
+ uptimeSeconds, err := meter.Float64Counter("uptime_seconds", metric.WithDescription("Uptime of the service in seconds"))
uptime_seconds is now a Counter and the description says "Uptime of the service in seconds", but it doesn’t indicate the value is cumulative/monotonic (i.e., total uptime accumulated since start). Consider updating the description to explicitly reflect counter semantics to avoid confusion in dashboards/alerts.
Also, this metric represents seconds; most other OTel duration metrics in this repo set an explicit unit (e.g. metric.WithUnit("s") in pkg/settings/limits/time.go). Consider adding metric.WithUnit("s") here as well for consistency and better backend rendering.
Suggested change:
- uptimeSeconds, err := meter.Float64Counter("uptime_seconds", metric.WithDescription("Uptime of the service in seconds"))
+ uptimeSeconds, err := meter.Float64Counter(
+ 	"uptime_seconds",
+ 	metric.WithDescription("Total cumulative uptime of the service in seconds since start"),
+ 	metric.WithUnit("s"),
+ )
Summary
- Change the `uptime_seconds` OTel metric from `Float64Gauge` to `Float64Counter` and use `Add()` instead of `Record()`.
- The Prometheus implementation (`promhealth`) already correctly uses a Counter; this aligns the OTel implementation to match.