feat(stackdriver_exporter): Add ErrorLogger for promhttp #277
Conversation
04af087 to 29e8667 (Compare)
Looks good!
stackdriver_exporter.go (Outdated)

```diff
-return promhttp.HandlerFor(gatherers, promhttp.HandlerOpts{})
+opts := promhttp.HandlerOpts{}
+if *monitoringEnablePromHttpCustomLogger {
+	h.logger.Log("msg", "Enabling custom logger for promhttp")
```
Nit:

```diff
-h.logger.Log("msg", "Enabling custom logger for promhttp")
+level.Info(h.logger).Log("msg", "Enabling custom logger for promhttp")
```
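For context on the nit: go-kit's `level` package wraps a logger and attaches a `level` key to each record, which is what lets level-based filtering (e.g. a `--log.level` flag) act on it. A minimal sketch of the difference, using go-kit/log directly:

```go
package main

import (
	"os"

	"github.com/go-kit/log"
	"github.com/go-kit/log/level"
)

func main() {
	logger := log.NewLogfmtLogger(os.Stderr)

	// Raw Log call: emits no "level" key, so level-based filtering
	// cannot classify this record.
	logger.Log("msg", "Enabling custom logger for promhttp")

	// level.Info wraps the logger and prepends level=info to the record.
	level.Info(logger).Log("msg", "Enabling custom logger for promhttp")
}
```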
Yes, this needs to be fixed.
@SuperQ I just pushed a change to fix this. Please take a look when you get a chance, thank you!
@SuperQ when you get a chance, mind providing a review? This would be really helpful for us, so we can at least get alerted when we enter this state.
1e43528 to c1559c4 (Compare)
stackdriver_exporter.go (Outdated)

```diff
@@ -236,7 +240,14 @@ func (h *handler) innerHandler(filters map[string]bool) http.Handler {
 	}

 	// Delegate http serving to Prometheus client library, which will call collector.Collect.
-	return promhttp.HandlerFor(gatherers, promhttp.HandlerOpts{})
+	opts := promhttp.HandlerOpts{}
```
This can be simplified to:

```diff
-opts := promhttp.HandlerOpts{}
+opts := promhttp.HandlerOpts{ErrorLog: stdlog.New(log.NewStdlibAdapter(level.Error(h.logger)), "", 0)}
```

There's no need to have a new flag for this; just adding the ErrorLog handler to promhttp is enough.
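For reference, a minimal sketch of the suggested one-liner in context. The `handler` struct and the method signature are simplified here (the real `innerHandler` takes a filters map); `HandlerOpts.ErrorLog` accepts anything with a `Println` method, which the stdlib `*log.Logger` satisfies:

```go
package main

import (
	stdlog "log"
	"net/http"

	"github.com/go-kit/log"
	"github.com/go-kit/log/level"
	"github.com/prometheus/client_golang/prometheus"
	"github.com/prometheus/client_golang/prometheus/promhttp"
)

// handler stands in for the exporter's handler struct; only the logger
// field matters for this sketch.
type handler struct {
	logger log.Logger
}

func (h *handler) innerHandler(gatherers prometheus.Gatherers) http.Handler {
	opts := promhttp.HandlerOpts{
		// Route promhttp's internal errors (e.g. gather failures) through the
		// exporter's go-kit logger at error level: NewStdlibAdapter exposes the
		// go-kit logger as an io.Writer, which stdlog.New wraps into a value
		// satisfying the Println-style interface HandlerOpts.ErrorLog expects.
		ErrorLog: stdlog.New(log.NewStdlibAdapter(level.Error(h.logger)), "", 0),
	}
	// Delegate HTTP serving to the Prometheus client library, which will call
	// collector.Collect on each scrape.
	return promhttp.HandlerFor(gatherers, opts)
}
```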
I had recently experienced prometheus-community#103 and prometheus-community#166 in production and it took quite some time to recognize there was a problem with `stackdriver_exporter`, because nothing was logged to indicate problems gathering metrics. From my perspective, the pod was healthy and online and I could curl `/metrics` to get results. Grafana Agent, however, was getting errors when scraping, specifically errors like so:

```
[from Gatherer prometheus-community#2] collected metric "stackdriver_gce_instance_compute_googleapis_com_instance_disk_write_bytes_count" { label:{name:"device_name" value:"REDACTED_FOR_SECURITY"} label:{name:"device_type" value:"permanent"} label:{name:"instance_id" value:"2924941021702260446"} label:{name:"instance_name" value:"REDACTED_FOR_SECURITY"} label:{name:"project_id" value:"REDACTED_FOR_SECURITY"} label:{name:"storage_type" value:"pd-ssd"} label:{name:"unit" value:"By"} label:{name:"zone" value:"us-central1-a"} counter:{value:0} timestamp_ms:1698871080000} was collected before with the same name and label values
```

To help identify the root cause I've added the ability to opt into logging errors that come from the handler. Specifically, I've created the struct `customPromErrorLogger`, which implements the `promhttp.http.Logger` interface. There is a new flag, `monitoring.enable-promhttp-custom-logger`: if it is set to true, we create an instance of `customPromErrorLogger` and use it as the value for ErrorLogger in `promhttp.Handler{}`. Otherwise, `stackdriver_exporter` works as it did before and does not log errors while collecting metrics.

- refs prometheus-community#103, prometheus-community#166

Signed-off-by: pokom <[email protected]>
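As a sketch of the approach this commit message describes (the type name follows the PR text; promhttp's `Logger` interface only requires `Println(v ...interface{})`), the custom error logger could look like:

```go
package main

import (
	"fmt"

	"github.com/go-kit/log"
	"github.com/go-kit/log/level"
)

// customPromErrorLogger satisfies promhttp's Logger interface by forwarding
// each Println call to a go-kit logger at error level, so gather errors show
// up in the exporter's normal log stream.
type customPromErrorLogger struct {
	logger log.Logger
}

func (l *customPromErrorLogger) Println(v ...interface{}) {
	level.Error(l.logger).Log("msg", fmt.Sprint(v...))
}
```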
e8a6654 to 0e4f42d (Compare)
* [FEATURE] Add ErrorLogger for promhttp #277
* [ENHANCEMENT] Add more info about filters to docs and rename struct fields #198

---------

Signed-off-by: Kyle Eckhart <[email protected]>
I had recently experienced #103 and #166 in production and it took quite some time to recognize there was a problem with `stackdriver_exporter`, because nothing was logged to indicate problems gathering metrics. From my perspective, the pod was healthy and online and I could curl `/metrics` to get results. Grafana Agent, however, was getting errors when scraping, specifically errors like the one quoted in the commit message above.

To help identify the root cause I've added the ability to opt into logging errors that come from the handler. Specifically, I've created the struct `customPromErrorLogger`, which implements the `promhttp.http.Logger` interface. There is a new flag, `monitoring.enable-promhttp-custom-logger`: if it is set to true, we create an instance of `customPromErrorLogger` and use it as the value for ErrorLogger in `promhttp.Handler{}`. Otherwise, `stackdriver_exporter` works as it did before and does not log errors while collecting metrics.
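A hypothetical sketch of the flag-gated wiring this description outlines, assuming kingpin for flag parsing (as Prometheus exporters commonly use) and the `customPromErrorLogger` type sketched earlier; only the flag name comes from the PR text, the other identifiers are illustrative:

```go
package main

import (
	"net/http"

	"github.com/alecthomas/kingpin/v2"
	"github.com/go-kit/log"
	"github.com/prometheus/client_golang/prometheus"
	"github.com/prometheus/client_golang/prometheus/promhttp"
)

// Flag name taken from the PR description; kingpin.Parse() must run in main
// before the flag value is read.
var monitoringEnablePromHttpCustomLogger = kingpin.Flag(
	"monitoring.enable-promhttp-custom-logger",
	"Enable logging of errors from the promhttp handler.",
).Default("false").Bool()

// buildHandler is a stand-in for the exporter's handler construction.
// customPromErrorLogger (sketched earlier) satisfies promhttp.Logger.
func buildHandler(gatherers prometheus.Gatherers, logger log.Logger) http.Handler {
	opts := promhttp.HandlerOpts{}
	if *monitoringEnablePromHttpCustomLogger {
		// Opt in: surface gather errors through the custom logger; otherwise
		// the exporter behaves exactly as it did before this change.
		opts.ErrorLog = &customPromErrorLogger{logger: logger}
	}
	return promhttp.HandlerFor(gatherers, opts)
}
```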