feat(stackdriver_exporter): Add ErrorLogger for promhttp #277
Conversation
04af087 to 29e8667 (Compare)
Looks good!
stackdriver_exporter.go (Outdated)

```diff
-return promhttp.HandlerFor(gatherers, promhttp.HandlerOpts{})
+opts := promhttp.HandlerOpts{}
+if *monitoringEnablePromHttpCustomLogger {
+	h.logger.Log("msg", "Enabling custom logger for promhttp")
```
Nit:

```diff
-h.logger.Log("msg", "Enabling custom logger for promhttp")
+level.Info(h.logger).Log("msg", "Enabling custom logger for promhttp")
```
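For context on the nit: go-kit's `level` package wraps a logger and attaches a `level` key to each record, which is what lets level-based filtering (e.g. a `--log.level` flag) act on it. A minimal sketch of the difference, using go-kit/log directly:

```go
package main

import (
	"os"

	"github.com/go-kit/log"
	"github.com/go-kit/log/level"
)

func main() {
	logger := log.NewLogfmtLogger(os.Stderr)

	// Raw Log call: emits no "level" key, so level-based filtering
	// cannot classify this record.
	logger.Log("msg", "Enabling custom logger for promhttp")

	// level.Info wraps the logger and prepends level=info to the record.
	level.Info(logger).Log("msg", "Enabling custom logger for promhttp")
}
```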
Yes, this needs to be fixed.
@SuperQ I just pushed a change to fix this. Please take a look when you get a chance, thank you!
@SuperQ when you get a chance, mind providing a review? This would be really helpful for us, so we can at least get alerted when we enter this state.
1e43528 to c1559c4 (Compare)
stackdriver_exporter.go (Outdated)

```diff
@@ -236,7 +240,14 @@ func (h *handler) innerHandler(filters map[string]bool) http.Handler {
 	}

 	// Delegate http serving to Prometheus client library, which will call collector.Collect.
-	return promhttp.HandlerFor(gatherers, promhttp.HandlerOpts{})
+	opts := promhttp.HandlerOpts{}
```
This can be simplified to:

```diff
-opts := promhttp.HandlerOpts{}
+opts := promhttp.HandlerOpts{ErrorLog: stdlog.New(log.NewStdlibAdapter(level.Error(h.logger)), "", 0)}
```

There's no need to have a new flag for this; just adding the ErrorLog handler to promhttp is enough.
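For reference, a minimal sketch of the suggested one-liner in context. The `handler` struct and the method signature are simplified here (the real `innerHandler` takes a filters map); `HandlerOpts.ErrorLog` accepts anything with a `Println` method, which the stdlib `*log.Logger` satisfies:

```go
package main

import (
	stdlog "log"
	"net/http"

	"github.com/go-kit/log"
	"github.com/go-kit/log/level"
	"github.com/prometheus/client_golang/prometheus"
	"github.com/prometheus/client_golang/prometheus/promhttp"
)

// handler stands in for the exporter's handler struct; only the logger
// field matters for this sketch.
type handler struct {
	logger log.Logger
}

func (h *handler) innerHandler(gatherers prometheus.Gatherers) http.Handler {
	opts := promhttp.HandlerOpts{
		// Route promhttp's internal errors (e.g. gather failures) through the
		// exporter's go-kit logger at error level: NewStdlibAdapter exposes the
		// go-kit logger as an io.Writer, which stdlog.New wraps into a value
		// satisfying the Println-style interface HandlerOpts.ErrorLog expects.
		ErrorLog: stdlog.New(log.NewStdlibAdapter(level.Error(h.logger)), "", 0),
	}
	// Delegate HTTP serving to the Prometheus client library, which will call
	// collector.Collect on each scrape.
	return promhttp.HandlerFor(gatherers, opts)
}
```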
I had recently experienced prometheus-community#103 and prometheus-community#166 in production and it took quite some time to recognize there was a problem with `stackdriver_exporter`, because nothing was logged to indicate problems gathering metrics. From my perspective, the pod was healthy and online and I could curl `/metrics` to get results. Grafana Agent, however, was getting errors when scraping, specifically errors like so:

```
[from Gatherer prometheus-community#2] collected metric "stackdriver_gce_instance_compute_googleapis_com_instance_disk_write_bytes_count" { label:{name:"device_name" value:"REDACTED_FOR_SECURITY"} label:{name:"device_type" value:"permanent"} label:{name:"instance_id" value:"2924941021702260446"} label:{name:"instance_name" value:"REDACTED_FOR_SECURITY"} label:{name:"project_id" value:"REDACTED_FOR_SECURITY"} label:{name:"storage_type" value:"pd-ssd"} label:{name:"unit" value:"By"} label:{name:"zone" value:"us-central1-a"} counter:{value:0} timestamp_ms:1698871080000} was collected before with the same name and label values
```

To help identify the root cause I've added the ability to opt into logging errors that come from the handler. Specifically, I've created the struct `customPromErrorLogger`, which implements the `promhttp.http.Logger` interface. There is a new flag, `monitoring.enable-promhttp-custom-logger`: if it is set to true, we create an instance of `customPromErrorLogger` and use it as the value for ErrorLogger in `promhttp.Handler{}`. Otherwise, `stackdriver_exporter` works as it did before and does not log errors while collecting metrics.

- refs prometheus-community#103, prometheus-community#166

Signed-off-by: pokom <[email protected]>
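As a sketch of the approach this commit message describes (the type name follows the PR text; promhttp's `Logger` interface only requires `Println(v ...interface{})`), the custom error logger could look like:

```go
package main

import (
	"fmt"

	"github.com/go-kit/log"
	"github.com/go-kit/log/level"
)

// customPromErrorLogger satisfies promhttp's Logger interface by forwarding
// each Println call to a go-kit logger at error level, so gather errors show
// up in the exporter's normal log stream.
type customPromErrorLogger struct {
	logger log.Logger
}

func (l *customPromErrorLogger) Println(v ...interface{}) {
	level.Error(l.logger).Log("msg", fmt.Sprint(v...))
}
```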
e8a6654 to 0e4f42d (Compare)
* [FEATURE] Add ErrorLogger for promhttp #277
* [ENHANCEMENT] Add more info about filters to docs and rename struct fields #198

---------

Signed-off-by: Kyle Eckhart <[email protected]>
I had recently experienced #103 and #166 in production and it took quite some time to recognize there was a problem with `stackdriver_exporter`, because nothing was logged to indicate problems gathering metrics. From my perspective, the pod was healthy and online and I could curl `/metrics` to get results. Grafana Agent, however, was getting errors when scraping, specifically errors like the one quoted in the commit message above.

To help identify the root cause I've added the ability to opt into logging errors that come from the handler. Specifically, I've created the struct `customPromErrorLogger`, which implements the `promhttp.http.Logger` interface. There is a new flag, `monitoring.enable-promhttp-custom-logger`: if it is set to true, we create an instance of `customPromErrorLogger` and use it as the value for ErrorLogger in `promhttp.Handler{}`. Otherwise, `stackdriver_exporter` works as it did before and does not log errors while collecting metrics.
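A hypothetical sketch of the flag-gated wiring this description outlines, assuming kingpin for flag parsing (as Prometheus exporters commonly use) and the `customPromErrorLogger` type sketched earlier; only the flag name comes from the PR text, the other identifiers are illustrative:

```go
package main

import (
	"net/http"

	"github.com/alecthomas/kingpin/v2"
	"github.com/go-kit/log"
	"github.com/prometheus/client_golang/prometheus"
	"github.com/prometheus/client_golang/prometheus/promhttp"
)

// Flag name taken from the PR description; kingpin.Parse() must run in main
// before the flag value is read.
var monitoringEnablePromHttpCustomLogger = kingpin.Flag(
	"monitoring.enable-promhttp-custom-logger",
	"Enable logging of errors from the promhttp handler.",
).Default("false").Bool()

// buildHandler is a stand-in for the exporter's handler construction.
// customPromErrorLogger (sketched earlier) satisfies promhttp.Logger.
func buildHandler(gatherers prometheus.Gatherers, logger log.Logger) http.Handler {
	opts := promhttp.HandlerOpts{}
	if *monitoringEnablePromHttpCustomLogger {
		// Opt in: surface gather errors through the custom logger; otherwise
		// the exporter behaves exactly as it did before this change.
		opts.ErrorLog = &customPromErrorLogger{logger: logger}
	}
	return promhttp.HandlerFor(gatherers, opts)
}
```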