Conversation

@drolando-stripe drolando-stripe commented Sep 2, 2025

We'd like a way to monitor the number of KEDA errors caused by empty responses from Prometheus after enabling the ignoreNullValues flag for most of our Prometheus triggers.

Right now this error gets logged, but the error metric that KEDA emits is generic and doesn't differentiate by error type.

Tests

I tried adding an e2e test, but the e2e tests don't actually run Prometheus, so all queries fail with `dial tcp: lookup keda-prometheus.keda.svc.cluster.local on 10.96.0.10:53: no such host`. The only two tests in tests/sequential/prometheus_metrics/prometheus_metrics_test.go that use a prometheus trigger only look for errors and don't actually run the query.

I could add a test that shows that the metric exists and is zero, but that doesn't seem very useful.

Checklist

  • When introducing a new scaler, I agree with the scaling governance policy N/A
  • I have verified that my change is according to the deprecations & breaking changes policy
  • Tests have been added
  • Changelog has been updated and is aligned with our changelog requirements
  • A PR is opened to update our Helm chart (repo) (if applicable, ie. when deployment manifests are modified) N/A
  • A PR is opened to update the documentation on (repo) (if applicable) N/A
  • Commits are signed with Developer Certificate of Origin (DCO - learn more)

Fixes #7062

github-actions bot commented Sep 2, 2025

Thank you for your contribution! 🙏

Please understand that we will do our best to review your PR and give you feedback as soon as possible, but please bear with us if it takes a little longer than expected.

While you are waiting, make sure to:

  • Add an entry in our changelog in alphabetical order and link the related issue
  • Update the documentation, if needed
  • Add unit & e2e tests for your changes
  • Make sure GitHub checks are passing
  • Is the DCO check failing? Here is how you can fix DCO issues

Once the initial tests are successful, a KEDA member will ensure that the e2e tests are run. Once the e2e tests have been successfully completed, the PR may be merged at a later date. Please be patient.

Learn more about our contribution guide.

@keda-automation keda-automation requested review from a team September 2, 2025 22:24
@drolando-stripe drolando-stripe force-pushed the drolando/add_empty_response_metric branch 2 times, most recently from 3f967e9 to 4a13d88 on September 3, 2025 18:56
@drolando-stripe drolando-stripe force-pushed the drolando/add_empty_response_metric branch from 7cd8675 to 6f1d422 on September 25, 2025 22:13
@drolando-stripe drolando-stripe force-pushed the drolando/add_empty_response_metric branch from 6f1d422 to d642098 on September 25, 2025 22:15
@drolando-stripe drolando-stripe marked this pull request as ready for review September 25, 2025 22:15
Contributor

@aliaqel-stripe aliaqel-stripe left a comment


Looks good to me. You should open a PR in the keda-docs repository to update the documentation to add this metric. My only question is whether the maintainers would like this metric to be more generic, but like you said, the prometheus scaler is the only scaler with isNullOrEmpty as a configurable field.

@wozniakjan I know you're in charge of cutting this release. Wanted to put this on your radar.

@JorTurFer
Member

JorTurFer commented Sep 29, 2025

Hello,
Personally, I'd not merge this as-is, since it exposes scaler-specific info as a KEDA metric in a way that only fits Prometheus.
I'd prefer to do it in a generic way, e.g. adding a new metric for empty upstream responses. With this, we can cover Prometheus but also other scalers with support for empty responses.

tbh, I think this PR is a great starting point, and just updating it a bit we can make it generic

@rickbrouwer
Member

> Personally, I'd not merge this as-is, since it exposes scaler-specific info as a KEDA metric in a way that only fits Prometheus. I'd prefer to do it in a generic way, e.g. adding a new metric for empty upstream responses. With this, we can cover Prometheus but also other scalers with support for empty responses.
>
> tbh, I think this PR is a great starting point, and just updating it a bit we can make it generic

Agree

```go
prometheus.CounterOpts{
	Namespace: DefaultPromMetricsNamespace,
	Subsystem: "prometheus",
	Name:      "metrics_empty_error_total",
```
Contributor


@rickbrouwer @JorTurFer Thoughts on naming this `empty_upstream_responses_total`?

Member


`upstream` is quite generic; what about `scaler`/`trigger` instead?

@drolando-stripe drolando-stripe force-pushed the drolando/add_empty_response_metric branch from a30f2bc to 3ad2203 Compare September 29, 2025 17:37
Signed-off-by: Daniele Rolando <[email protected]>
@zroubalik zroubalik mentioned this pull request Sep 30, 2025
Co-authored-by: Jorge Turrado Ferrero <[email protected]>
Signed-off-by: drolando-stripe <[email protected]>
@keda-automation keda-automation requested a review from a team October 3, 2025 19:09
Co-authored-by: Jorge Turrado Ferrero <[email protected]>
Signed-off-by: drolando-stripe <[email protected]>
Member

@JorTurFer JorTurFer left a comment


Looking nice! Only one small thing inline about the implementation.
Could you include this new metric during prometheus/otel e2e test?

Comment on lines +212 to +216
```go
func RecordEmptyUpstreamResponse() {
	for _, element := range collectors {
		element.RecordEmptyUpstreamResponse()
	}
}
```
Member


Let's enrich this metric with information about the ScaledJob|ScaledObject and trigger that recorded the empty response. You can see how it's done with other metrics like RecordCloudEventEmittedError.
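To illustrate the labeling the reviewer is asking for, here is a minimal, self-contained sketch of a per-trigger counter of empty responses. It uses a plain map rather than KEDA's actual Prometheus collector plumbing, and all names (`emptyResponseCounter`, the label set of namespace/ScaledObject/trigger) are illustrative assumptions, not KEDA's real API:

```go
package main

import (
	"fmt"
	"sync"
)

// emptyResponseCounter is a hypothetical stand-in for a labeled Prometheus
// counter: each (namespace, scaledObject, trigger) tuple gets its own series,
// so operators can see which scaler is returning empty results.
type emptyResponseCounter struct {
	mu     sync.Mutex
	counts map[string]int
}

func newEmptyResponseCounter() *emptyResponseCounter {
	return &emptyResponseCounter{counts: make(map[string]int)}
}

// RecordEmptyUpstreamResponse increments the series for one labeled tuple.
func (c *emptyResponseCounter) RecordEmptyUpstreamResponse(namespace, scaledObject, trigger string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.counts[namespace+"/"+scaledObject+"/"+trigger]++
}

// Value reads back the current count for one labeled tuple.
func (c *emptyResponseCounter) Value(namespace, scaledObject, trigger string) int {
	c.mu.Lock()
	defer c.mu.Unlock()
	return c.counts[namespace+"/"+scaledObject+"/"+trigger]
}

func main() {
	c := newEmptyResponseCounter()
	c.RecordEmptyUpstreamResponse("default", "my-scaledobject", "prometheus")
	c.RecordEmptyUpstreamResponse("default", "my-scaledobject", "prometheus")
	fmt.Println(c.Value("default", "my-scaledobject", "prometheus"))
}
```

In the real implementation the equivalent would be a `prometheus.CounterVec` with those three label names, incremented via `WithLabelValues`.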

Author


I thought about it. The problem is that the ExecutePromQuery function doesn't have access to the trigger or ScaledObject name, afaict.

Member


The ScaledObject name and the trigger name are passed to the scaler during the build process as part of the scalersconfig.ScalerConfig struct, so you will need to take them and store them in the scaler to have them available for the metric. I think the info is quite valuable for knowing which scaler is failing.
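A rough sketch of that suggestion, with simplified, hypothetical struct and field names (the real struct is KEDA's scalersconfig.ScalerConfig; only the pattern of capturing the names at construction time is what matters here):

```go
package main

import "fmt"

// scalerConfig is a stripped-down stand-in for scalersconfig.ScalerConfig,
// which carries the ScaledObject/ScaledJob and trigger identity at build time.
type scalerConfig struct {
	ScalableObjectName string
	TriggerName        string
	Namespace          string
}

// prometheusScaler stores the identifying names so that later error paths
// (e.g. an empty query result) can label the emitted metric.
type prometheusScaler struct {
	scaledObjectName string
	triggerName      string
	namespace        string
}

func newPrometheusScaler(cfg scalerConfig) *prometheusScaler {
	return &prometheusScaler{
		scaledObjectName: cfg.ScalableObjectName,
		triggerName:      cfg.TriggerName,
		namespace:        cfg.Namespace,
	}
}

// labels returns the tuple an empty-response metric would be recorded under.
func (s *prometheusScaler) labels() (namespace, scaledObject, trigger string) {
	return s.namespace, s.scaledObjectName, s.triggerName
}

func main() {
	s := newPrometheusScaler(scalerConfig{
		ScalableObjectName: "my-scaledobject",
		TriggerName:        "trigger-0",
		Namespace:          "default",
	})
	ns, so, tr := s.labels()
	fmt.Println(ns, so, tr)
}
```

With the names stored on the scaler, the empty-response path inside the query function no longer needs access to the ScaledObject itself.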

@drolando-stripe
Author

> Could you include this new metric during prometheus/otel e2e test?

@JorTurFer I left a note about this in the PR description. The end-to-end tests don't actually run Prometheus, so the scaler query always fails with `dial tcp: lookup keda-prometheus.keda.svc.cluster.local on 10.96.0.10:53: no such host`, and since KEDA runs in a separate container from the tests, I cannot mock it to return an empty value.

@zroubalik
Member

> Could you include this new metric during prometheus/otel e2e test?
>
> @JorTurFer I left a note about this in the PR description. The end-to-end tests don't actually run Prometheus, so the scaler query always fails with `dial tcp: lookup keda-prometheus.keda.svc.cluster.local on 10.96.0.10:53: no such host`, and since KEDA runs in a separate container from the tests, I cannot mock it to return an empty value.

@drolando-stripe What do you mean by that, please? The existing prom e2e tests use this helper to set up Prometheus: https://github.com/kedacore/keda/blob/main/tests/scalers/prometheus/prometheus_helper.go

@SpiritZhou
Contributor

I see there are other upstream error responses from this scaler. Is an empty response enough for tracking purposes?


Development

Successfully merging this pull request may close these issues.

Emit a metric tracking the number of empty responses from prometheus
