chore: fix missleading errors total grafana #969

jmichalek132

Simplest, but not necessarily the best, way to fix #968.

@@ -529,7 +529,7 @@
 "uid": "${DS_PROMETHEUS}"
 },
 "editorMode": "code",
-"expr": "sum(rate(pyrra_requests_total{slo=\"$slo\"}[$__rate_interval]))",
+"expr": "sum(rate(pyrra_requests:rate5m{slo=\"$slo\"}[$__rate_interval]))",
Member

This doesn't make much sense anymore. Given that we already have the :rate5m series, we don't need anything except pyrra_requests:rate5m{slo=\"$slo\"}.
I guess we'll lose the flexibility of rate(...[$__rate_interval]), but being correct is more important.
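
To illustrate the point above, here is a minimal sketch of why wrapping an already-computed rate in another rate is misleading. It assumes a deliberately simplified `simple_rate` helper (hypothetical, first-to-last slope only) and does not reproduce Prometheus's actual extrapolation or counter-reset handling.

```python
def simple_rate(samples):
    """Per-second change between the first and last sample.

    Simplified stand-in for a rate over a window: no extrapolation,
    no counter-reset handling (both are assumptions of this sketch).
    """
    (t0, v0), (t1, v1) = samples[0], samples[-1]
    return (v1 - v0) / (t1 - t0)


# Raw counter growing at a steady 2 requests per second, sampled every 15s.
counter = [(t, 2 * t) for t in range(0, 301, 15)]

# What a rate5m-style series would hold for that counter: the rate itself.
recorded_rate = [(t, 2.0) for t in range(0, 301, 15)]

rate_of_counter = simple_rate(counter)         # 2.0, the real request rate
rate_of_recorded = simple_rate(recorded_rate)  # 0.0, the rate did not change

print(rate_of_counter, rate_of_recorded)
```

Under these assumptions, the rate of the counter is the request rate, while the "rate of the rate" only says how quickly the rate is changing, which is why the pre-computed series should be queried directly.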

Author

We could pre-compute multiple rate intervals, e.g. rate5m, rate1h, rate1d. How about that?
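
A rough sketch of how that idea could look, purely as an illustration: a hypothetical helper picks a pre-computed window from the dashboard's time range. Only `rate5m` appears in the dashboard diff; the `rate1h` and `rate1d` series, the thresholds, and the function names are all assumptions.

```python
# Hypothetical thresholds: (largest dashboard range in seconds, suffix to use).
# Neither the thresholds nor the rate1h/rate1d series are confirmed by the PR.
WINDOWS = [
    (6 * 3600, "rate5m"),    # up to 6 hours
    (7 * 86400, "rate1h"),   # up to 7 days
]
FALLBACK = "rate1d"          # anything longer


def pick_window(dashboard_range_seconds: int) -> str:
    """Return the recording-rule suffix for a given dashboard time range."""
    for upper_bound, suffix in WINDOWS:
        if dashboard_range_seconds <= upper_bound:
            return suffix
    return FALLBACK


def requests_expr(dashboard_range_seconds: int) -> str:
    """Build the panel expression that queries the pre-computed series directly."""
    suffix = pick_window(dashboard_range_seconds)
    return f'sum(pyrra_requests:{suffix}{{slo="$slo"}})'


print(requests_expr(3600))
print(requests_expr(90 * 86400))
```

The trade-off is one extra recorded series per window in exchange for panels that stay correct at every zoom level.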

@metalmatze @jmichalek132 Hi! Trying to revive this PR, which seems really important. I am aware of the importance of __rate_interval, but would it give more consistency than using only rate5m? I would personally take the rate5m road to bring those panels into a correct state, and then improve them in the future if possible :)

Member

I always thought that :rate5m was too inflexible for looking at longer time ranges in Grafana, for example when someone looks at the RED metrics for the last 90 days.
But having the usual case, like looking at RED metrics for the last 24 hours, be wrong is not worth that flexibility. Agreed.
Given that I still haven't come up with a better solution after all this time, let's go with hardcoding :rate5m.

@@ -598,7 +598,7 @@
}
]
},
"unit": "percentunit"
Member

To keep the RED (Requests, Error Rate, Duration) type of panels, how about actually using the error rate percentage?
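
As a minimal sketch of what such a panel would compute: the error rate is the errors-per-second rate divided by the requests-per-second rate, which fits the `percentunit` unit in the hunk above because Grafana expects that unit as a ratio between 0 and 1. The function name and the zero-traffic handling are assumptions for illustration (a plain PromQL division would yield NaN for 0/0 instead).

```python
def error_ratio(errors_per_second: float, requests_per_second: float) -> float:
    """Fraction of requests that failed, as a value between 0 and 1.

    Returning 0.0 when there is no traffic is an assumption of this sketch.
    """
    if requests_per_second == 0:
        return 0.0
    return errors_per_second / requests_per_second


# 25 errors/s out of 100 requests/s: Grafana would render this as 25%.
ratio = error_ratio(25.0, 100.0)
print(ratio)
```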

@jmichalek132 marked this pull request as ready for review December 10, 2023 13:30
@metalmatze
Member

I'm looking at this again.
The way it's handled today is definitely incorrect.

Given that we do similar things for the other queries, I wonder if we can construct a query (using the objectiveReplacer) that renames the metric to the generic name and either keeps all the labels as they are or somehow only keeps the distinct labels. This would allow us to be flexible with the queries later on.

I prefer to keep only distinct labels; however, keeping all labels and then later summing them away doesn't seem too bad.

@elukey

elukey commented May 19, 2025

> I'm looking at this again. The way it's handled today is definitely incorrect.
>
> Given that we do similar things for the other queries, I wonder if we can construct a query (using the objectiveReplacer) that renames the metric to the generic name and either keeps all the labels as they are or somehow only keeps the distinct labels. This would allow us to be flexible with the queries later on.
>
> I prefer to keep only distinct labels; however, keeping all labels and then later summing them away doesn't seem too bad.

My 2c: I'd personally vote for the simple solution first (rate5m, if no strong arguments against it arise) and then something more complex like the solution above.

@metalmatze
Member

I agree, @elukey.
Let's do the rate5m and hardcode it.
It's been so long, and I have been thinking about this ever since, but I have not come up with a better solution.

@metalmatze merged commit 4d97229 into pyrra-dev:main May 24, 2025
10 checks passed

Successfully merging this pull request may close these issues.

pyrra_errors_total inaccurate in Grafana dashboard