chore: fix missleading errors total grafana #969
examples/grafana/detail.json
@@ -529,7 +529,7 @@
 "uid": "${DS_PROMETHEUS}"
 },
 "editorMode": "code",
-"expr": "sum(rate(pyrra_requests_total{slo=\"$slo\"}[$__rate_interval]))",
+"expr": "sum(rate(pyrra_requests:rate5m{slo=\"$slo\"}[$__rate_interval]))",
This doesn't make much sense anymore. Given that we already have :rate5m, we don't need anything except pyrra_requests:rate5m{slo=\"$slo\"}.
I guess we'll lose the flexibility of rate over [$__rate_interval], but being correct is more important.
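To illustrate the point above, here is a minimal Python sketch of what happens when a rate is taken over a series that is already a rate. `simple_rate` is a hypothetical, simplified stand-in for PromQL's `rate()` (per-second increase with counter-reset handling, no extrapolation), and the sample values are made up for illustration.

```python
def simple_rate(samples):
    """Simplified stand-in for PromQL rate(): per-second increase over the
    window, treating any drop as a counter reset. No extrapolation."""
    increase = 0.0
    for (_, prev), (_, cur) in zip(samples, samples[1:]):
        # A decrease is interpreted as a counter reset, so the new value
        # counts as the increase since the reset.
        increase += cur - prev if cur >= prev else cur
    duration = samples[-1][0] - samples[0][0]
    return increase / duration


# Made-up data: a counter growing at a steady 10 requests/s,
# scraped every 30 s over 5 minutes.
counter = [(t, 10.0 * t) for t in range(0, 301, 30)]

# What an already-computed rate series would hold for that counter:
# a constant 10 requests/s.
recorded = [(t, 10.0) for t in range(0, 301, 30)]

rate_of_counter = simple_rate(counter)    # 10.0, the real request rate
rate_of_recorded = simple_rate(recorded)  # 0.0, the rate has not changed
```

Under these assumptions, wrapping the already-rated series in another rate reports how fast the rate changes, not the request rate itself, so the bare series is the value the panel should show.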
We could have multiple rate intervals pre-computed, e.g. rate5m, rate1h, rate1d. How about that?
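A rough sketch of how that could look, in Python for illustration only. Only the suffixes rate5m, rate1h and rate1d come from the comment above; the thresholds, the helper names, and the existence of `pyrra_requests:rate1h` / `pyrra_requests:rate1d` series are assumptions.

```python
# Hypothetical thresholds mapping the dashboard time range to a
# pre-computed window. These cut-offs are assumptions, not project values.
WINDOWS = [
    (24 * 3600, "rate5m"),       # ranges up to 1 day
    (30 * 24 * 3600, "rate1h"),  # ranges up to 30 days
]
FALLBACK = "rate1d"              # anything longer


def pick_window(range_seconds: int) -> str:
    """Return the suffix of the pre-computed series for a dashboard range."""
    for limit, suffix in WINDOWS:
        if range_seconds <= limit:
            return suffix
    return FALLBACK


def build_expr(range_seconds: int) -> str:
    """Build the panel expression; series other than :rate5m are assumed."""
    return 'sum(pyrra_requests:%s{slo="$slo"})' % pick_window(range_seconds)
```

For example, `build_expr(3600)` selects the `:rate5m` series, while a 90-day range falls through to `:rate1d`.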
@metalmatze @jmichalek132 Hi! I'm trying to revive this PR, since it seems really important. I am aware of the importance of __rate_interval, but would it give more consistency than using only rate5m? I would personally take the rate5m road to bring those panels into a correct state, and then improve them in the future if possible :)
I always thought that :rate5m is too inflexible for looking at longer time ranges in Grafana, say when someone looks at the RED metrics for the last 90 days.
However, having the usual case, like looking at RED metrics for the last 24 hours, be wrong is not worth that flexibility. Agreed.
Given that I still haven't come up with a better solution after all this time, let's go with hardcoding :rate5m.
@@ -598,7 +598,7 @@
}
]
},
"unit": "percentunit"
To keep the RED (Requests, Error Rate, Duration) type of panels, how about actually using the error rate percentage?
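As a small worked example of what such an error-rate panel would display, the sketch below divides an error rate by a request rate. The function name and the input values are hypothetical; the result is a fraction between 0 and 1, which is the form Grafana's "percentunit" unit expects (0.05 is rendered as 5%).

```python
def error_ratio(errors_per_s: float, requests_per_s: float) -> float:
    """Fraction of requests that failed, as a value in [0, 1]."""
    if requests_per_s == 0:
        # No traffic: report 0 instead of dividing by zero. This is a
        # choice made for the sketch, not something the PR prescribes.
        return 0.0
    return errors_per_s / requests_per_s


# Made-up values: 0.5 errors/s out of 10 requests/s.
ratio = error_ratio(0.5, 10.0)  # a fraction; "percentunit" shows it as 5%
```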
I'm looking at this again. Given that we do similar things for the other queries, I wonder if we can construct a query (using the
I prefer to keep only distinct labels; however, keeping all labels and then later summing them away doesn't seem too bad.
My 2c: I'd personally vote for the simple solution first (rate5m, if no strong arguments against it arise) and then something more complex like the above solution.
I agree, @elukey.
Simplest, but not necessarily the best way to fix #968