Is high error rate during rollouts of thanos receive expected? #4277
Unanswered
jmichalek132 asked this question in Questions & Answers
Replication errors during updates are expected; however, they shouldn't surface in the remote-write responses if you have enough healthy receive nodes. With your setup (6 nodes, replication factor 2), you should tolerate 2 of 6 nodes being down. Do you have Pod Disruption Budgets set? https://github.com/thanos-io/kube-thanos/blob/f53ad9856c6f765989ea76ba8eff8dd1e77186b7/jsonnet/kube-thanos/kube-thanos-receive.libsonnet#L224
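For reference, a Pod Disruption Budget for a receive statefulset could look roughly like the sketch below. This is only an illustration, not the manifest that kube-thanos generates: the object name, the label selector, and the `maxUnavailable` value are all assumptions and would need to match the labels and capacity of your own deployment.

```yaml
# Minimal sketch of a PodDisruptionBudget for Thanos receive pods.
# The name, labels, and maxUnavailable value are assumptions for illustration.
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: thanos-receive-default        # hypothetical name
spec:
  maxUnavailable: 1                   # assumed value: at most one receive pod evicted at a time
  selector:
    matchLabels:
      app.kubernetes.io/name: thanos-receive   # assumed label; must match the statefulset's pod labels
```

Keep in mind that a PDB limits voluntary evictions such as node drains. Per the Kubernetes documentation, workload controllers like StatefulSet are not limited by PDBs when they perform a rolling update, so on its own this may not change what happens during a rollout restart.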
Thanks for this! Some ideas during our Contributor Hours:
Hi, I wanted to ask whether a high error rate during a rollout of thanos receive is expected.
When triggering a rollout of thanos receive, e.g. by running
kubectl rollout restart statefulset thanos-receive-staging
we experience a high error rate on all layers (HTTP error rate, replication error rate, and forward request error rate).
Screenshot of metrics during a rollout:
Errors in log of thanos-receive-default:
Errors in log of thanos-receive-staging:
Our deployment.
Configuration of thanos receive default:
The one for thanos-receive is almost the same, with the exception of necessary modifications such as the name of the statefulset.
Hashring json config: