Description
What is the bug?
We have noticed a strange memory usage pattern in the ingesters after restarting.
After comparing profiles of one of the affected instances, we see that the memory is allocated in tsdb.NewCircularExemplarStorage.
We narrowed it down and saw that ~8GiB were allocated by the slice allocation in tsdb.NewCircularExemplarStorage. Coincidentally, this is a cluster with one huge tenant that has a global max of 160M exemplars.
How to reproduce it?
Rolling out ingesters
What did you think would happen?
Since an exemplar is ~56 bytes, and 160M * 56 = 8,960,000,000 bytes (roughly 8.3 GiB), we can assume that each instance allocated the slice for all 160M exemplars instead of dividing the global limit by the number of instances.
So it seems that the TSDB was instantiated with an empty ring somehow, although we couldn't see how that could happen.
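The arithmetic behind this suspicion can be sketched in Go. This is only an illustration of the hypothesis, not the actual Mimir code: `perIngesterLimit` and its fallback to the global limit when no ingesters are visible are hypothetical, the real conversion may also account for the replication factor, and the 56-byte exemplar size is the approximation quoted above.

```go
package main

import "fmt"

// exemplarBytes is the approximate per-exemplar size quoted in this report.
const exemplarBytes = 56

// exemplarSliceBytes estimates the size of a slice preallocated
// for maxExemplars entries.
func exemplarSliceBytes(maxExemplars int64) int64 {
	return maxExemplars * exemplarBytes
}

// perIngesterLimit is a hypothetical helper (not the Mimir API).
// It splits a global limit evenly across ingesters and, as an
// assumption matching the suspected bug, falls back to the full
// global limit when the ring reports no ingesters.
func perIngesterLimit(globalLimit, ingesters int64) int64 {
	if ingesters <= 0 {
		return globalLimit
	}
	return globalLimit / ingesters
}

func main() {
	const globalLimit = 160_000_000
	const gib = float64(1 << 30)

	// Compare an empty ring with an arbitrary example ring of 10 ingesters.
	for _, n := range []int64{0, 10} {
		local := perIngesterLimit(globalLimit, n)
		size := float64(exemplarSliceBytes(local)) / gib
		fmt.Printf("ingesters=%d local limit=%d slice=%.2f GiB\n", n, local, size)
	}
}
```

With an empty ring, the local limit equals the global limit, which would account for the ~8GiB slice seen in the profile.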
What was your environment?
Grafana Cloud staging environment, running the latest weekly release.
Any additional context to share?
No response