Ingester local tsdb storage does not stop growing #12915
Unanswered
alemsh asked this question in Help and support
We've been running into a problem with our ingesters continually filling up their local PVC storage (i.e. the WAL). Our setup uses an S3 gateway for long-term bucket storage of metrics.
I tried adjusting the `blocks-storage.tsdb.retention-period` configuration setting, but that caused a number of other headaches: changing it requires several other changes, and it is not well documented which other settings need to be updated. Apparently the defaults should be sufficient, so I am not sure why local disk is filling up.
My understanding is that metrics follow the path `Ingester memory -> local disk (WAL) -> long term s3 buckets`. My understanding is also that after `13h`, blocks/series in the WAL should be compacted and uploaded to our S3, those blocks should be marked for deletion, and the compactor should delete them. However, I can confirm that there are WAL blocks still sitting on local disk after several days. I realize that those blocks might not necessarily be deleted at exactly 13h, but having them still there several days later seems incorrect.
According to monitoring and compactor logs, things are being deleted, but very sparingly. If I restart the ingesters, all the blocks that were marked for deletion seem to actually get deleted on restart: the number of deletions increases by an order of magnitude, and disk space usage drops.
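To make "blocks still on local disk after several days" concrete, here is a minimal sketch for listing the blocks that look older than the retention period. It is not a Mimir tool. It assumes the ingester data directory is laid out as `<tsdb_root>/<tenant>/<block ULID>/meta.json` and that each `meta.json` carries a `maxTime` field in milliseconds, as in Prometheus TSDB blocks. The function name `stale_blocks` and the example path are hypothetical. Age is computed from `maxTime`, which only approximates the real retention clock if retention is counted from the time the block was uploaded.

```python
import json
import time
from pathlib import Path


def stale_blocks(tsdb_root, retention_hours=13.0, now_ms=None):
    """Return (tenant, block_dir, age_hours) for blocks older than the retention."""
    if now_ms is None:
        now_ms = int(time.time() * 1000)
    stale = []
    # Assumed layout: <tsdb_root>/<tenant>/<block ULID>/meta.json
    for meta_path in sorted(Path(tsdb_root).glob("*/*/meta.json")):
        try:
            meta = json.loads(meta_path.read_text())
            max_time_ms = meta["maxTime"]  # assumed: end of block range, in ms
        except (OSError, ValueError, KeyError, TypeError):
            continue  # skip unreadable or unexpected files
        age_hours = (now_ms - max_time_ms) / 3_600_000
        if age_hours > retention_hours:
            block_dir = meta_path.parent
            stale.append((block_dir.parent.name, block_dir.name, round(age_hours, 1)))
    return stale


if __name__ == "__main__":
    # Hypothetical path; adjust to wherever the ingester PVC is mounted.
    for tenant, block, age in stale_blocks("/data/tsdb"):
        print(f"{tenant} {block} {age}h past maxTime")
```

Running something like this inside an ingester pod before and after a restart would show whether the lingering directories are complete blocks past retention, as opposed to WAL or head chunk data.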
Some stats:
Helm chart version: `mimir-distributed 5.8.0`
Image version: `grafana/mimir:2.17.0`
Some cluster metrics:
Our helm config: