receive: add --tsdb.flush-blocks-on-shutdown flag #8753

Open
prymitive wants to merge 3 commits into thanos-io:main from prymitive:tsdb.flush-blocks-on-shutdown

@prymitive
Contributor

@prymitive prymitive commented Apr 7, 2026

By default, Receive flushes (writes to disk and/or uploads to S3) on shutdown. This takes ages and we don't need it at all because we run Thanos on bare metal. Allow this to be configurable, since this feature seems to be aimed mostly at environments like k8s, where instances are ephemeral.
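
As a rough illustration of the behaviour being made configurable, the sketch below gates a flush step behind a boolean option during shutdown. It is not the Thanos implementation: `flusher`, `shutdownOptions`, `shutdown`, and `fakeStore` are hypothetical names introduced only for this example, and the only thing taken from the PR is the idea that flushing on shutdown can be switched off.

```go
package main

import (
	"errors"
	"fmt"
)

// flusher is a hypothetical stand-in for whatever cuts in-memory data into a
// block (and possibly uploads it). It is not a Thanos interface.
type flusher interface {
	Flush() error
	Close() error
}

// shutdownOptions is a hypothetical option struct; FlushOnShutdown mirrors
// the idea behind --tsdb.flush-blocks-on-shutdown.
type shutdownOptions struct {
	FlushOnShutdown bool
}

// shutdown flushes only when asked to, then always closes the store.
// Errors from both steps are combined so neither is silently dropped.
func shutdown(f flusher, opts shutdownOptions) error {
	var flushErr error
	if opts.FlushOnShutdown {
		flushErr = f.Flush()
	}
	return errors.Join(flushErr, f.Close())
}

// fakeStore records which calls were made so the behaviour can be observed.
type fakeStore struct {
	flushed, closed bool
}

func (s *fakeStore) Flush() error { s.flushed = true; return nil }
func (s *fakeStore) Close() error { s.closed = true; return nil }

func main() {
	for _, flush := range []bool{true, false} {
		s := &fakeStore{}
		if err := shutdown(s, shutdownOptions{FlushOnShutdown: flush}); err != nil {
			fmt.Println("shutdown failed:", err)
			continue
		}
		fmt.Printf("flush-on-shutdown=%v flushed=%v closed=%v\n", flush, s.flushed, s.closed)
	}
}
```

With the option off, the store is still closed cleanly; only the potentially slow flush is skipped.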

  • I added CHANGELOG entry for this change.
  • Change is not relevant to the end user.

Changes

Verification

By default Receive will flush (write to disk and/or upload to S3) on shutdown. This takes ages and we don't need it at all because we run Thanos on bare metal.
Allow this to be configurable since this feature seems to be aimed mostly at environments like k8s, where instances are ephemeral.

Signed-off-by: Lukasz Mierzwa <lukasz@cloudflare.com>
@prymitive prymitive force-pushed the tsdb.flush-blocks-on-shutdown branch from a127638 to 3d837b3 Compare April 7, 2026 09:07
MichaHoffmann previously approved these changes Apr 7, 2026
Signed-off-by: Lukasz Mierzwa <lukasz@cloudflare.com>
@prymitive prymitive force-pushed the tsdb.flush-blocks-on-shutdown branch from 524cc51 to c3b54bd Compare April 7, 2026 09:24
MichaHoffmann previously approved these changes Apr 7, 2026
Member

@GiedriusS GiedriusS left a comment

How to do maintenance on physical servers without dumping blocks on shut down? Is there any API you use or how do you do it? Please elaborate.

@prymitive
Contributor Author

How to do maintenance on physical servers without dumping blocks on shut down? Is there any API you use or how do you do it? Please elaborate.

I don’t think I follow.
Is this question assuming that there is no replication and everything is down while you take one instance out?
Prometheus doesn’t write blocks on shutdown and yet people run it on many physical servers.

Comment thread docs/components/receive.md Outdated
histograms. This flag is a no-op now and will
be removed in the future. Native histogram
ingestion is always enabled.
--tsdb.flush-blocks-on-shutdown
Contributor Author

I tried to run make docs but that made a ton of changes to a lot of files.
Either my local clone did something wrong or all docs are very stale.

Contributor

@MichaHoffmann MichaHoffmann Apr 7, 2026

Let's ignore the docs check; I'll add a follow-up PR to sync the docs.

Contributor Author

That was my laptop: I had an older thanos binary in my PATH, and make build puts binaries in go/bin for some reason.

@GiedriusS
Member

GiedriusS commented Apr 7, 2026

How to do maintenance on physical servers without dumping blocks on shut down? Is there any API you use or how do you do it? Please elaborate.

I don’t think I follow. Is this question assuming that there is no replication and everything is down while you take one instance out? Prometheus doesn’t write blocks on shutdown and yet people run it on many physical servers.

With replication factor 1 or 2 (or any other RF really) and with this flag enabled, how to wipe/destroy a server without losing any recently written data?

@prymitive
Contributor Author

With replication factor 1 or 2 (or any other RF really) and with this flag enabled, how to wipe/destroy a server without losing any recently written data?

I imagine that you would:

  • stop writing to that server
  • let it sit there until it writes a block and uploads it to object storage
  • remove that instance
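
The steps above can be sketched as a drain check that polls an instance until nothing is left to lose. This is only an illustration under stated assumptions: `instanceState`, `safeToRemove`, and `waitUntilSafe` are hypothetical names, the fields do not map to real Thanos metrics or APIs, and the sketch assumes a setup that uploads blocks to object storage.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// instanceState is a hypothetical snapshot of one receiver; the fields do not
// correspond to real Thanos metrics or APIs.
type instanceState struct {
	ReceivingWrites bool // still part of the write path
	HeadHasData     bool // samples not yet cut into a block
	PendingUploads  int  // blocks on disk not yet in object storage
}

// safeToRemove reports whether all three steps have completed: writes have
// stopped, the in-memory data was written as a block, and uploads finished.
func safeToRemove(s instanceState) bool {
	return !s.ReceivingWrites && !s.HeadHasData && s.PendingUploads == 0
}

// waitUntilSafe polls probe until the instance is drained or ctx expires.
func waitUntilSafe(ctx context.Context, probe func() instanceState, every time.Duration) error {
	ticker := time.NewTicker(every)
	defer ticker.Stop()
	for {
		if safeToRemove(probe()) {
			return nil
		}
		select {
		case <-ctx.Done():
			return ctx.Err()
		case <-ticker.C:
		}
	}
}

func main() {
	// Simulated instance: writes have already been stopped, the block is cut
	// by the second probe, and the upload finishes by the third.
	states := []instanceState{
		{HeadHasData: true},
		{PendingUploads: 1},
		{},
	}
	i := 0
	probe := func() instanceState {
		s := states[i]
		if i < len(states)-1 {
			i++
		}
		return s
	}

	ctx, cancel := context.WithTimeout(context.Background(), time.Second)
	defer cancel()
	if err := waitUntilSafe(ctx, probe, 10*time.Millisecond); err != nil {
		fmt.Println("not safe to remove:", err)
		return
	}
	fmt.Println("safe to remove instance")
}
```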

@GiedriusS
Member

With replication factor 1 or 2 (or any other RF really) and with this flag enabled, how to wipe/destroy a server without losing any recently written data?

I imagine that you would:

* stop writing to that server

* let it sit there until it writes a block and uploads it to object storage

* remove that instance

So, at that point, you only have to do very few checks before shutting down because all uploads have already succeeded, no?

@prymitive
Contributor Author

So, at that point, you only have to do very few checks before shutting down because all uploads have already succeeded, no?

In a theoretical scenario that I haven't actually tried, yes.
But there are a lot of ifs here. For example, there's an assumption that I'm using object storage, but I don't have to: I can have a bare metal setup that's simply used for sharding and doesn't upload anything anywhere.
