
Scalable conversion RFC#62

Open
GiedriusS wants to merge 2 commits into main from scalable_conversion

@GiedriusS
Member

Jotting down some of my thoughts and sharing with others. Maybe they are stupid, maybe they are not. Let me know your thoughts.

It provides no value.
Jot down some of my thoughts and share it with others.

@xiu xiu left a comment

I think this makes sense. I understand that the queue is there to loosen the coupling between the planner and the converters, though I wonder whether we intend the planner to run over a small set of converters or rather over several block streams.
If they run colocated, say in the same Kubernetes cluster, would it be simpler to have the planner hand a short list of blocks to convert (2-3) to each converter via gRPC after discovery, committing the state to S3 so that it can recover after a restart?
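
To make the idea above concrete, here is a rough Go sketch of the "short list per converter plus checkpoint" flow. Everything in it is an assumption for illustration only: the `State` layout, the `ObjectStore` interface, the `planner/state.json` key, and the in-memory store standing in for S3 are all hypothetical, and the gRPC call that would actually deliver each list is left out.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// State is a hypothetical planner checkpoint: which converter holds which
// blocks, and which blocks have not been handed out yet.
type State struct {
	Assignments map[string][]string `json:"assignments"` // converter address -> block IDs
	Pending     []string            `json:"pending"`
}

// ObjectStore is a minimal stand-in for an S3-like bucket (assumption).
type ObjectStore interface {
	Put(key string, data []byte) error
	Get(key string) ([]byte, error)
}

// memStore is an in-memory ObjectStore used only to keep the sketch runnable.
type memStore map[string][]byte

func (m memStore) Put(key string, data []byte) error {
	m[key] = append([]byte(nil), data...)
	return nil
}

func (m memStore) Get(key string) ([]byte, error) {
	d, ok := m[key]
	if !ok {
		return nil, fmt.Errorf("object %q not found", key)
	}
	return d, nil
}

// Plan hands each discovered converter at most perConverter blocks, in order.
// Blocks that do not fit stay pending for a later round.
func Plan(blocks, converters []string, perConverter int) State {
	s := State{Assignments: map[string][]string{}}
	next := 0
	for _, c := range converters {
		end := next + perConverter
		if end > len(blocks) {
			end = len(blocks)
		}
		if next < end {
			s.Assignments[c] = append([]string(nil), blocks[next:end]...)
		}
		next = end
	}
	s.Pending = append([]string(nil), blocks[next:]...)
	return s
}

const stateKey = "planner/state.json" // hypothetical object key

// Save commits the planner state so that it survives a restart.
func Save(store ObjectStore, s State) error {
	data, err := json.Marshal(s)
	if err != nil {
		return err
	}
	return store.Put(stateKey, data)
}

// Load restores the last committed state after a restart.
func Load(store ObjectStore) (State, error) {
	var s State
	data, err := store.Get(stateKey)
	if err != nil {
		return s, err
	}
	err = json.Unmarshal(data, &s)
	return s, err
}

func main() {
	store := memStore{}
	s := Plan([]string{"b1", "b2", "b3", "b4", "b5", "b6", "b7"}, []string{"conv-0", "conv-1"}, 3)
	if err := Save(store, s); err != nil {
		panic(err)
	}
	restored, err := Load(store)
	if err != nil {
		panic(err)
	}
	fmt.Println(restored.Assignments, restored.Pending)
}
```

The point of the sketch is only that the planner would write the checkpoint before pushing work, so a restarted planner can tell which lists were already handed out.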

@adezxc
Contributor

adezxc commented Feb 23, 2026

The whole concept makes sense.
One thing that came to mind: could the planner be responsible only for holding a queue of work that has to be done, with all the workers requesting it, e.g. via /api/get_work?size=3, along with their metadata (IP address, hostname, etc.), and the planner sending heartbeats to the workers that it knows have work assigned?

If a worker is unavailable for X minutes, the planner marks its work items as failed and makes them available again. If a worker finishes a conversion, it reports success to the planner, and the planner removes the item from the queue.

Another consideration: would a worker only work with one objstore directory? This also means we'd have to validate whether a worker can do the work handed out by the planner, right? I was considering whether workers could be universal, requiring only multiple objstore configs to accept work from multiple planners, but IMO that brings too much complexity(?)

@GiedriusS
Member Author

Michael shared this with me: https://turbopuffer.com/blog/object-storage-queue. We can use it for inspiration.


3 participants