Conversation
xiu
left a comment
I think this makes sense. I understand using the queue to lessen the coupling between the planner and the converters, though I wonder whether we intend the planner to run over a small set of converters or rather over several block streams.
If they run colocated in the same Kubernetes cluster, say, would it be simpler to have the planner hand a short list of blocks to convert (2-3) to each converter via gRPC after discovery, committing its state to S3 so it can recover after a restart?
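A rough sketch of what I mean (all names, the batch size, and the JSON state layout are made up; the gRPC call and the S3 write are stubbed out as plain functions):

```python
import json

BATCH_SIZE = 3  # planner hands each converter 2-3 blocks at a time


def plan(blocks, converters):
    """Split discovered blocks into small batches, assigned round-robin."""
    assignments = {c: [] for c in converters}
    for i in range(0, len(blocks), BATCH_SIZE):
        target = converters[(i // BATCH_SIZE) % len(converters)]
        assignments[target].extend(blocks[i:i + BATCH_SIZE])
    return assignments


def commit_state(assignments):
    """Stand-in for writing planner state to S3 so a restart can recover it."""
    return json.dumps(assignments, sort_keys=True)


assignments = plan(["b1", "b2", "b3", "b4", "b5"], ["conv-0", "conv-1"])
state = commit_state(assignments)
# After a restart, the planner re-reads this object instead of replanning.
recovered = json.loads(state)
```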
The whole concept makes sense. If a worker is unavailable for X minutes, the planner marks its work items as failed and reports them as available again. If a worker finishes a conversion, it reports success to the planner and the planner removes the item from the queue. Also as a consideration: would a worker only work with one objstore directory? That would mean we'd have to validate whether a worker can do the work the planner hands it, right? I was considering whether workers could be universal, only requiring multiple objstore configs to accept work from multiple planners, but IMO that brings too much complexity(?)
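The lease semantics in the first two sentences could look roughly like this (an in-memory sketch with hypothetical names; a real planner would persist this state):

```python
import time

LEASE_SECONDS = 600  # the "X minutes" of worker silence before the planner gives up


class PlannerQueue:
    """In-memory sketch of the claim / report / expire cycle described above."""

    def __init__(self):
        self.available = []    # items nobody is working on
        self.in_progress = {}  # item -> (worker, lease start timestamp)

    def claim(self, item, worker, now=None):
        self.available.remove(item)
        self.in_progress[item] = (worker, now if now is not None else time.monotonic())

    def report_success(self, item):
        # Worker finished: the item leaves the queue entirely.
        del self.in_progress[item]

    def expire_leases(self, now=None):
        # Planner side: items whose worker went silent become available again.
        now = now if now is not None else time.monotonic()
        for item, (worker, started) in list(self.in_progress.items()):
            if now - started > LEASE_SECONDS:
                del self.in_progress[item]
                self.available.append(item)


q = PlannerQueue()
q.available = ["block-1", "block-2"]
q.claim("block-1", "worker-a", now=0.0)
q.claim("block-2", "worker-b", now=0.0)
q.report_success("block-2")              # worker-b finished normally
q.expire_leases(now=LEASE_SECONDS + 1)   # worker-a went silent -> item requeued
```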
Michael shared this with me: https://turbopuffer.com/blog/object-storage-queue. We can use it for inspiration.
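As I read it, one core trick for a queue on object storage is letting workers claim items with a conditional (put-if-absent) write, so the store itself arbitrates ownership. A toy sketch with a dict standing in for the bucket (all names hypothetical):

```python
class Bucket:
    """Dict standing in for an object store that supports put-if-absent."""

    def __init__(self):
        self.objects = {}

    def put_if_absent(self, key, value):
        # Models a conditional write (e.g. S3 If-None-Match): only one caller wins.
        if key in self.objects:
            return False
        self.objects[key] = value
        return True


def try_claim(bucket, item, worker):
    """A worker claims a queue item by creating queue/<item>.claim atomically."""
    return bucket.put_if_absent(f"queue/{item}.claim", worker)


bucket = Bucket()
first = try_claim(bucket, "block-7", "worker-a")   # wins the claim
second = try_claim(bucket, "block-7", "worker-b")  # loses: marker already exists
```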
Jotting down some of my thoughts and sharing them with others. Maybe they are stupid, maybe they are not. Let me know your thoughts.