refactor: improve MPUChunk flush logic and Dask finalization #223
Conversation
@@ -296,7 +299,7 @@ def gen_bunch(
            partId + idx * writes_per_chunk,
            writes_per_chunk,
            is_final=is_final,
-           lhs_keep=lhs_keep,
+           lhs_keep=lhs_keep if idx == 0 else 0,
No, every chunk must keep the same amount regardless of its position relative to other chunks. A good way to think about it is "extreme scenarios": what happens when only one chunk was written to and it did not keep any left-side bytes? Now you need to write a header with too few bytes for a chunk write, and since you didn't keep any lhs bytes you end up in a scenario like this:

HDR{10KiB}, P{id=2}, P{id=7}, ...

You can't write out HDR{10KiB}, even though you reserved PartId=1 for it, because it's too small. That's why we need to keep at least 5MiB (S3 case) of bytes on the left, so we always end up with:

HDR{10KiB}, L{5MiB}, P{id=2}, P{id=7}, ...

and we can then always write out HDR{10KiB}+L{5MiB} => P{id=1}. Alternatively we end up with:

HDR{10KiB}, D{100KiB}

Here no parts were written out yet, since we only had 100KiB worth of non-header data D{100KiB} and it was never large enough to write, so we can just perform HDR{10KiB} + D{100KiB} => P{id=1}, or even cancel the multi-part upload and write a single object instead; since this time P{id=1} is the last part of the upload, it is allowed to be any size.
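To make the size arithmetic concrete, here is a minimal sketch of the invariant described above; the constants and helper name are illustrative, not the library's actual API:

```python
# Minimal sketch (illustrative names/sizes): S3 multi-part uploads require every
# part except the last to be at least 5 MiB, so the header can only be written
# out as PartId=1 if it is paired with enough kept left-hand bytes.
MIN_WRITE_SZ = 5 * 2**20  # assumed 5 MiB S3 minimum part size
HDR_SZ = 10 * 2**10       # assumed 10 KiB header, reserved as PartId=1

def can_write_header_part(lhs_keep_bytes: int) -> bool:
    # HDR alone is too small; HDR plus the kept left-hand bytes must reach the minimum
    return HDR_SZ + lhs_keep_bytes >= MIN_WRITE_SZ

print(can_write_header_part(0))             # False: HDR{10KiB}, P{id=2}, ... is stuck
print(can_write_header_part(MIN_WRITE_SZ))  # True: HDR{10KiB} + L{5MiB} => P{id=1}
```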
@@ -65,6 +66,7 @@ class MPUChunk:
        "observed",
        "is_final",
        "lhs_keep",
+       "__dict__",
    )
why?
@@ -390,33 +389,29 @@ def mpu_write(
            partId,
            ch,
            writes_per_chunk=writes_per_chunk,
-           lhs_keep=lhs_keep,
+           lhs_keep=lhs_keep if idx == 0 else 0,
no, see other comment about lhs_keep
What we need are more tests for …
I'm pretty sure that …
We really must change …
As I mentioned before, I really don't think this whole … In the case of COG generation we know up-front how many logical parts there are, even if some of the parts will end up being empty, so there is no need to use Bag, which is for "don't yet know how many elements are in it".
Also, you still have not explained to me exactly what problems YOU are having with this code, and in what scenarios.
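For the Bag point above, a hedged sketch (hypothetical helper names) of the alternative being suggested: when the number of logical parts is known up-front, a fixed-length list of dask.delayed tasks is sufficient, since dask.bag is intended for collections whose element count is not known in advance.

```python
import dask

@dask.delayed
def write_part(part_id: int, data: bytes) -> dict:
    # Placeholder for uploading one multi-part-upload part; some parts may be empty.
    return {"PartNumber": part_id, "size": len(data)}

# COG generation knows the number of logical parts ahead of time (8 here, made up),
# so the task graph can be a fixed-length list instead of a Bag.
parts = [write_part(i, b"") for i in range(1, 9)]
results = dask.compute(*parts)
```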
Thanks for the rigorous feedback; that's very helpful for refactoring the implementation, I'll follow up. The problem is with writing large COGs (50k*50k-ish) with multiple overview levels. This triggers an AssertionError due to small left_data during finalization: `assert len(self.left_data) >= write.min_write_sz` in MPUChunk.flush when finalise=True.
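A rough numeric illustration of that failure mode (the sizes are made up; only the quoted assertion comes from the code): with many overview levels, a chunk can reach finalisation holding far less than the minimum write size in left_data, so the assertion fires.

```python
min_write_sz = 5 * 2**20        # assumed 5 MiB minimum part size (S3 case)
left_data = bytes(100 * 2**10)  # assumed ~100 KiB accumulated for this chunk

# Mirrors `assert len(self.left_data) >= write.min_write_sz` when finalise=True;
# with the sizes above this raises AssertionError, as reported.
assert len(left_data) >= min_write_sz, (
    f"left_data={len(left_data)} < min_write_sz={min_write_sz}"
)
```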
Other notes: please run the tests on your dev machine; tests are failing on this PR, for code that was changed by this PR. Linting is failing as well.
This change enables writing large COGs more flexibly and fixes failures seen when writing large COGs with many overview levels to Azure.
The multi-part upload process has been improved with refined data-flushing logic: instead of a single flush, data is collected in a local buffer and written out in smaller parts, according to the write size and the available write credits.
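A rough sketch of what that buffering could look like (the names and the upload_part callback are hypothetical, not the PR's actual code): data accumulates in a local buffer and full-sized parts are emitted only while write credits remain.

```python
def flush_buffered(buffer: bytearray, write_sz: int, credits: int, upload_part) -> list:
    """Emit write_sz-sized parts from `buffer`; return whatever upload_part yields."""
    written = []
    while len(buffer) >= write_sz and credits > 0:
        part = bytes(buffer[:write_sz])    # take one write-sized slice
        del buffer[:write_sz]              # drop it from the local buffer
        written.append(upload_part(part))  # hypothetical callback doing the actual write
        credits -= 1                       # each part write consumes one credit
    return written                         # any remainder stays buffered for later
```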
Error handling has also been enhanced, with clearer messages for problems such as insufficient write credits or invalid part IDs, so write failures are reported immediately.
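A small hedged sketch of that kind of validation (hypothetical function name and limits), replacing silent assertions with explicit, descriptive errors:

```python
def check_write(credits: int, part_id: int, max_part_id: int = 10_000) -> None:
    # Fail fast with a clear message instead of a bare assert.
    if credits <= 0:
        raise RuntimeError(f"Insufficient write credits ({credits}); cannot write another part")
    if not 1 <= part_id <= max_part_id:
        raise ValueError(f"Invalid part id {part_id}; expected 1..{max_part_id}")
```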
The finalization process can now be executed in a distributed manner using Dask. It retrieves a Dask client, submits the flush as a remote task with a timeout, and can manage cancellations, resulting in a more reliable final output assembly.
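A hedged sketch of that flow (function and parameter names are illustrative, not the PR's actual API): obtain the active Dask client, run the final flush as a remote task, and cancel it if it does not finish within the timeout.

```python
import asyncio
from dask.distributed import get_client

def finalise_remotely(final_flush, timeout: float = 600.0):
    """Run `final_flush` (a zero-argument callable) on the cluster and wait for it."""
    client = get_client()                   # client attached to the current scheduler/worker
    fut = client.submit(final_flush)        # submit the flush as a remote task
    try:
        return fut.result(timeout=timeout)  # block for the assembled result
    except asyncio.TimeoutError:
        fut.cancel()                        # give up on the remote flush
        raise
```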