-
Notifications
You must be signed in to change notification settings - Fork 25.2k
Support reading multiple regions with single fetch #118496
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Support reading multiple regions with single fetch #118496
Conversation
Pinging @elastic/es-distributed-indexing (Team:Distributed Indexing) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall looks good, I wonder about the case where we have regions that are already filled?
regionRange, | ||
regionsListener.acquire() | ||
); | ||
if (regionGaps.isEmpty() == false) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if we should break it up if we hit a region that needs no data read? We could in principle risk having everything but the first and last region of a file and thus we could download an excess amount of extra data.
We can also build this into the shared stream handler.
Or forego it for now. I'd be ok with that too, but wanted to get views on it.
@@ -1082,7 +1205,29 @@ void populateAndRead( | |||
} | |||
} | |||
|
|||
List<Runnable> gapFillingTasks( | |||
int gapOffset, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this not the region offset more than the gap offset?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One more comment.
) { | ||
ArrayList<Runnable> gapFillingTasks = new ArrayList<>(); | ||
for (Tuple<CacheFileRegion<KeyType>, RegionGaps> gapsToFetch : gaps) { | ||
int offset = Math.toIntExact(gapsToFetch.v2().regionOffset()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems unsafe, I think we need this offset to be a long and this has to pass through all the way to relativePos
in SourceInputStreamFactory.create(relativePos, ...)
?
This commit adds support for filling multiple regions in blob cache with
a single fetch from the object store.