Skip to content

Improve PageProcessor retained bytes calculations #25602

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

pettyjamesm
Copy link
Member

@pettyjamesm pettyjamesm commented Apr 16, 2025

Description

Follows up from #25600 by account for all necessary components of SourcePage implementations in SourcePage#getRetainedBytes() and SourcePage#retainedBytesForEachPart methods. Also includes SelectedPositions and Block[] previouslyComputedResults in the PageProcessor.ProjectSelectedPositions retained size calculation.

Release notes

(x) This is not user-visible or is docs only, and no release notes are required.
( ) Release notes are required. Please propose a release note for me.
( ) Release notes are required, with the following suggested text:

@cla-bot cla-bot bot added the cla-signed label Apr 16, 2025
@github-actions github-actions bot added iceberg Iceberg connector hive Hive connector labels Apr 16, 2025
@pettyjamesm pettyjamesm force-pushed the add-shallow-size-to-sourcepage branch from 42a58be to 9a04414 Compare April 16, 2025 18:04
@pettyjamesm pettyjamesm marked this pull request as ready for review April 16, 2025 18:22
@wendigo
Copy link
Contributor

wendigo commented Apr 16, 2025

@martint I think we should also land this before the release

@wendigo wendigo requested a review from martint April 16, 2025 19:18
@raunaqmorarka
Copy link
Member

@martint I think we should also land this before the release

I don't think this is problematic enough to block the release

@wendigo
Copy link
Contributor

wendigo commented Apr 16, 2025

@raunaqmorarka we've been blocked for 2 weeks on that, 1 or 2 more days doesn't make a difference

@pettyjamesm pettyjamesm force-pushed the add-shallow-size-to-sourcepage branch 2 times, most recently from 976ce28 to 92d8e30 Compare April 16, 2025 20:03
@pettyjamesm pettyjamesm requested a review from wendigo April 17, 2025 14:29
Copy link
Member

@dain dain left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems fine to me. My main question is can we avoid adding the new method if we update the retainedBytesForEachPart methods to report "this" size and sizes for internal block arrays and such. This is how RLE block works for example.

@pettyjamesm pettyjamesm force-pushed the add-shallow-size-to-sourcepage branch from 92d8e30 to 94616f2 Compare April 18, 2025 19:20
@pettyjamesm pettyjamesm changed the title Add shallow size to SourcePage Fix PageProcessor retained bytes calculations Apr 18, 2025
@pettyjamesm pettyjamesm force-pushed the add-shallow-size-to-sourcepage branch from 94616f2 to c0e0648 Compare April 18, 2025 19:30
@pettyjamesm
Copy link
Member Author

My main question is can we avoid adding the new method if we update the retainedBytesForEachPart methods to report "this" size and sizes for internal block arrays and such. This is how RLE block works for example.

Yeah, that approach reduces the surface area of the SourcePage interface so I've updated the PR to do that instead. I've also added logic to include some other PageProcessor fields that previously weren't being considered in the retained size calculation.

@pettyjamesm pettyjamesm changed the title Fix PageProcessor retained bytes calculations Improve PageProcessor retained bytes calculations Apr 18, 2025
@pettyjamesm pettyjamesm force-pushed the add-shallow-size-to-sourcepage branch 3 times, most recently from c200d06 to 094fe0b Compare April 21, 2025 19:17
Ensures that all components of SourcePage implementations are accounted
for in their retained bytes and retained bytes for each parts
calculations to avoid under-counting the amount of memory being
retained.

Also ensures that PageProcessor selected positions and
previouslyComputedResults array is included in the retained size.
@pettyjamesm pettyjamesm force-pushed the add-shallow-size-to-sourcepage branch from 094fe0b to afabcad Compare April 21, 2025 21:17
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cla-signed hive Hive connector iceberg Iceberg connector
Development

Successfully merging this pull request may close these issues.

4 participants