Skip to content

Conversation

@thuongle2210
Copy link
Contributor

@thuongle2210 thuongle2210 commented Dec 25, 2025

Description

Resolves issues-546 .

This PR introduces a pluggable mem-buffer mechanism, and based on this, this also introduces a optimized staging lookup implementaion of mem-buffer.

Changelogs of optimized staging lookup impl

  1. Converts staging buffer from previous structure to Vec<Block> for contiguous memory access
  2. Adds batch_boundaries vector to track batch start positions
  3. Introduces block_position_index for O(1) block ID to vector index mapping

Performance Impact

  • Improves search performance in staging buffers from O(number of staging blocks) to O(1 + number of selected staging blocks) by eliminating linear scans during get operations
  • Easily extendable with new memory buffer implementations for benchmarking, testing, etc., and switch to the most efficient approach.

@thuongle2210 thuongle2210 marked this pull request as draft December 25, 2025 01:08
@thuongle2210
Copy link
Contributor Author

hi @zuston , could you pls help me review this PR?

@thuongle2210 thuongle2210 marked this pull request as ready for review December 25, 2025 01:40
Copy link
Owner

@zuston zuston left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your great work, @thuongle2210 — this is definitely moving in the right direction 👍

Before merging this, I’d like to make the buffer implementation more pluggable.
Would you mind abstracting the buffer into a trait, and then introducing different implementations on top of it?

I’m happy to bring in the optimization in this part of the codebase, but this path is critical for our internal shuffle service. Before rolling it out, I plan to run long-term performance tests as well as partial online bandwidth benchmarks. Based on the results, having a pluggable design will give us much more flexibility to iterate and tune.

Thanks again for the solid progress — looking forward to your thoughts on this.

@thuongle2210
Copy link
Contributor Author

thuongle2210 commented Dec 25, 2025

Hi @zuston, do you mean I should define a trait—for example, MemoryBufferTrait (consisting of functions like total_size, flight_size, spill, append, get, etc.)—and implement it for MemoryBuffer, plus create a new OptStagingMemoryBuffer struct?

@zuston
Copy link
Owner

zuston commented Dec 25, 2025

Hi @zuston, do you mean I should define a trait—for example, MemoryBufferTrait (consisting of functions like total_size, flight_size, spill, append, get, etc.)—and implement it for MemoryBuffer, plus create a new OptStagingMemoryBuffer struct?

Yes. Please go ahead 🛫

@thuongle2210
Copy link
Contributor Author

Hi @zuston, I made the buffer implementation more pluggable. Could you please review it?

@thuongle2210 thuongle2210 changed the title feat: optimize buffer get action feat: make buffer implementation pluggable and optimize get action Dec 26, 2025
@zuston
Copy link
Owner

zuston commented Dec 26, 2025

Hi @zuston, I made the buffer implementation more pluggable. Could you please review it?

I have checked this PR, thanks for your great work! Overall lgtm. But for the better pluggable mechanism, I have refactored something. Could you help take a look? @thuongle2210

@zuston zuston changed the title feat: make buffer implementation pluggable and optimize get action feat: Introduce a pluggable mem-buffer mechanism and an optimized staging buffer impl Dec 26, 2025
@zuston zuston changed the title feat: Introduce a pluggable mem-buffer mechanism and an optimized staging buffer impl feat(memory): Introduce a pluggable mem-buffer mechanism and an optimized staging buffer impl Dec 26, 2025
@thuongle2210
Copy link
Contributor Author

LGTM! @zuston

@zuston
Copy link
Owner

zuston commented Dec 26, 2025

thanks for your contribution! @thuongle2210 Let's merge this. 🎉

@zuston zuston merged commit 27576c9 into zuston:master Dec 26, 2025
7 of 8 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Use the linkedhashmap to speed up the memory reading

2 participants