feat(storage): hilbert recluster support stream block writer #17904

Draft: zhyass wants to merge 16 commits into main

zhyass (Member) commented May 8, 2025

I hereby agree to the terms of the CLA available at: https://docs.databend.com/dev/policies/cla/

Summary

This PR enhances Hilbert clustering in Databend by introducing block streaming write support and a new data exchange method: modulo-based Flight Scatter. Hilbert clustering now uses the expression

(range_id * node_num) / partition_num

to compute the scatter key, ensuring that adjacent data ranges are co-located on the same node and thus preserving their continuity.
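To make the effect of this expression concrete, here is a minimal sketch in Python (not Databend's Rust implementation). It assumes that the division is integer division, that `range_id` runs over `0..partition_num`, and that the receiving node is the scatter key modulo `node_num`. The function names are hypothetical and used only for illustration.

```python
from typing import List


def scatter_key(range_id: int, node_num: int, partition_num: int) -> int:
    """Scatter key from the PR summary, assuming integer division."""
    return (range_id * node_num) // partition_num


def assign_nodes(partition_num: int, node_num: int) -> List[int]:
    """Map every range id to a node index via key % node_num (assumed)."""
    return [
        scatter_key(range_id, node_num, partition_num) % node_num
        for range_id in range(partition_num)
    ]


# 8 ranges spread over 3 nodes: neighbouring ranges land on the same node.
assignment = assign_nodes(partition_num=8, node_num=3)
print(assignment)  # [0, 0, 0, 1, 1, 1, 2, 2]
```

Under these assumptions the node index never decreases as `range_id` grows, so each node receives one contiguous run of ranges. A plain `range_id % node_num` would instead interleave neighbouring ranges across nodes.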

  • Block Streaming Write Support: Enables writing data blocks in a streaming fashion during reclustering, improving efficiency and reducing memory usage.
  • Modulo-Based Data Exchange: Introduces a data exchange strategy that uses a modulo operation on a specified expression to distribute data across nodes.
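The two ideas in the list above can be sketched roughly as follows. This is an illustrative Python model only, not the Databend code: `modulo_scatter`, `stream_blocks`, and the `max_rows_per_block` threshold are hypothetical names, and rows are modelled as plain integers.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, Iterator, List


def modulo_scatter(
    rows: Iterable[int], key_fn: Callable[[int], int], node_num: int
) -> Dict[int, List[int]]:
    """Route each row to node `key_fn(row) % node_num`."""
    buckets: Dict[int, List[int]] = defaultdict(list)
    for row in rows:
        buckets[key_fn(row) % node_num].append(row)
    return dict(buckets)


def stream_blocks(
    rows: Iterable[int], max_rows_per_block: int
) -> Iterator[List[int]]:
    """Yield a block as soon as it is full instead of buffering all rows."""
    block: List[int] = []
    for row in rows:
        block.append(row)
        if len(block) >= max_rows_per_block:
            yield block
            block = []
    if block:  # flush the final, possibly partial block
        yield block


# Scatter six rows over three nodes, then write node 0's rows block by block.
per_node = modulo_scatter(range(6), key_fn=lambda row: row, node_num=3)
for block in stream_blocks(per_node[0], max_rows_per_block=1):
    print(block)
```

Because `stream_blocks` is a generator, at most one block is held in memory at a time, which is the kind of memory saving the streaming writer is described as providing.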

Tests

  • Unit Test
  • Logic Test
  • Benchmark Test
  • No Test - Explain why

Type of change

  • Bug Fix (non-breaking change which fixes an issue)
  • New Feature (non-breaking change which adds functionality)
  • Breaking Change (fix or feature that could cause existing functionality not to work as expected)
  • Documentation Update
  • Refactoring
  • Performance Improvement
  • Other (please describe):

@zhyass marked this pull request as draft May 8, 2025 18:53
@github-actions bot added the pr-feature label (this PR introduces a new feature to the codebase) May 8, 2025
@zhyass added the ci-cloud label (Build docker image for cloud test) May 8, 2025

github-actions bot (Contributor) commented May 8, 2025

Docker Image for PR

  • tag: pr-17904-7cdb907-1746732932

note: this image tag is only available for internal use.

@zhyass added and removed the ci-cloud label (Build docker image for cloud test) May 9, 2025

github-actions bot (Contributor) commented May 9, 2025

Docker Image for PR

  • tag: pr-17904-9735f52-1746791529

note: this image tag is only available for internal use.

@zhyass added and removed the ci-cloud label (Build docker image for cloud test) May 10, 2025

Docker Image for PR

  • tag: pr-17904-946f096-1746850349

note: this image tag is only available for internal use.

@zhyass added and removed the ci-cloud label (Build docker image for cloud test) May 13, 2025

Docker Image for PR

  • tag: pr-17904-6e3d5bf-1747248930

note: this image tag is only available for internal use.

@zhyass added and removed the ci-cloud label (Build docker image for cloud test) May 15, 2025

Docker Image for PR

  • tag: pr-17904-608ae0d-1747330998

note: this image tag is only available for internal use.

@zhyass added and removed the ci-cloud label (Build docker image for cloud test) May 19, 2025

Docker Image for PR

  • tag: pr-17904-3b6dbbd-1747632478

note: this image tag is only available for internal use.

@zhyass added and removed the ci-cloud label (Build docker image for cloud test) May 20, 2025

Docker Image for PR

  • tag: pr-17904-d70ebec-1747767991

note: this image tag is only available for internal use.

Labels: ci-cloud (Build docker image for cloud test), pr-feature (this PR introduces a new feature to the codebase)