
feat: Enable the hash join to accept a pre-built hash table for joining #1661

Open
JkSelf wants to merge 1 commit into oss-main from reuse-hashtable


@JkSelf (Collaborator) commented Jan 29, 2026

In Spark, the hash table for a broadcast hash join is built once on the driver and then broadcast to each executor. In Gluten, we replace the broadcast hash join with a regular hash join, so each task must build its own hash table. This causes performance problems as the broadcast threshold increases, because the repeated build work grows with it.

This PR enables HashBuild to accept a pre-built HashTable, bypassing hash table construction. It also modifies HashProbe so that it does not clear the shared hash table after use.
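
To illustrate the idea, here is a minimal, self-contained sketch. It is not the code in this PR and does not use Velox's real classes: `ToyHashTable`, `ToyHashBuild`, `ToyHashProbe`, `buildTable`, and `gBuildCount` are hypothetical stand-ins. It assumes the pre-built table can be shared read-only through a `std::shared_ptr`, so a build operator can skip construction and a probe can release its reference without clearing the table.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical stand-ins for illustration only; NOT Velox's HashTable API.
using Row = std::pair<int64_t, int64_t>;  // (join key, payload)
using ToyHashTable = std::unordered_multimap<int64_t, int64_t>;

// Counts table constructions so the saved work is observable.
int gBuildCount = 0;

// Builds a read-only table from build-side rows.
std::shared_ptr<const ToyHashTable> buildTable(const std::vector<Row>& rows) {
  auto table = std::make_shared<ToyHashTable>();
  table->reserve(rows.size());
  for (const auto& [key, payload] : rows) {
    table->emplace(key, payload);
  }
  ++gBuildCount;
  return table;
}

class ToyHashBuild {
 public:
  // Regular path: every task builds its own table from its input.
  explicit ToyHashBuild(const std::vector<Row>& buildRows)
      : table_(buildTable(buildRows)) {}

  // Reuse path: accept a pre-built table and skip construction entirely.
  explicit ToyHashBuild(std::shared_ptr<const ToyHashTable> prebuilt)
      : table_(std::move(prebuilt)) {
    assert(table_ != nullptr);
  }

  std::shared_ptr<const ToyHashTable> table() const {
    return table_;
  }

 private:
  std::shared_ptr<const ToyHashTable> table_;
};

class ToyHashProbe {
 public:
  explicit ToyHashProbe(std::shared_ptr<const ToyHashTable> table)
      : table_(std::move(table)) {}

  // Inner-join probe: emits (key, payload) for every matching build row.
  std::vector<Row> probe(const std::vector<int64_t>& keys) const {
    std::vector<Row> out;
    for (int64_t key : keys) {
      auto [first, last] = table_->equal_range(key);
      for (auto it = first; it != last; ++it) {
        out.emplace_back(key, it->second);
      }
    }
    return out;
  }

  // Drops only this probe's reference. The table contents are never
  // cleared, so other tasks holding the same table are unaffected.
  void finish() {
    table_.reset();
  }

 private:
  std::shared_ptr<const ToyHashTable> table_;
};
```

In this sketch, calling `buildTable` once and passing the result to several `ToyHashBuild` instances leaves `gBuildCount` at 1, and one probe calling `finish()` leaves the table intact for the others. How the real operators manage ownership and lifetime of the shared table is defined by the PR's diff, not by this example.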

@zhouyuan (Member)

alchemy merge

@prestodb-ci (Collaborator)

alchemy link 8ca7ac1

@prestodb-ci (Collaborator)

Added new rebase item:

This was referenced Feb 25, 2026
This was referenced Mar 7, 2026