feat(iceberg): Support iceberg Hash function #1020

jinchengchenghh · 2025-07-07T02:43:44Z

The Iceberg hash uses Murmur3, matching the reference implementation at https://github.com/aappleby/smhasher/blob/master/src/MurmurHash3.cpp: it first processes the input in 4-byte chunks, then combines the remaining bytes by XOR. Spark SQL also uses this hash algorithm but differs in how it processes the remaining bytes. This change extracts the common function hashInt64.
The Iceberg Murmur3 hash must match the Java implementation exactly, so that data written by the Iceberg writer can be read by Iceberg Java and direct calls to the function return the same result.
The Iceberg utility library velox_functions_iceberg_util will be linked by the Iceberg connector writer to perform partition transforms. facebookincubator#13874
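
For reviewers unfamiliar with the algorithm, the chunk-then-tail scheme described above can be sketched as follows. This is a minimal standalone illustration of 32-bit Murmur3 in the smhasher style, not the code in this PR. `murmur3Bytes` and the helper names are hypothetical, and the `hashInt64` signature shown here is an assumption; only the name comes from the description.

```cpp
#include <cstddef>
#include <cstdint>

namespace sketch {

inline uint32_t rotl32(uint32_t x, int r) {
  return (x << r) | (x >> (32 - r));
}

// Scramble one 32-bit block before it is mixed into the state.
inline uint32_t mixK1(uint32_t k1) {
  k1 *= 0xcc9e2d51u;
  k1 = rotl32(k1, 15);
  k1 *= 0x1b873593u;
  return k1;
}

// Mix a scrambled block into the running hash state.
inline uint32_t mixH1(uint32_t h1, uint32_t k1) {
  h1 ^= k1;
  h1 = rotl32(h1, 13);
  return h1 * 5 + 0xe6546b64u;
}

// Final avalanche; the input length in bytes is folded in first.
inline uint32_t fmix32(uint32_t h1, uint32_t length) {
  h1 ^= length;
  h1 ^= h1 >> 16;
  h1 *= 0x85ebca6bu;
  h1 ^= h1 >> 13;
  h1 *= 0xc2b2ae35u;
  h1 ^= h1 >> 16;
  return h1;
}

// Hypothetical helper: hash a byte buffer in the smhasher style.
inline uint32_t murmur3Bytes(const uint8_t* data, size_t len, uint32_t seed) {
  uint32_t h1 = seed;
  const size_t numBlocks = len / 4;
  // Body: every 4 bytes form one little-endian block.
  for (size_t i = 0; i < numBlocks; ++i) {
    const uint8_t* p = data + i * 4;
    const uint32_t k1 = static_cast<uint32_t>(p[0]) |
        (static_cast<uint32_t>(p[1]) << 8) |
        (static_cast<uint32_t>(p[2]) << 16) |
        (static_cast<uint32_t>(p[3]) << 24);
    h1 = mixH1(h1, mixK1(k1));
  }
  // Tail: the 1 to 3 remaining bytes are XORed into a single partial
  // block, which is scrambled and XORed into the state (no mixH1 step).
  const uint8_t* tail = data + numBlocks * 4;
  uint32_t k1 = 0;
  switch (len & 3) {
    case 3:
      k1 ^= static_cast<uint32_t>(tail[2]) << 16;
      [[fallthrough]];
    case 2:
      k1 ^= static_cast<uint32_t>(tail[1]) << 8;
      [[fallthrough]];
    case 1:
      k1 ^= static_cast<uint32_t>(tail[0]);
      h1 ^= mixK1(k1);
      break;
    default:
      break;
  }
  return fmix32(h1, static_cast<uint32_t>(len));
}

// Assumed shape of the shared helper: a 64-bit value is hashed as two
// 32-bit blocks (low word first), which equals hashing its 8
// little-endian bytes.
inline uint32_t hashInt64(uint64_t value, uint32_t seed) {
  uint32_t h1 = mixH1(seed, mixK1(static_cast<uint32_t>(value)));
  h1 = mixH1(h1, mixK1(static_cast<uint32_t>(value >> 32)));
  return fmix32(h1, 8);
}

} // namespace sketch
```

An 8-byte value has no tail, which is presumably why the 64-bit path can be shared between the Iceberg and Spark SQL variants. The bucket-transform hash values listed in Appendix B of the Iceberg table spec are a convenient cross-check against the Java implementation.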

jinchengchenghh added 3 commits July 7, 2025 10:35

support iceberg Hash function

e612d1f

Move hash to util

594ce3a

refactor iceberg util

8c488c3

jinchengchenghh closed this Jul 7, 2025
