Commit abb5bff
refactor: replace Vec<usize> dict key collection with Arrow eq/gt_eq/lt kernels
Address maintainer review feedback on PR #2423:
1. Eliminate eager Vec<usize> materialization of dictionary keys. Instead,
use the Arrow array directly with eq/gt_eq/lt compute kernels and
BitSliceIterator for efficient dict value index → row mapping.
2. Replace unreachable!() with proper Error::UnsupportedDictionaryKeyType
variant for invalid dictionary key types.
3. Add separate UInt8 and UInt16 code paths to avoid type erasure overhead.
UInt8 keys are handled via find_rename_collisions_dict_u8(); UInt16 keys
use dict_value_range_to_row_mask() helper with BitSliceIterator.
4. Add test_rename_collision_with_real_delete_dict_u16 test exercising the
UInt16 dictionary key code path.
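
The approach in points 1 to 3 can be illustrated with a minimal, stdlib-only sketch. It is not the code from this commit and does not call the Arrow kernels: `DictKeys`, `value_range_to_row_mask`, `set_runs`, and `rows_for_value_range` are hypothetical names, and only the `UnsupportedDictionaryKeyType` variant name is taken from the message above. The mask step stands in for AND-ing the results of `gt_eq` and `lt`, and `set_runs` stands in for Arrow's `BitSliceIterator`, which iterates over contiguous runs of set bits.

```rust
use std::ops::Range;

/// Hypothetical error type; only the variant name comes from the commit message.
#[derive(Debug, PartialEq)]
enum Error {
    UnsupportedDictionaryKeyType(String),
}

/// Stand-in for a dictionary key column (the real code works on Arrow arrays).
enum DictKeys {
    U8(Vec<u8>),
    U16(Vec<u16>),
    Other(String),
}

/// Mark rows whose key falls in `range`: the element-wise equivalent of
/// AND-ing a `keys >= range.start` mask with a `keys < range.end` mask.
fn value_range_to_row_mask<K: Copy + Into<usize>>(keys: &[K], range: &Range<usize>) -> Vec<bool> {
    keys.iter().map(|&k| range.contains(&k.into())).collect()
}

/// Collapse a mask into half-open `(start, end)` runs of consecutive `true`
/// rows, so callers handle whole runs instead of one row index at a time.
fn set_runs(mask: &[bool]) -> Vec<(usize, usize)> {
    let mut runs = Vec::new();
    let mut start: Option<usize> = None;
    for (i, &set) in mask.iter().enumerate() {
        match (set, start) {
            (true, None) => start = Some(i),
            (false, Some(s)) => {
                runs.push((s, i));
                start = None;
            }
            _ => {}
        }
    }
    // Close a run that extends to the last row.
    if let Some(s) = start {
        runs.push((s, mask.len()));
    }
    runs
}

/// Dispatch per key width without first copying keys into a Vec<usize>;
/// unknown key types return an error instead of hitting unreachable!().
fn rows_for_value_range(keys: &DictKeys, range: Range<usize>) -> Result<Vec<(usize, usize)>, Error> {
    match keys {
        DictKeys::U8(k) => Ok(set_runs(&value_range_to_row_mask(k.as_slice(), &range))),
        DictKeys::U16(k) => Ok(set_runs(&value_range_to_row_mask(k.as_slice(), &range))),
        DictKeys::Other(name) => Err(Error::UnsupportedDictionaryKeyType(name.clone())),
    }
}
```

For example, with UInt16 keys `[0, 2, 3, 7, 2, 1]` and dictionary value range `2..4`, `rows_for_value_range` returns the row runs `(1, 3)` and `(4, 5)`.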
Performance: the previously regressed dict_keys/single_replace_no_deletes
benchmark now improves by 60% to 69% (it previously regressed by 172%).
Signed-off-by: Gyanranjan Panda <gyanranjanpanda438@gmail.com>
1 parent 1fb1c23 commit abb5bff
1 file changed
Lines changed: 358 additions & 49 deletions