Skip to content

Fix duplicate attribute keys in transform_attributes#2423

Open
gyanranjanpanda wants to merge 8 commits intoopen-telemetry:mainfrom
gyanranjanpanda:fix-duplicate-attributes-1650
Open

Fix duplicate attribute keys in transform_attributes#2423
gyanranjanpanda wants to merge 8 commits intoopen-telemetry:mainfrom
gyanranjanpanda:fix-duplicate-attributes-1650

Conversation

@gyanranjanpanda
Copy link
Copy Markdown
Contributor

@gyanranjanpanda gyanranjanpanda commented Mar 24, 2026

Fix Duplicate Attribute Keys in transform_attributes

Changes Made

This PR resolves issue #1650 by ensuring that dictionary keys are deduplicated when transformations such as rename are applied, as required by the OpenTelemetry specification ("Exported maps MUST contain only unique keys by default").

To accomplish this while maintaining strict performance requirements, we replaced the previous RowConverter deduplication strategy with a new high-performance, proactive pre-filter:

  • We injected filter_rename_collisions into transform_attributes_impl inside otap-dataflow/crates/pdata/src/otap/transform.rs.
  • Before a rename is processed, this function reads the parent_ids and target keys. It uses the IdBitmap type to find any existing target keys whose parent_id maps back to an old key that will be renamed.
  • It proactively strips those collision rows from the batch via arrow::compute::filter_record_batch before the actual transform happens.

Testing

  • Extended the AttributesProcessor unit tests (test_rename_removes_duplicate_keys) to explicitly verify that renaming an attribute resulting in a collision automatically discards duplicate keys.
  • Extended the AttributesTransformPipelineStage in query-engine tests with a parallel case ensuring OPL/KQL query pipelines (project-rename) properly drop duplicates when resolving duplicates.
  • Refactored otap_df_pdata transform.rs tests to properly expect deduplicated keys using this plan-based method.
  • Validated logic with cargo test --workspace --all-features.

Validation Results

All tests pass. OTel semantic rules surrounding unique mapped keys map cleanly through down/upstream processors. The IdBitmap intersection approach completely resolves the multi-thousand percent RowConverter performance regressions, dropping collision resolution overhead to essentially zero through efficient bitmap operations.

@gyanranjanpanda gyanranjanpanda requested a review from a team as a code owner March 24, 2026 20:42
@github-actions github-actions Bot added rust Pull requests that update Rust code query-engine Query Engine / Transform related tasks query-engine-columnar Columnar query engine which uses DataFusion to process OTAP Batches labels Mar 24, 2026
@codecov
Copy link
Copy Markdown

codecov Bot commented Mar 24, 2026

Codecov Report

❌ Patch coverage is 89.37785% with 70 lines in your changes missing coverage. Please review.
✅ Project coverage is 88.21%. Comparing base (9c54c8e) to head (1fb1c23).
⚠️ Report is 2 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2423      +/-   ##
==========================================
- Coverage   88.23%   88.21%   -0.02%     
==========================================
  Files         639      639              
  Lines      242568   243089     +521     
==========================================
+ Hits       214018   214451     +433     
- Misses      28026    28114      +88     
  Partials      524      524              
Components Coverage Δ
otap-dataflow 89.87% <89.37%> (-0.02%) ⬇️
query_abstraction 80.61% <ø> (ø)
query_engine 90.75% <ø> (+<0.01%) ⬆️
otel-arrow-go 52.45% <ø> (ø)
quiver 92.25% <ø> (ø)
🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@gyanranjanpanda gyanranjanpanda force-pushed the fix-duplicate-attributes-1650 branch 2 times, most recently from a210873 to 361e6bd Compare March 24, 2026 22:43
@gyanranjanpanda
Copy link
Copy Markdown
Contributor Author

@albertlockett and @ThomsonTan waiting for your feed backl

Copy link
Copy Markdown
Member

@albertlockett albertlockett left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @gyanranjanpanda . I appreciate you taking the time to look at this, but I don't think we can accept this PR as is.

Unfortunately, the benchmarks we have for this code on main are currently broken. But when I apply the fix from #2426 and run the benchmark we see that this change introduces significant performance regression:

transform_attributes_dict_keys/single_replace_no_deletes/keys=32,rows=128,rows_per_key=4
                        time:   [5.1300 µs 5.1348 µs 5.1394 µs]
                        change: [+1027.4% +1031.5% +1035.2%] (p = 0.00 < 0.05)
                        Performance has regressed.


transform_attributes_dict_keys/single_replace_single_delete/keys=32,rows=128,rows_per_key=4
                        time:   [5.5027 µs 5.5091 µs 5.5155 µs]
                        change: [+495.01% +497.37% +499.48%] (p = 0.00 < 0.05)
                        Performance has regressed.

transform_attributes_dict_keys/no_replace_single_delete/keys=32,rows=128,rows_per_key=4
                        time:   [5.3440 µs 5.3584 µs 5.3746 µs]
                        change: [+577.41% +580.27% +583.40%] (p = 0.00 < 0.05)
                        Performance has regressed.

transform_attributes_dict_keys/single_replace_no_deletes/keys=32,rows=1536,rows_per_key=48
                        time:   [34.015 µs 34.050 µs 34.086 µs]
                        change: [+4000.2% +4016.4% +4031.3%] (p = 0.00 < 0.05)
                        Performance has regressed.

transform_attributes_dict_keys/single_replace_single_delete/keys=32,rows=1536,rows_per_key=48
                        time:   [34.390 µs 34.472 µs 34.562 µs]
                        change: [+1421.9% +1433.5% +1443.9%] (p = 0.00 < 0.05)
                        Performance has regressed.

transform_attributes_dict_keys/no_replace_single_delete/keys=32,rows=1536,rows_per_key=48
                        time:   [34.302 µs 34.340 µs 34.379 µs]
                        change: [+1562.1% +1568.0% +1573.6%] (p = 0.00 < 0.05)
                        Performance has regressed.

transform_attributes_dict_keys/single_replace_no_deletes/keys=32,rows=8192,rows_per_key=256
                        time:   [171.62 µs 171.78 µs 171.96 µs]
                        change: [+6262.2% +6290.6% +6316.2%] (p = 0.00 < 0.05)
                        Performance has regressed.

transform_attributes_dict_keys/single_replace_single_delete/keys=32,rows=8192,rows_per_key=256
                        time:   [171.79 µs 171.92 µs 172.06 µs]
                        change: [+1771.2% +1835.7% +1893.0%] (p = 0.00 < 0.05)
                        Performance has regressed.

transform_attributes_dict_keys/no_replace_single_delete/keys=32,rows=8192,rows_per_key=256
                        time:   [171.20 µs 171.35 µs 171.49 µs]
                        change: [+1962.8% +1981.5% +1998.1%] (p = 0.00 < 0.05)
                        Performance has regressed.

transform_attributes_dict_keys/single_replace_no_deletes/keys=128,rows=128,rows_per_key=1
                        time:   [4.9566 µs 4.9693 µs 4.9819 µs]
                        change: [+587.52% +592.02% +597.47%] (p = 0.00 < 0.05)
                        Performance has regressed.

transform_attributes_dict_keys/single_replace_single_delete/keys=128,rows=128,rows_per_key=1
                        time:   [5.6185 µs 5.6284 µs 5.6377 µs]
                        change: [+292.54% +294.19% +296.01%] (p = 0.00 < 0.05)
                        Performance has regressed.

transform_attributes_dict_keys/no_replace_single_delete/keys=128,rows=128,rows_per_key=1
                        time:   [5.2733 µs 5.2831 µs 5.2938 µs]
                        change: [+385.50% +387.73% +389.92%] (p = 0.00 < 0.05)
                        Performance has regressed.

While I expect to see some performance regression because we're doing extra work, I feel that such a serious regression in performance warrants some additional investigation into if/how we can do this in a more efficient way.

Please see my comment here which prescribes an approach that I believe will be more performant than what is currently in this PR: #1650 (comment)

@gyanranjanpanda gyanranjanpanda force-pushed the fix-duplicate-attributes-1650 branch from 361e6bd to 06392eb Compare March 24, 2026 22:57
@gyanranjanpanda
Copy link
Copy Markdown
Contributor Author

thanks for your wonderful guidance i will make sure i could match your expectation

@albertlockett
Copy link
Copy Markdown
Member

Hey @gyanranjanpanda I wanted to give you a heads up that I am going to be working on #2014 and there may be some significant changes to the transform_attributes code. I will be touching code in transform_keys as well as transform_attributes_impl. I wanted to give you a heads up in case you want to hold off advancing your work until you can better understand the conflicts

@gyanranjanpanda
Copy link
Copy Markdown
Contributor Author

Thanks for the heads up! I’ll keep an eye on your changes to #2014 and try to align my work accordingly. If possible, could you share which parts might be most affected so I can avoid overlap? or should i wait after u finished your work i should continue this work

@albertlockett
Copy link
Copy Markdown
Member

Thanks for the heads up! I’ll keep an eye on your changes to #2014 and try to align my work accordingly. If possible, could you share which parts might be most affected so I can avoid overlap? or should i wait after u finished your work i should continue this work

It's probably easiest to hold off until I finish to avoid conflicts, but I'll leave it up to you. I think I should have the changes I need to make for #2014 done by early next week, if not sooner.

For now, I'll show you the in-progress changes:
https://github.com/open-telemetry/otel-arrow/compare/main...albertlockett:otel-arrow:albert/2014?expand=1

I was imagining that for #1650 you'd need to make changes to plan_key_replacements or plan_key_deletes (which actually haven't been modified) to produce ranges to be deleted in transform_keys.

@albertlockett
Copy link
Copy Markdown
Member

@gyanranjanpanda the changes I mentioned that could cause conflicts have now been merged (see #2442)

@gyanranjanpanda
Copy link
Copy Markdown
Contributor Author

i will fix this code as soon as possible while looking your merged pr

@gyanranjanpanda gyanranjanpanda force-pushed the fix-duplicate-attributes-1650 branch 4 times, most recently from 2d813af to 67e366e Compare March 31, 2026 19:47
@gyanranjanpanda
Copy link
Copy Markdown
Contributor Author

@albertlockett Thanks for the detailed benchmark feedback! I have completely reworked the approach based on your guidance.

What changed:

  • Replaced the old RowConverter + filter_record_batch approach with a plan-based collision detection strategy
  • Uses IdBitmap (as you suggested) to efficiently detect rename collisions in O(N)
  • Generates KeyTransformRange::Delete entries that are merged into the existing transform_keys / transform_dictionary_keys pipeline
  • No physical batch filtering — collision rows are skipped naturally during materialization
  • Only runs collision detection when parent_ids are plain-encoded (skips transport-optimized batches)

Benchmark results (no regression):

Benchmark Old PR (your review) Current PR
single_replace_no_deletes/keys=32,rows=128 5.13 µs (+1031%) 695 ns
single_replace_single_delete/keys=32,rows=128 5.51 µs (+497%) 1.36 µs
no_replace_single_delete/keys=32,rows=128 5.34 µs (+580%) 1.14 µs
single_replace_no_deletes/keys=32,rows=1536 34.0 µs (+4016%) 1.16 µs
single_replace_single_delete/keys=32,rows=1536 34.5 µs (+1434%) 3.37 µs
no_replace_single_delete/keys=32,rows=1536 34.3 µs (+1568%) 3.14 µs

The plan-based approach avoids the expensive RowConverter sorting and physical batch copy entirely. Would love your re-review when you get a chance!

…metry#1650)

When renaming attribute key 'x' to 'y', any existing row with key 'y'
sharing a parent_id with a row having key 'x' would produce a
duplicate. This commit fixes that by:

- Adding find_rename_collisions_to_delete_ranges() which uses IdBitmap
  to efficiently detect these collisions in O(N) time
- Generating KeyTransformRange::Delete entries that are merged into the
  existing transform pipeline in transform_keys() and
  transform_dictionary_keys()
- Fixing an early-return in transform_dictionary_keys() that skipped
  row-level collision deletes when dictionary values had no deletions
- Adding read_parent_ids_as_u32() helper for parent_id column access
- Adding test_rename_removes_duplicate_keys integration test

Only runs collision detection when parent_ids are plain-encoded (not
transport-optimized) to avoid incorrect results from quasi-delta
encoded values.

Closes open-telemetry#1650
@gyanranjanpanda gyanranjanpanda force-pushed the fix-duplicate-attributes-1650 branch from 67e366e to 71f2ee6 Compare March 31, 2026 21:15
Copy link
Copy Markdown
Member

@albertlockett albertlockett left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like some good progress, but still some things happening that are not as well optimized as they could be

Comment thread rust/otap-dataflow/crates/pdata/src/otap/transform.rs Outdated
Comment thread rust/otap-dataflow/crates/pdata/src/otap/transform.rs Outdated
Comment thread rust/otap-dataflow/crates/pdata/src/otap/transform.rs Outdated
Comment thread rust/otap-dataflow/crates/pdata/src/otap/transform.rs Outdated
Comment thread rust/otap-dataflow/crates/pdata/src/otap/transform.rs Outdated
Comment thread rust/otap-dataflow/crates/pdata/src/otap/transform.rs Outdated
Comment thread rust/otap-dataflow/crates/pdata/src/otap/transform.rs Outdated
Comment thread rust/otap-dataflow/crates/pdata/src/otap/transform.rs Outdated
…anges

Addresses @albertlockett's review feedback:
- Extract sorted_merge_into_vec helper to DRY up sorted merge pattern
- Extend merge_transform_ranges to accept collision_delete_ranges as a
  third parameter, performing a single-pass 3-way merge
- Remove duplicate sorted-merge code from transform_keys and
  transform_dictionary_keys
- Preserve zero-copy Cow::Borrowed fast path when no collision deletes
  are present
- Add missing FieldExt trait import in upsert_tests module so the
  simultaneous rename+delete collision tests compile
- Add parent_id column to 4 pre-existing tests that broke after
  enforcing parent_id as required per OTAP spec:
  - test_transform_attrs_keys_dict_encoded
  - test_transform_attrs_u16_keys
  - test_with_stats_utf8_rename_and_delete
  - test_with_stats_dict_rename_and_delete

All 291 transform tests and 22 attributes_processor tests pass.
Comment on lines +2001 to +2005
let old_key_mask = eq(key_col, &StringArray::new_scalar(old_key)).map_err(|e| {
Error::UnexpectedRecordBatchState {
reason: format!("eq kernel failed for old_key: {e}"),
}
})?;
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is still quite a performance regression from what is on main. For example:

transform_attributes_native_keys/block_replace_no_delete/rows=1536
                        time:   [5.9484 µs 5.9712 µs 5.9930 µs]
                        change: [+241.53% +256.29% +269.79%] (p = 0.00 < 0.05)
                        Performance has regressed.

When I profile this, I see we're spending a lot of time in the eq compute kernel:
image

I think we need to optimize how we check for the presence of the existing keys.

We actually have a highly optimized kernel for checking if the keys match some given value, which I think is what we should use here:

// find the contiguous ranges in the values buffer that match the targets in the byte buffer
fn find_matching_key_ranges(
array_len: usize,
values_buf: &Buffer,
offsets: &OffsetBuffer<i32>,
target_bytes: &[Vec<u8>],
range_type: KeyTransformRangeType,
) -> Result<KeyTransformTargetRanges> {
let mut ranges = Vec::new();
let mut total_matches = 0;
let mut counts = vec![0; target_bytes.len()];
// we're going to access the raw offsets pointer directly while doing this range computation
// (see comments below for reasoning), so this check is for safety
if offsets.len() < array_len + 1 {
return Err(Error::UnexpectedRecordBatchState {
reason: "StringArray offsets has unexpected length".into(),
});
}
let offset_ptr = offsets.as_ptr();
for target_idx in 0..target_bytes.len() {
let target_bytes = &target_bytes[target_idx];
let count = counts
.get_mut(target_idx)
.expect("counts should be initialized");
let mut eq_range_start = None;
let target_len = target_bytes.len();
for i in 0..array_len {
// accessing the offsets using the pointer here is much faster than indexing the offsets
// buffer as offsets[i], because we skip doing the bounds check on each iteration.
// Safety: we've already checked that offsets.len() >= len + 1
#[allow(unsafe_code)]
let val_start = unsafe { *offset_ptr.add(i) } as usize;
#[allow(unsafe_code)]
let val_end = unsafe { *offset_ptr.add(i + 1) } as usize;
if val_end - val_start == target_len {
let value = &values_buf[val_start..val_end];
if value == target_bytes {
total_matches += 1;
*count += 1;
if eq_range_start.is_none() {
eq_range_start = Some(i);
}
continue;
}
}
// if we're here, we've found a non matching value
if let Some(s) = eq_range_start.take() {
// close current range
ranges.push(KeyTransformRange {
range: Range { start: s, end: i },
idx: target_idx,
range_type,
});
}
}
// add the final trailing range
if let Some(s) = eq_range_start {
ranges.push(KeyTransformRange {
range: Range {
start: s,
end: array_len,
},
idx: target_idx,
range_type,
});
}
}
// Sort the ranges to replace by start_index (first element in contained tuple)
ranges.sort_unstable_by_key(|r| r.start());
Ok(KeyTransformTargetRanges {
ranges,
counts,
total_matches,
})
}

One caveat about this is that, it only works on the offset/values buffer from the arrow string arrays. That means if the keys column happens to be dictionary encoded, we can only run this on the dictionary values.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, because we're checking for the existence of the old keys both in this method, and in plan_key_replacements:
https://github.com/open-telemetry/otel-arrow/blob/main/rust/otap-dataflow/crates/pdata/src/otap/transform.rs#L2282

It'd be nice if we can avoid checking that twice, but we'd need to dramatically refactor how this function is called in order to do that (which we may want to do).

If that refactoring is not possible, we could consider the fact that it may be somewhat rare that someone may have existing keys that would become duplicates via renaming. Given this fact, imo it might be better logic here to first check for the existing of the new key, and if no rows are found, we exist early (instead of how we're currently doing it, where we look for old keys first).

Comment on lines +2007 to +2016
for (start, end) in BitSliceIterator::new(old_key_mask.values().inner(), 0, num_rows) {
for i in start..end {
let pid: u64 = parent_ids.value(i).into();
source_parents.insert(pid as u32);
}
}

if source_parents.is_empty() {
continue;
}
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Careful about the unnecessary work here - we don't actually need to load all the IDs into the ID bitmap before returning early. We may be able to use the result of having checked for the existence of the key to determine if we can continue early on this iteration of the loop.

Comment on lines +1905 to +1910
let mask = eq(key_col, &scalar).map_err(|e| Error::UnexpectedRecordBatchState {
reason: format!("eq kernel failed on attribute keys: {e}"),
})?;
if mask.true_count() > 0 {
return Ok(true);
}
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similar to the comment I've made on the code below, the eq compute kernel is kind of expensive, and we might be able to use the more optimized

// find the contiguous ranges in the values buffer that match the targets in the byte buffer
fn find_matching_key_ranges(
array_len: usize,
values_buf: &Buffer,
offsets: &OffsetBuffer<i32>,
target_bytes: &[Vec<u8>],
range_type: KeyTransformRangeType,
) -> Result<KeyTransformTargetRanges> {

That said, I also feel that the first step of find_rename_collisions_to_delete_ranges maybe should be be to call this for the new keys (again, see my comments on code below), so given that fact - we might want to be careful about doing duplicate work.

Copy link
Copy Markdown
Member

@albertlockett albertlockett left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hey @gyanranjanpanda - sorry it took me some time to review the last round of changes. The code looks like it's in much better shape, thanks for all your work!

The performance is still not quite where we need it to be. I left some suggestions about how things could maybe be improved - specifically around how we're using the eq kernel.

I also noticed that these changes break the existing benchmarks, which would make this perf regression hard for you to measure locally. I've pushed a fix to my branch here: 32a55e5 (which you may actually want to cherry-pick).

FWIW, instructions for profiling have also beend added here https://github.com/open-telemetry/otel-arrow/blob/main/rust/otap-dataflow/PROFILING.md . It's possible to use these same commands while running the benchmarks.

gyanranjanpanda added a commit to gyanranjanpanda/otel-arrow that referenced this pull request Apr 24, 2026
Addresses mentor review feedback on PR open-telemetry#2423:

1. Replace expensive arrow eq() compute kernel in
   find_rename_collisions_to_delete_ranges with direct offset/values
   buffer comparison (matching the optimized kernel pattern used
   by find_matching_key_ranges). This eliminates the kernel dispatch
   overhead that was causing 6000%+ latency regressions in benchmarks.

2. Reorder collision logic to check new_key (target) first. Since
   collisions are rare (the renamed target key rarely already exists),
   this provides an early exit in the common case before any IdBitmap
   work is done.

3. Defer IdBitmap population until after confirming both old_key and
   new_key exist, avoiding unnecessary bitmap allocations and clears.

4. Rewrite rename_has_target_key_in_column to use the same optimized
   raw buffer scan instead of the eq kernel.

5. Add parent_id column to generate_native_keys_attr_batch in benchmarks
   (cherry-pick from mentor's commit 32a55e5) to fix benchmark failures
   with the collision detection code that now requires parent_id.

Also adds extract_dict_string_values and key_bytes_exist_in_buffer
helper functions that handle both native StringArray and dictionary-
encoded key columns.
gyanranjanpanda added a commit to gyanranjanpanda/otel-arrow that referenced this pull request Apr 24, 2026
Addresses mentor review feedback on PR open-telemetry#2423:

1. Replace expensive arrow eq() compute kernel in
   find_rename_collisions_to_delete_ranges with direct offset/values
   buffer comparison (matching the optimized kernel pattern used
   by find_matching_key_ranges). This eliminates the kernel dispatch
   overhead that was causing 6000%+ latency regressions in benchmarks.

2. Reorder collision logic to check new_key (target) first. Since
   collisions are rare (the renamed target key rarely already exists),
   this provides an early exit in the common case before any IdBitmap
   work is done.

3. Defer IdBitmap population until after confirming both old_key and
   new_key exist, avoiding unnecessary bitmap allocations and clears.

4. Rewrite rename_has_target_key_in_column to use the same optimized
   raw buffer scan instead of the eq kernel.

5. Add parent_id column to generate_native_keys_attr_batch in benchmarks
   (cherry-pick from mentor's commit 32a55e5) to fix benchmark failures
   with the collision detection code that now requires parent_id.

Also adds extract_dict_string_values and key_bytes_exist_in_buffer
helper functions that handle both native StringArray and dictionary-
encoded key columns.
@gyanranjanpanda gyanranjanpanda force-pushed the fix-duplicate-attributes-1650 branch from 47b7ec4 to 0762cc5 Compare April 24, 2026 04:18
Addresses mentor review feedback on PR open-telemetry#2423:

1. Replace expensive arrow eq() compute kernel in
   find_rename_collisions_to_delete_ranges with direct offset/values
   buffer comparison (matching the optimized kernel pattern used
   by find_matching_key_ranges). This eliminates the kernel dispatch
   overhead that was causing 6000%+ latency regressions in benchmarks.

2. Reorder collision logic to check new_key (target) first. Since
   collisions are rare (the renamed target key rarely already exists),
   this provides an early exit in the common case before any IdBitmap
   work is done.

3. Defer IdBitmap population until after confirming both old_key and
   new_key exist, avoiding unnecessary bitmap allocations and clears.

4. Rewrite rename_has_target_key_in_column to use the same optimized
   raw buffer scan instead of the eq kernel.

5. Add parent_id column to generate_native_keys_attr_batch in benchmarks
   (cherry-pick from mentor's commit 32a55e5) to fix benchmark failures
   with the collision detection code that now requires parent_id.

Also adds extract_dict_string_values and key_bytes_exist_in_buffer
helper functions that handle both native StringArray and dictionary-
encoded key columns.
@gyanranjanpanda gyanranjanpanda force-pushed the fix-duplicate-attributes-1650 branch from 0762cc5 to 1fb1c23 Compare April 24, 2026 05:04
@gyanranjanpanda
Copy link
Copy Markdown
Contributor Author

@albertlockett could u rview this now

Copy link
Copy Markdown
Member

@albertlockett albertlockett left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the latest round of changes @gyanranjanpanda !

Still some performance issues with this code that I feel we should address

Comment on lines +2096 to +2118
let dict_keys: Vec<usize> = match key_col.data_type() {
DataType::Dictionary(k, _) => match k.as_ref() {
DataType::UInt8 => key_col
.as_any()
.downcast_ref::<DictionaryArray<UInt8Type>>()
.expect("checked type")
.keys()
.values()
.iter()
.map(|v| *v as usize)
.collect(),
DataType::UInt16 => key_col
.as_any()
.downcast_ref::<DictionaryArray<UInt16Type>>()
.expect("checked type")
.keys()
.values()
.iter()
.map(|v| *v as usize)
.collect(),
_ => unreachable!("unsupported dict key type"),
},
_ => unreachable!("checked dictionary type"),
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We're eagerly collecting the dictionary keys into a Vec<usize>, and later on we're actually just iterating over the vec. Doing this collection seems wasteful.

I noticed we still have a performance regression in one of the existing benchmarks:

transform_attributes_dict_keys/single_replace_no_deletes/keys=128,rows=8192,rows_per_key=64
                        time:   [1.4321 µs 1.4362 µs 1.4401 µs]
                        change: [+172.60% +174.04% +175.64%] (p = 0.00 < 0.05)
                        Performance has regressed.

And indeed, we're spending a lot of time in collect:
image

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I actually think we can avoid materializing the vec, and just take the array. See my comment on the code below.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, we have variants of crate::error::Error that can be used for invalid dictionary types instead of using unreachable! here. I wonder if we should either use those, or comment on why the code is actually unreachable

Comment on lines +2167 to +2174
for dict_val_idx in range.start()..range.end() {
for (row, dk) in dict_keys.iter().enumerate() {
if *dk == dict_val_idx {
let pid: u64 = parent_ids.value(row).into();
source_parents.insert(pid as u32);
}
}
}
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For each value in the range, we iterate the entire dictionary keys array and check if the index from the range is equal to the key? If this range has a size of greater than one, this is not a very efficient way to do this check.

I actually think if you just took the dictionary keys as an arrow array (e.g. avoid materializing the Vec, as mentioned above). Then here it would be faster to do something like:

let row_mask = if range.len() == 1 {
     eq(dict_keys, UInt16Array::new_scalar(range.start() as u16))?
} else {
    let geq_start = gt_eq(dict_keys, UInt16Array::new_scalar(range.start() as u16))?;
    let lt_end = lt(dict_keys, UInt16Array::new_scalar(range.end() as u16))?;
    and(&geq_start, &lt_end)
}

let row_mask_buffer = row_mask.values();
for (start, end) in BitSliceIter::new(row_mask_buffer.inner(), row_mask_buffer.offset(), row_mask.len()) {
    for i in start..end {
         let pid: u64 = parent_ids.value(i).into();
         source_parents.insert(pid as u32);
    }
}

see: https://docs.rs/arrow/latest/arrow/compute/kernels/cmp/index.html
see: https://docs.rs/arrow-buffer/latest/arrow_buffer/bit_iterator/struct.BitSliceIterator.html

/// dictionary-encoded attribute keys. Verifies that collision removal and real
/// deletes interact correctly through the dictionary key transform path.
#[test]
fn test_rename_collision_with_real_delete_dict() {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could you add an additional test where the dict key is for the key column is u16 as well?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

query-engine Query Engine / Transform related tasks query-engine-columnar Columnar query engine which uses DataFusion to process OTAP Batches rust Pull requests that update Rust code

Projects

Status: No status

Development

Successfully merging this pull request may close these issues.

2 participants