Decouple from `cudf::detail::make_counting_transform_iterator` by mythrocks · Pull Request #4306 · NVIDIA/spark-rapids-jni

mythrocks · 2026-02-24T00:26:26Z

This commit introduces utility iterators to be used in place cudf::detail iterators. This is to further reduce dependencies on cudf::detail APIs that are now deemed private to the CUDF project.

`make_counting_transform_iterator`

This change introduces a version of make_counting_transform_iterator that is specific to Spark RAPIDS JNI.

The previous version of this function is from cudf::detail, which is now deemed private to cuDF. This commit should allow Spark RAPIDS JNI to be insulated from changes to interfaces in cudf::detail.

Note that this version does not use thrust::transform_iterator. It banks instead on cuda::make_transform_iterator instead.

This change introduces a version of `make_counting_transform_iterator` that is specific to Spark RAPIDS JNI. The previous version of this function is from `cudf::detail`, which is now deemed private to cuDF. This commit should allow Spark RAPIDS JNI to be insulated from changes to interfaces in `cudf::detail`. Note that this version does not use `thrust::transform_iterator`. It banks instead on `cuda::make_transform_iterator` instead. Signed-off-by: MithunR <mithunr@nvidia.com>

greptile-apps · 2026-02-24T00:28:43Z

Greptile Summary

This PR achieves two goals: it decouples Spark RAPIDS JNI from cudf::detail::make_counting_transform_iterator by introducing a local replacement (spark_rapids_jni::util::make_counting_transform_iterator in utilities/iterator.cuh), and simultaneously extends the Bloom filter implementation to support both Spark's V1 and V2 formats, including a configurable seed for V2.

Key changes:

utilities/iterator.cuh — New thin wrapper that uses cuda::counting_iterator + thrust::make_transform_iterator instead of thrust::counting_iterator. Replaces cudf::detail::make_counting_transform_iterator in all call sites across row_conversion.cu, shuffle_split.cu, shuffle_assemble.cu, hyper_log_log_plus_plus*.cu, from_json_to_structs.cu, get_json_object.cu, and the corresponding tests.
Bloom filter V2 support — Adds bloom_filter_header_v1 (12 bytes) and bloom_filter_header_v2 (16 bytes) structs with static_assert size checks. V2 uses a 64-bit combined hash starting value (h1 * INT32_MAX) and supports a hash seed, matching BloomFilterImplV2.java. Both versions share a common gpu_bit_to_word_mask helper that preserves the big-endian byte-swizzle for Spark compatibility.
Correctness improvements — unpack_bloom_filter now takes cudf::device_span<uint8_t const> (removing a previous const_cast), cuda::atomic_ref::fetch_or replaces raw atomicOr, and rmm::exec_policy_nosync is used where the stream is already tracked.
Java/JNI — BloomFilter.java gains VERSION_1, VERSION_2, DEFAULT_SEED constants and a new create(version, numHashes, bloomFilterBits, seed) method; the old two-arg create is deprecated but retained for backward compatibility.
Note: The PR description mentions make_pair_iterator as a second deliverable, but this function is not present in the diff. Only make_counting_transform_iterator is introduced.

Confidence Score: 5/5

Safe to merge; all remaining findings are P2 style/consistency suggestions with no correctness or data-integrity impact.
The iterator replacement is mechanically correct and well-tested. The V2 bloom filter implementation matches the Spark reference algorithm faithfully (64-bit hash, seed, same bit-swizzle). Atomics were upgraded to cuda::atomic_ref, const-correctness was improved, and both Java and C++ APIs are backward compatible. Three P2 issues remain: a signed-unsigned comparison without explicit cast in bloom_filter_probe, a JNI upper-bound that is slightly looser than the C++ guard, and a minor static-assert restriction in the new iterator header that excludes uint32_t start values. None of these cause incorrect behavior at any input currently reachable in production.
src/main/cpp/src/bloom_filter.cu (signed-unsigned cast on line ~696), src/main/cpp/src/BloomFilterJni.cpp (JNI bound vs C++ bound mismatch), src/main/cpp/src/utilities/iterator.cuh (static_assert excludes uint32_t)

Important Files Changed

Filename	Overview
src/main/cpp/src/utilities/iterator.cuh	New utility header introducing `spark_rapids_jni::util::make_counting_transform_iterator`, replacing `cudf::detail::make_counting_transform_iterator` across the codebase. Uses `cuda::counting_iterator` + `thrust::make_transform_iterator`. Static-assert restricts `CountingIterType` to ≤ 32-bit signed width (≤ `int32_t` digits), which excludes `uint32_t` — consistent with the cudf convention it replaces.
src/main/cpp/src/bloom_filter.cu	Major refactor adding V2 bloom filter support alongside V1. Introduces version-templated `gpu_bloom_filter_put` and `bloom_probe_functor`, 64-bit hash indexing for V2, configurable seeds, `cuda::atomic_ref::fetch_or` for atomics, and `rmm::exec_policy_nosync`. Minor inconsistency: `bloom_filter.size()` (int32_t) compared to `size_t` without explicit cast in `bloom_filter_probe` (unlike the analogous checks in `bloom_filter_put` and `bloom_filter_merge`).
src/main/cpp/src/bloom_filter.hpp	Adds explicit V1/V2 header structs with `static_assert` size checks, version constants, a `bloom_filter_header_size_for_version` helper, and updates `bloom_filter_create` signature to include `version` and `seed`.
src/main/cpp/src/BloomFilterJni.cpp	JNI bridge updated to pass `version` and `seed` to the C++ API. Adds input validation via `JNI_ARG_CHECK` capping `bloomFilterBits` at `INT32_MAX * 64`; however, `get_bloom_filter_stride` in C++ caps `buf_size` at `INT32_MAX`, making the effective maximum `bloom_filter_longs` ~268 M (not `INT32_MAX`). Inputs above this limit will throw a `CUDF_EXPECTS` rather than the JNI argument-check error.
src/main/java/com/nvidia/spark/rapids/jni/BloomFilter.java	Public API updated with `VERSION_1`, `VERSION_2`, `DEFAULT_SEED` constants and a new `create(version, numHashes, bloomFilterBits, seed)` overload. Old `create(numHashes, bloomFilterBits)` is deprecated but kept for backward compatibility, delegating to V1/seed-0.
src/main/cpp/src/row_conversion.cu	Mechanical replacement of `cudf::detail::make_counting_transform_iterator` → `spark_rapids_jni::util::make_counting_transform_iterator`, and `util::` arithmetic helpers → `cudf::util::` (adding explicit namespace). No logic changes.
src/main/cpp/tests/bloom_filter.cu	Existing V1 tests renamed with `V1` suffix and updated for new API signatures. New V2 tests added covering initialization, build-and-probe, null handling, merge, absent-key probing, and seed variation.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A["bloom_filter_create(version, num_hashes, longs, seed)"] --> B{version?}
    B -->|V1| C["pack bloom_filter_header_v1\n(12 bytes: version, num_hashes, num_longs)"]
    B -->|V2| D["pack bloom_filter_header_v2\n(16 bytes: version, num_hashes, seed, num_longs)"]
    C --> E["memset bit array to 0"]
    D --> E

    E --> F["bloom_filter_put(filter, input)"]
    F --> G["unpack_bloom_filter()\nreturns header, buffer, bits, seed"]
    G --> H{version?}
    H -->|V1| I["gpu_bloom_filter_put<1>\nh1*1 + idx*h2 mod bits\nseed=0, loop 1..num_hashes"]
    H -->|V2| J["gpu_bloom_filter_put<2>\nh1*INT32_MAX + h2 (64-bit)\nseed from header, loop 0..num_hashes-1"]
    I --> K["cuda::atomic_ref::fetch_or\ngpu_bit_to_word_mask with big-endian swizzle"]
    J --> K

    K --> L["bloom_filter_probe(input, filter)"]
    L --> M{version?}
    M -->|V1| N["bloom_probe_functor<1>\n32-bit hash, early exit on miss"]
    M -->|V2| O["bloom_probe_functor<2>\n64-bit hash, early exit on miss"]
    N --> P["Output: bool column\n(true = possibly in set)"]
    O --> P

    L2["bloom_filter_merge(filters)"] --> G2["unpack first filter\nvalidate all filters match header"]
    G2 --> R["bitwise-OR all bit arrays\n(thrust::transform)"]
    R --> P2["New merged filter scalar"]

_{Reviews (6): Last reviewed commit: "Merge remote-tracking branch 'origin/rel..." | Re-trigger Greptile}

greptile-apps

_{11 files reviewed, no comments}

_{Edit Code Review Agent Settings | Greptile}

mythrocks · 2026-02-24T18:31:41Z

Build

Signed-off-by: MithunR <mithunr@nvidia.com>

…erator

Signed-off-by: MithunR <mithunr@nvidia.com>

greptile-apps

_{13 files reviewed, 1 comment}

_{Edit Code Review Agent Settings | Greptile}

src/main/cpp/src/utilities/iterator.cuh

greptile-apps

_{13 files reviewed, 1 comment}

_{Edit Code Review Agent Settings | Greptile}

src/main/cpp/src/utilities/iterator.cuh

Signed-off-by: MithunR <mithunr@nvidia.com>

greptile-apps

_{13 files reviewed, no comments}

_{Edit Code Review Agent Settings | Greptile}

mythrocks · 2026-02-25T00:05:27Z

Build

…erator

greptile-apps

_{13 files reviewed, no comments}

_{Edit Code Review Agent Settings | Greptile}

mythrocks · 2026-02-25T18:32:49Z

Build

ttnghia

Please hold off a little bit. We need to discuss on mitigating the issue with code duplicates and unavoidable dependency from cudf detail namespace.

mythrocks · 2026-03-04T19:46:26Z

I think make_counting_transform_iterator is alright to have our own version of.

I think the pair-wise iterator should probably remain in CUDF. I'll check whether libcudf is accepting of exposing those utilities.

src/main/cpp/src/utilities/iterator.cuh

nvauto · 2026-03-16T02:04:00Z

NOTE: release/26.04 has been created from main. Please retarget your PR to release/26.04 if it should be included in the release.

These are a bridge too far. Pair-wise iterators can be consumed from cudf, eventually. Signed-off-by: MithunR <mithunr@nvidia.com>

…nsform-iterator

mythrocks · 2026-03-30T22:53:38Z

Build

mythrocks · 2026-03-30T22:54:47Z

This PR has been reduced in scope, to remove the pair-wise iterators changes. Those can be consumed from rapidsai/cudf, once they're made available for public consumption.

mythrocks · 2026-03-31T17:49:08Z

@ttnghia: Any objection to my merging this change?

ttnghia · 2026-03-31T18:30:27Z

@ttnghia: Any objection to my merging this change?

Now with AI assisted coding, I'm less likely reject code duplication although that is never a good way to me 😃

mythrocks · 2026-03-31T20:27:55Z

Thank you, chaps. This change has been merged.

On to the next.

mythrocks self-assigned this Feb 24, 2026

mythrocks changed the title ~~Decouple from cudf::detail::make_counting_transform_iterator~~ [WIP] Decouple from cudf::detail::make_counting_transform_iterator Feb 24, 2026

mythrocks added the tech debt label Feb 24, 2026

greptile-apps bot reviewed Feb 24, 2026

View reviewed changes

mythrocks changed the title ~~[WIP] Decouple from cudf::detail::make_counting_transform_iterator~~ Decouple from cudf::detail::make_counting_transform_iterator Feb 24, 2026

mythrocks added 3 commits February 24, 2026 14:38

Also cloned pairwise iterators.

3d1b068

Signed-off-by: MithunR <mithunr@nvidia.com>

Merge remote-tracking branch 'origin/main' into counting-transform-it…

f552d6b

…erator

Fixed formatting.

d39332e

Signed-off-by: MithunR <mithunr@nvidia.com>

greptile-apps bot reviewed Feb 24, 2026

View reviewed changes

src/main/cpp/src/utilities/iterator.cuh Outdated Show resolved Hide resolved

greptile-apps bot reviewed Feb 24, 2026

View reviewed changes

src/main/cpp/src/utilities/iterator.cuh Outdated Show resolved Hide resolved

Docs for the exception on null-check.

f8fdf19

Signed-off-by: MithunR <mithunr@nvidia.com>

greptile-apps bot reviewed Feb 25, 2026

View reviewed changes

Merge remote-tracking branch 'origin/main' into counting-transform-it…

e10722a

…erator

greptile-apps bot reviewed Feb 25, 2026

View reviewed changes

sameerz requested a review from a team February 28, 2026 00:15

pmattione-nvidia previously approved these changes Mar 2, 2026

View reviewed changes

ttnghia requested changes Mar 2, 2026

View reviewed changes

mythrocks commented Mar 4, 2026

View reviewed changes

src/main/cpp/src/utilities/iterator.cuh Outdated Show resolved Hide resolved

mythrocks mentioned this pull request Mar 7, 2026

[FEA] Expose the utilities from include/cudf/detail/iterators.cuh in the public API rapidsai/cudf#21701

Open

mythrocks added 2 commits March 30, 2026 15:19

Reverted changes for pair-wise iterators.

399fd39

These are a bridge too far. Pair-wise iterators can be consumed from cudf, eventually. Signed-off-by: MithunR <mithunr@nvidia.com>

Merge remote-tracking branch 'origin/release/26.04' into counting-tra…

e783140

…nsform-iterator

mythrocks dismissed pmattione-nvidia’s stale review via e783140 March 30, 2026 22:47

mythrocks changed the base branch from main to release/26.04 March 30, 2026 22:52

mythrocks requested review from pmattione-nvidia and ttnghia March 30, 2026 22:55

pmattione-nvidia approved these changes Mar 31, 2026

View reviewed changes

ttnghia approved these changes Mar 31, 2026

View reviewed changes

mythrocks merged commit 995224e into NVIDIA:release/26.04 Mar 31, 2026
2 checks passed

Conversation

mythrocks commented Feb 24, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

make_counting_transform_iterator

Uh oh!

greptile-apps bot commented Feb 24, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Greptile Summary

Confidence Score: 5/5

Important Files Changed

Flowchart

Uh oh!

greptile-apps bot left a comment

Choose a reason for hiding this comment

Uh oh!

mythrocks commented Feb 24, 2026

Uh oh!

greptile-apps bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

greptile-apps bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

greptile-apps bot left a comment

Choose a reason for hiding this comment

Uh oh!

mythrocks commented Feb 25, 2026

Uh oh!

greptile-apps bot left a comment

Choose a reason for hiding this comment

Uh oh!

mythrocks commented Feb 25, 2026

Uh oh!

ttnghia left a comment

Choose a reason for hiding this comment

Uh oh!

mythrocks commented Mar 4, 2026

Uh oh!

Uh oh!

nvauto commented Mar 16, 2026

Uh oh!

mythrocks commented Mar 30, 2026

Uh oh!

mythrocks commented Mar 30, 2026

Uh oh!

mythrocks commented Mar 31, 2026

Uh oh!

ttnghia commented Mar 31, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Uh oh!

mythrocks commented Mar 31, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

mythrocks commented Feb 24, 2026 •

edited

Loading

`make_counting_transform_iterator`

greptile-apps bot commented Feb 24, 2026 •

edited

Loading

ttnghia commented Mar 31, 2026 •

edited

Loading