Replace cudf::detail::valid_if with cudf::bools_to_mask #4301

Open
mythrocks wants to merge 7 commits into NVIDIA:release/26.04 from mythrocks:valid_if-to-bools_to_mask

@mythrocks
Collaborator

This commit is part of the continuing effort to reduce the dependency of spark-rapids-jni on cudf::detail APIs. In this commit, some of the references to cudf::detail::valid_if are replaced with cudf::bools_to_mask.

The functionality should not be altered. Existing tests ought to cover the changes.

This commit is part of the continuing effort to reduce the dependency of
spark-rapids-jni on `cudf::detail` APIs.  In this commit, some of the
references to `cudf::detail::valid_if` are replaced with `cudf::bools_to_mask`.

The functionality should not be altered.  Existing tests ought to cover
the changes.

Signed-off-by: MithunR <mithunr@nvidia.com>
@mythrocks mythrocks self-assigned this Feb 20, 2026
@mythrocks mythrocks marked this pull request as draft February 20, 2026 23:48
@greptile-apps
Contributor

greptile-apps bot commented Feb 20, 2026

Greptile Summary

This PR replaces cudf::detail::valid_if with the public cudf::bools_to_mask API across ~10 files and introduces a new internal spark_rapids_jni::util::make_counting_transform_iterator helper to avoid cudf::detail::iterator.cuh. It also bumps the cudf submodule to release/26.04. Beyond the stated refactor it adds a substantial new feature: Bloom filter V2 support matching Spark's BloomFilterImplV2 (16-byte header with seed, 64-bit hash indexing, configurable seed end-to-end through JNI/Java), with comprehensive V2 tests in both C++ and Java.

  • All bools_to_mask call sites correctly handle the return-type change from rmm::device_buffer to std::unique_ptr<rmm::device_buffer> via std::move(*ptr.release()).
  • The V2 hash algorithm faithfully mirrors the Spark reference, including the h1 * INT32_MAX initialisation and big-endian buffer swizzle.
  • unpack_bloom_filter safely reads a variable-width header and validates both version and header size before accessing version-specific fields.
  • One P2 nit: create_random_null_mask in benchmark code uses thrust::device (implicit stream) for tabulate while the rest of the file uses explicit stream-tied policies.
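The return-type handling described in the first bullet can be illustrated with standard-library types only. This is a minimal sketch, not code from the PR: `buffer`, `make_mask`, `take_by_value`, and the `live_buffers` counter are hypothetical stand-ins (for `rmm::device_buffer` and an API that returns a `std::unique_ptr`). It uses the `std::move(*ptr)` form, where the `unique_ptr` keeps ownership and deletes the moved-from object when it goes out of scope.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Number of `buffer` objects currently alive (illustration only).
int live_buffers = 0;

// Hypothetical stand-in for rmm::device_buffer: a move-only owner of bytes.
struct buffer {
  std::vector<unsigned char> bytes;
  explicit buffer(std::size_t n) : bytes(n) { ++live_buffers; }
  buffer(buffer&& other) noexcept : bytes(std::move(other.bytes)) { ++live_buffers; }
  buffer(buffer const&)            = delete;
  buffer& operator=(buffer const&) = delete;
  ~buffer() { --live_buffers; }
};

// Hypothetical stand-in for an API that returns its result behind a unique_ptr.
std::unique_ptr<buffer> make_mask(std::size_t n) { return std::make_unique<buffer>(n); }

// Move the pointee out by value. `ptr` still owns the moved-from object and
// deletes it on scope exit, so no heap object is left behind.
buffer take_by_value(std::unique_ptr<buffer> ptr)
{
  assert(ptr != nullptr);
  return std::move(*ptr);
}
```

As general C++ behaviour, `release()` instead hands ownership to the caller, so after `std::move(*ptr.release())` nothing deletes the moved-from object unless the caller does so explicitly.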

Confidence Score: 5/5

Safe to merge; only remaining finding is a P2 stream-policy nit in benchmark code.

The bools_to_mask migration is semantically correct at all call sites and return-type changes are handled properly. The V2 bloom filter feature matches the Spark reference and is well-tested. The single open finding is a stream inconsistency in non-production benchmark code.

src/main/cpp/benchmarks/common/generate_input.cu — stream policy inconsistency in create_random_null_mask.

Important Files Changed

| Filename | Overview |
|---|---|
| src/main/cpp/src/bloom_filter.cu | Adds V2 bloom filter support (64-bit hash indexing, seed, new 16-byte header), replaces atomicOr with cuda::atomic_ref, dispatches correct kernel/functor at runtime. |
| src/main/cpp/src/bloom_filter.hpp | Adds versioned header structs, version constants, and bloom_filter_create signature extended with version and seed. |
| src/main/cpp/src/BloomFilterJni.cpp | JNI entry updated for version/seed params; adds explicit bloomFilterBits bounds check matching Spark BitArray constraint. |
| src/main/java/com/nvidia/spark/rapids/jni/BloomFilter.java | Old two-argument create deprecated; new create(version,numHashes,bloomFilterBits,seed) added; native signature updated. |
| src/main/cpp/src/utilities/iterator.cuh | New utility replacing cudf::detail::make_counting_transform_iterator with cuda::counting_iterator-based equivalent. |
| src/main/cpp/benchmarks/common/generate_input.cu | All valid_if calls replaced with bools_to_mask; create_random_null_mask uses thrust::device inconsistently with the rest of the file. |
| src/main/cpp/src/from_json_to_raw_map.cu | valid_if replaced with thrust::transform + bools_to_mask; logical-not inversion of should_be_nullified preserved correctly. |
| src/main/cpp/src/get_json_object.cu | valid_if replaced with bools_to_mask; counting_transform_iterator switched to internal util. |
| src/main/cpp/src/row_conversion.cu | Removes cudf::detail::iterator.cuh dependency; util:: namespace calls fully qualified as cudf::util::. |
| src/main/cpp/tests/bloom_filter.cu | All V1 tests ported with explicit version constant; comprehensive V2 tests added. |
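The idea behind a counting transform iterator, as referenced for `src/main/cpp/src/utilities/iterator.cuh`, can be sketched in plain host C++. This is an assumption-laden illustration rather than the PR's helper: the summary says the real utility builds on `cuda::counting_iterator`, whereas the hypothetical `counting_transform_iterator` class and `make_counting_transform_iterator_sketch` function below are hand-rolled so the example has no CUDA dependency.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <utility>

// Hypothetical sketch: an input iterator over f(start), f(start + 1), ...
// No storage is allocated; each value is computed on dereference.
template <typename Function>
class counting_transform_iterator {
 public:
  using value_type        = decltype(std::declval<Function const&>()(std::size_t{}));
  using difference_type   = std::ptrdiff_t;
  using reference         = value_type;
  using pointer           = void;
  using iterator_category = std::input_iterator_tag;

  counting_transform_iterator(std::size_t index, Function f) : index_{index}, f_{std::move(f)} {}

  value_type operator*() const { return f_(index_); }
  value_type operator[](difference_type n) const { return f_(index_ + n); }
  counting_transform_iterator& operator++()
  {
    ++index_;
    return *this;
  }
  counting_transform_iterator operator+(difference_type n) const { return {index_ + n, f_}; }
  bool operator==(counting_transform_iterator const& o) const { return index_ == o.index_; }
  bool operator!=(counting_transform_iterator const& o) const { return index_ != o.index_; }

 private:
  std::size_t index_;  // current position in the counting sequence
  Function f_;         // transform applied to the index
};

// Hypothetical factory mirroring the shape of make_counting_transform_iterator.
template <typename Function>
counting_transform_iterator<Function> make_counting_transform_iterator_sketch(std::size_t start,
                                                                              Function f)
{
  return counting_transform_iterator<Function>{start, std::move(f)};
}
```

An iterator of this kind lets a caller describe "row i maps to f(i)" lazily, which is how the validity predicates in this PR are fed to the mask-building code.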

Reviews (6): Last reviewed commit: "Merge remote-tracking branch 'origin/rel..."

Contributor

@greptile-apps greptile-apps bot left a comment

2 files reviewed, no comments

@mythrocks
Collaborator Author

Build

This change is more controversial.  The only way to get away from using
`cudf::detail::valid_if` in the files modified here is to materialize
a temporary bool vector (that is then packed).

Signed-off-by: MithunR <mithunr@nvidia.com>
@mythrocks
Collaborator Author

Build

@mythrocks
Collaborator Author

c3f2550 is slightly controversial; the only way to stop using cudf::detail::valid_if is to materialize the boolean vector before packing it down to a null mask.

There might be value in requesting a cudf::valid_if for this case if the performance hit is too steep.
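The two-step approach described here (materialize the booleans, then pack them into a null mask) can be sketched on the host in standard C++. This is only an illustration under stated assumptions: `bitmask_word`, `pack_bools`, and `two_step_mask` are hypothetical names, the code runs on the CPU rather than through `thrust::transform` and `cudf::bools_to_mask`, and it assumes a 32-bit mask word with row `i` stored in bit `i % 32` of word `i / 32`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

using bitmask_word = std::uint32_t;  // assumed layout: one bit per row, 32 rows per word

// Step 2: pack a materialized bool vector into a bitmask (bit set = row valid).
// Returns the mask and the number of null (unset) rows.
std::pair<std::vector<bitmask_word>, std::size_t> pack_bools(std::vector<bool> const& valids)
{
  std::vector<bitmask_word> mask((valids.size() + 31) / 32, 0u);
  std::size_t null_count = 0;
  for (std::size_t i = 0; i < valids.size(); ++i) {
    if (valids[i]) {
      mask[i / 32] |= bitmask_word{1} << (i % 32);
    } else {
      ++null_count;
    }
  }
  return {std::move(mask), null_count};
}

// Step 1 followed by step 2: validity is the logical NOT of
// `should_be_nullified`, written to a temporary vector before packing.
std::pair<std::vector<bitmask_word>, std::size_t> two_step_mask(
  std::vector<bool> const& should_be_nullified)
{
  std::vector<bool> valids(should_be_nullified.size());  // the extra temporary
  for (std::size_t i = 0; i < valids.size(); ++i) { valids[i] = !should_be_nullified[i]; }
  return pack_bools(valids);
}
```

The temporary `valids` vector is the cost under discussion: it means one extra allocation plus one extra pass over the rows compared with evaluating the predicate while packing.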

@mythrocks mythrocks marked this pull request as ready for review February 24, 2026 18:30
@mythrocks mythrocks changed the title [WIP] Replace cudf::detail::valid_if with cudf::bools_to_mask Replace cudf::detail::valid_if with cudf::bools_to_mask Feb 24, 2026
Contributor

@greptile-apps greptile-apps bot left a comment

4 files reviewed, no comments

Contributor

@greptile-apps greptile-apps bot left a comment

4 files reviewed, 1 comment

Signed-off-by: MithunR <mithunr@nvidia.com>
Contributor

@greptile-apps greptile-apps bot left a comment

4 files reviewed, no comments

@mythrocks
Collaborator Author

Build

@mythrocks
Collaborator Author

Build

Contributor

@greptile-apps greptile-apps bot left a comment

4 files reviewed, no comments

@sameerz sameerz requested a review from a team February 28, 2026 00:15
Collaborator

@ttnghia ttnghia left a comment

Please hold off a little bit. We need to discuss how to mitigate the issues of code duplication and the unavoidable dependency on the cudf detail namespace. We should also avoid any performance impact from doing this.

@nvauto
Collaborator

nvauto commented Mar 16, 2026

NOTE: release/26.04 has been created from main. Please retarget your PR to release/26.04 if it should be included in the release.

@mythrocks mythrocks changed the base branch from main to release/26.04 March 31, 2026 20:58
@mythrocks mythrocks dismissed pmattione-nvidia’s stale review March 31, 2026 20:58

The base branch was changed.

@mythrocks
Collaborator Author

Build

-  auto [null_mask, null_count] = cudf::detail::valid_if(
-    valid_it, valid_it + should_be_nullified->size(), thrust::logical_not<bool>{}, stream, mr);
-  return {null_count > 0 ? std::move(null_mask) : rmm::device_buffer{0, stream, mr}, null_count};
+  rmm::device_uvector<bool> valids(should_be_nullified->size(), stream);
Collaborator

Can we directly write to something bit-packed instead, to avoid the extra work from calling bools_to_mask?

Collaborator

This is exactly what the valid_if kernel is doing. You are suggesting reimplementing the valid_if kernel 😄
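For readers following this thread, writing directly to a bit-packed mask means evaluating the predicate per row and setting that row's bit in the same pass, with no intermediate bool buffer. The sketch below is a hypothetical host-side analogue and not cudf's kernel: `mask_word` and `valid_if_sketch` are illustrative names, and the 32-bit word layout (row `i` in bit `i % 32` of word `i / 32`) is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

using mask_word = std::uint32_t;  // assumed layout: one bit per row, 32 rows per word

// Evaluates `pred(row)` for every row and sets the row's bit directly.
// Returns the mask and the number of null (unset) rows.
template <typename Predicate>
std::pair<std::vector<mask_word>, std::size_t> valid_if_sketch(std::size_t num_rows,
                                                               Predicate pred)
{
  std::vector<mask_word> mask((num_rows + 31) / 32, 0u);
  std::size_t null_count = 0;
  for (std::size_t word = 0; word < mask.size(); ++word) {
    mask_word bits = 0;  // accumulate one full word before storing it
    for (std::size_t bit = 0; bit < 32; ++bit) {
      std::size_t const row = word * 32 + bit;
      if (row >= num_rows) { break; }  // last word may be partially filled
      if (pred(row)) {
        bits |= mask_word{1} << bit;
      } else {
        ++null_count;
      }
    }
    mask[word] = bits;
  }
  return {std::move(mask), null_count};
}
```

Called with a predicate such as `[&](std::size_t i) { return !should_be_nullified[i]; }`, this produces the mask and null count in a single pass, which is the behaviour a local copy of `valid_if` would preserve.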

-  auto [null_mask, null_count] = cudf::detail::valid_if(
-    valid_it, valid_it + should_be_nullified->size(), thrust::logical_not<bool>{}, stream, mr);
-  return {null_count > 0 ? std::move(null_mask) : rmm::device_buffer{0, stream, mr}, null_count};
+  rmm::device_uvector<bool> valids(should_be_nullified->size(), stream);
Collaborator

In case we are no longer able to use cudf::detail::valid_if, I am now OK with making a copy of valid_if inside spark-rapids-jni so we can just call it, similar to what we have done with make_counting_transform_iterator.

Collaborator

If cudf continues to bar downstream libraries/applications from using its detail utilities, this is the trend we will unavoidably have to follow, unfortunately. Many more things would have to be copied very soon.

Collaborator Author

> OK to make a copy of valid_if inside spark-rapids-jni...

If it's going to come to that, then I might punt this change out of 26.04.

I'm neither keen on nor proud of having to replicate CUDF kernels here. But our hand might be forced. CUDF has made it quite clear that cudf::detail::valid_if() is not a candidate for exposure/consumption from the public API.

Making a custom copy might be best for efficiency, albeit at the cost of maintenance.

4 participants