Fix compile break from BloomFilter.create deprecation #14468

Merged
mythrocks merged 1 commit into NVIDIA:main from mythrocks:neutral-fix-bloom
Mar 25, 2026

Conversation

@mythrocks
Collaborator

@mythrocks mythrocks commented Mar 25, 2026

Fixes #14462.

Description

This change fixes the build break in spark-rapids caused by the deprecation of BloomFilter.create(int,int) in spark-rapids-jni, introduced in NVIDIA/spark-rapids-jni#4360.

This is a stop-gap solution that only restores prior behaviour, i.e. support for the BloomFilter v1 binary format.

Actual support for the BloomFilter v2 format will follow in #14406.

Checklists

  • This PR has added documentation for new or modified features or behaviors.
  • This PR has added new tests or modified existing tests to cover new code paths.
    (Please explain in the PR description how the new code paths are tested, such as names of the new/existing tests that cover them.)
  • Performance testing has been performed and its results are added in the PR description. Or, an issue has been filed with a link in the PR description.

Signed-off-by: MithunR <mithunr@nvidia.com>
@mythrocks mythrocks self-assigned this Mar 25, 2026
@mythrocks mythrocks closed this Mar 25, 2026
@greptile-apps
Contributor

greptile-apps bot commented Mar 25, 2026

Greptile Summary

This PR is a minimal stop-gap fix that resolves a compile break introduced when spark-rapids-jni deprecated the 2-argument BloomFilter.create(int, int) overload. The single change expands the call-site to the new 4-argument form BloomFilter.create(BloomFilter.VERSION_1, numHashes, numBits, BloomFilter.DEFAULT_SEED), explicitly pinning v1 behaviour and the default seed to preserve backward compatibility.

  • Only file changed: GpuBloomFilterAggregate.scala — one call-site inside GpuBloomFilterUpdate.reductionAggregate.
  • Intentional scope-limiting: A // TODO comment is included pointing to PR #14406 (BloomFilter v2 support [databricks]) for proper v2 support; v2 is deliberately out of scope here.
  • initialValues vs defaultValues: GpuBloomFilterAggregate overrides initialValues on line 80. The custom rule recommends using defaultValues (returning Array[GpuScalar]) for GPU UDAFs where initialValues would throw. However, this class extends the GpuAggregateFunction trait, which defines initialValues as a valid abstract val (not throwing), and there is currently no defaultValues API in the codebase. No immediate action is required, but this should be revisited when the defaultValues API lands.
  • No test additions: Consistent with the stated goal of restoring prior behaviour only; the existing bloom-filter tests cover the v1 path.
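
The call-site change summarised above can be sketched as follows. This is only an illustration under stated assumptions: `BloomFilterStub`, `FilterSpec`, and `CallSiteSketch` are hypothetical stand-ins so the snippet compiles without spark-rapids-jni, the constant values are placeholders, and the real `BloomFilter.create` returns a cuDF scalar rather than a case class.

```scala
// Hypothetical stand-in for the spark-rapids-jni BloomFilter class.
// Names mirror the review summary; constant values are assumptions.
final case class FilterSpec(version: Int, numHashes: Int, numBits: Int, seed: Int)

object BloomFilterStub {
  val VERSION_1: Int = 1    // assumed value, for illustration only
  val DEFAULT_SEED: Int = 0 // assumed value, for illustration only

  // Stand-in for the 4-argument form named in the review summary.
  def create(version: Int, numHashes: Int, numBits: Int, seed: Int): FilterSpec = {
    require(numHashes > 0, "numHashes must be positive")
    require(numBits > 0, "numBits must be positive")
    FilterSpec(version, numHashes, numBits, seed)
  }

  // Stand-in for the deprecated 2-argument overload. Calling it from a build
  // that treats deprecation warnings as errors would fail to compile.
  @deprecated("use create(version, numHashes, numBits, seed)", "stub")
  def create(numHashes: Int, numBits: Int): FilterSpec =
    create(VERSION_1, numHashes, numBits, DEFAULT_SEED)
}

object CallSiteSketch {
  // After the fix: the version and seed are pinned explicitly, so the
  // resulting filter matches what the 2-argument call used to produce.
  def newFilter(numHashes: Int, numBits: Int): FilterSpec =
    BloomFilterStub.create(
      BloomFilterStub.VERSION_1, numHashes, numBits, BloomFilterStub.DEFAULT_SEED)
}
```

Pinning both arguments at the call site keeps the v1 choice visible in the plugin code, which is where the follow-up v2 work would presumably change it.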

Confidence Score: 5/5

  • Safe to merge; the change is a minimal, targeted fix that restores prior v1 bloom-filter behaviour and is explicitly scoped as a stop-gap.
  • The diff is 3 effective lines: a comment and expanding one function call to its non-deprecated 4-argument form with semantically equivalent arguments (VERSION_1, DEFAULT_SEED). The author documents the intent clearly via a TODO comment. No logic changes, no new code paths, and pre-existing tests cover v1 behaviour. The custom initialValues rule does not currently apply because the GpuAggregateFunction trait defines it as a non-throwing abstract val and no defaultValues API exists yet.
  • No files require special attention.

Important Files Changed

Filename: sql-plugin/src/main/spark330/scala/org/apache/spark/sql/rapids/aggregate/GpuBloomFilterAggregate.scala
Overview: Replaces deprecated 2-arg BloomFilter.create with the 4-arg API (VERSION_1, numHashes, numBits, DEFAULT_SEED) to fix the compile break; pre-existing initialValues override is in scope for the custom rule but no defaultValues API exists in the codebase yet.

Sequence Diagram

sequenceDiagram
    participant Spark as Spark Aggregate Exec
    participant Update as GpuBloomFilterUpdate
    participant JNI as spark-rapids-jni BloomFilter
    participant Merge as GpuBloomFilterMerge

    Spark->>Update: reductionAggregate(col: ColumnVector)
    Update->>JNI: BloomFilter.create(VERSION_1, numHashes, numBits, DEFAULT_SEED)
    JNI-->>Update: bloomFilter (Scalar)
    Update->>JNI: BloomFilter.put(bloomFilter, col)
    JNI-->>Update: updated bloomFilter
    Update-->>Spark: bloomFilter Scalar (binary)

    Spark->>Merge: reductionAggregate(col: ColumnVector)
    alt all nulls
        Merge-->>Spark: Scalar.listFromNull(UINT8)
    else some nulls
        Merge->>Merge: filter nulls via Table
        Merge->>JNI: BloomFilter.merge(filtered column)
        JNI-->>Merge: merged bloomFilter
        Merge-->>Spark: merged bloomFilter
    else no nulls
        Merge->>JNI: BloomFilter.merge(col)
        JNI-->>Merge: merged bloomFilter
        Merge-->>Spark: merged bloomFilter
    end
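
The three merge branches in the diagram can be approximated on plain Scala collections. This is a sketch, not the plugin's implementation: `MergeSketch` is a hypothetical name, `Option[Array[Byte]]` stands in for a nullable row of a cuDF column, and the bitwise-OR union assumes all filters were built with identical parameters.

```scala
// Each row stands in for one serialized bloom filter; None models a null row.
object MergeSketch {
  type Bits = Array[Byte]

  // Bitwise OR of two equally sized bit arrays, the usual way to union
  // bloom filters that share the same size and hash configuration.
  private def or(a: Bits, b: Bits): Bits = {
    require(a.length == b.length, "filters must have the same size")
    a.zip(b).map { case (x, y) => (x | y).toByte }
  }

  def merge(rows: Seq[Option[Bits]]): Option[Bits] = {
    // Drops null rows; this is a no-op when the input has no nulls.
    val nonNull = rows.flatten
    if (nonNull.isEmpty) None // all nulls: the result is itself null
    else Some(nonNull.reduce(or))
  }
}
```

Here the "some nulls" and "no nulls" branches collapse into one path because filtering an input without nulls changes nothing; the diagram keeps them separate, presumably to avoid building a filtered table when it is not needed.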

Reviews (2): Last reviewed commit: "Fix compile break from BloomFilter.creat..."

@mythrocks mythrocks reopened this Mar 25, 2026
@abellina
Collaborator

build

@mythrocks mythrocks merged commit 768fc79 into NVIDIA:main Mar 25, 2026
55 of 63 checks passed

Development

Successfully merging this pull request may close these issues.

[BUG] Scala compilation fails: BloomFilter.create() deprecated method treated as fatal error in GpuBloomFilterAggregate

4 participants