
Add Databricks-17.3 support [databricks] #14360

Open
nartal1 wants to merge 52 commits into NVIDIA:main from nartal1:databricks_173_support

Conversation


@nartal1 nartal1 commented Mar 3, 2026

Contributes to #14015

Description

This PR adds support for Databricks-17.3. DBR-17.3 now compiles without any build failures. Several integration tests are still failing; those are tracked in a separate issue.
This PR:

  • Adds a new spark400db173 shim to support Databricks Runtime 17.3, which is based on Spark 4.0.0
  • Introduces build profile, shim dependencies, and version-specific source directories for DB-17.3

Key Changes

Build Infrastructure

  • New Maven profile release400db173 with spark.version=4.0.0-databricks-173
  • Updated Scala 2.13 enforcer regex to allow Databricks vendor builds (db suffix)
  • Made orc-format dependency conditional (DB-17.3 only) in the Databricks BOM — this artifact does not exist in earlier Databricks runtimes
  • Updated Jenkins CI scripts (build.sh, deploy.sh, install_deps.py, common_vars.sh) for DB-17.3 cluster support
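
The profile wiring might look roughly like this (a minimal sketch: the profile id `release400db173` and the `spark.version` value come from this PR's description; every other element and property name below is illustrative, not the actual pom.xml contents):

```xml
<!-- Hypothetical sketch only; property names besides spark.version are assumed. -->
<profile>
  <id>release400db173</id>
  <properties>
    <buildver>400db173</buildver>
    <spark.version>4.0.0-databricks-173</spark.version>
  </properties>
</profile>
```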

Shuffle API Changes

  • DB-17.3 adds a prismMapStatusEnabled parameter to ShuffleManager.getReader() (8-param vs 7-param signature)
  • New GpuShuffleExchangeExec for DB-17.3 with additional metrics (skew, spill-fallback, adaptive repartitioning) and new repartition() / adaptiveRepartitioningStatus() methods
  • New RapidsShuffleReaderShim and ShuffleManagerShims for the updated shuffle interfaces
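
The 7- vs 8-parameter split can be sketched with a version-dispatching shim (a minimal, self-contained sketch: the `ShuffleManagerShims`/`getReaderImpl` names follow the PR text, but all types and signatures below are stand-ins, not the real Databricks or plugin classes):

```scala
// Stand-ins for the two runtime APIs (illustrative, not Databricks' classes).
case class Reader(api: String)

object PreDb173Runtime {
  // 7-parameter signature found in earlier runtimes.
  def getReader(handle: AnyRef, startMap: Int, endMap: Int,
      startPart: Int, endPart: Int, ctx: AnyRef, metrics: AnyRef): Reader =
    Reader("7-param")
}

object Db173Runtime {
  // DB-17.3 adds prismMapStatusEnabled as an eighth parameter.
  def getReader(handle: AnyRef, startMap: Int, endMap: Int,
      startPart: Int, endPart: Int, ctx: AnyRef, metrics: AnyRef,
      prismMapStatusEnabled: Boolean): Reader =
    Reader("8-param")
}

// One stable signature for shared code; each per-version shim absorbs the difference.
trait ShuffleManagerShims {
  def getReaderImpl(handle: AnyRef, startMap: Int, endMap: Int,
      startPart: Int, endPart: Int, ctx: AnyRef, metrics: AnyRef): Reader
}

object PreDb173Shims extends ShuffleManagerShims {
  def getReaderImpl(h: AnyRef, sm: Int, em: Int, sp: Int, ep: Int,
      ctx: AnyRef, m: AnyRef): Reader =
    PreDb173Runtime.getReader(h, sm, em, sp, ep, ctx, m)
}

object Db173Shims extends ShuffleManagerShims {
  def getReaderImpl(h: AnyRef, sm: Int, em: Int, sp: Int, ep: Int,
      ctx: AnyRef, m: AnyRef): Reader =
    // Default the new flag here; a real shim would plumb the actual value through.
    Db173Runtime.getReader(h, sm, em, sp, ep, ctx, m, prismMapStatusEnabled = false)
}
```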

Adaptive Query Execution

  • ShuffleQueryStageExec and BroadcastQueryStageExec constructors now require an implicit adaptiveContext parameter in DB-17.3
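
The implicit-parameter pattern involved can be sketched generically (self-contained stand-ins below; DB-17.3's real adaptive context type and constructor shapes are internal and may differ):

```scala
// Stand-in for the implicit context the DB-17.3 constructors require (illustrative).
case class AdaptiveContext(id: Int)

// A DB-17.3-style class whose constructor takes an implicit parameter.
class ShuffleStageLike(plan: String)(implicit ctx: AdaptiveContext) {
  def describe: String = s"$plan@${ctx.id}"
}

object AqeShimSketch {
  // Shim code must have an implicit in scope (or pass one explicitly)
  // wherever such nodes are constructed.
  def build(): String = {
    implicit val ctx: AdaptiveContext = AdaptiveContext(1)
    new ShuffleStageLike("shuffle").describe
  }
}
```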

Expression and Subquery Changes

  • DynamicPruningExpression has a second parameter (dynamicPruningInfo) in DB-17.3
  • GpuScalarSubquery must implement resultUpdated() added to ExecSubqueryExpression
  • Expression shims updated for DB-17.3's nodePatternsInternal tree pattern matching (ShimGetArrayStructFields, ShimGetArrayItem, ShimGetStructField, GpuDeterministicFirstLastCollectShim)
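
The `resultUpdated()` requirement can be sketched the same way (stand-in trait below; Spark's real ExecSubqueryExpression is considerably more involved):

```scala
// Stand-in for the DB-17.3 base class that now declares resultUpdated()
// (illustrative; not Spark's actual ExecSubqueryExpression).
trait ExecSubqueryExpressionLike {
  def resultUpdated(): Boolean
}

// A GpuScalarSubquery-style implementation satisfying the new contract.
class GpuScalarSubquerySketch extends ExecSubqueryExpressionLike {
  private var updated = false
  def markUpdated(): Unit = { updated = true }
  override def resultUpdated(): Boolean = updated
}
```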

Streaming and Package Reorganization

  • FileStreamSink moved to org.apache.spark.sql.execution.streaming.sinks
  • MetadataLogFileIndex moved to org.apache.spark.sql.execution.streaming.runtime

Build

Log in to a Databricks-17.3 ML cluster.
Check out this branch (nartal1/databricks_173_support).
Run: ./jenkins/databricks/build.sh

Checklists

  • This PR has added documentation for new or modified features or behaviors.
  • This PR has added new tests or modified existing tests to cover new code paths.
    (Please explain in the PR description how the new code paths are tested, such as names of the new/existing tests that cover them.)
  • Performance testing has been performed and its results are added in the PR description. Or, an issue has been filed with a link in the PR description.

Signed-off-by: Niranjan Artal <nartal@nvidia.com>
- Share TryModeShim.scala (evalContext.evalMode handling)
- Share TimeAddShims.scala (TimeAdd->TimestampAddInterval rename)
- Both files moved to spark400db173 with updated metadata to support both versions
- Add spark400db173 to RoundShims and SparkStringUtilsShims metadata
- Fix ShowNamespacesExecShims API mismatch
- Share AggregateInPandasExecShims between spark400db173 and spark411
- Share FileStreamSinkShims between spark400db173 and spark411

Refactor getReader to getReaderImpl and use ShuffleManagerShims
to handle version-specific shuffle reader signatures.


nartal1 commented Mar 4, 2026

build


nartal1 commented Mar 6, 2026

build


nartal1 commented Mar 6, 2026

build


nartal1 commented Mar 6, 2026

build

@nartal1 nartal1 marked this pull request as ready for review March 7, 2026 01:36
@nartal1 nartal1 requested review from a team and gerashegalov March 7, 2026 01:36

nartal1 commented Mar 10, 2026

build

@nartal1 nartal1 requested a review from jihoonson March 10, 2026 17:41
jihoonson
jihoonson previously approved these changes Mar 10, 2026

@jihoonson jihoonson left a comment


LGTM. Thanks @nartal1

```shell
SERVER_ID='snapshots'
SERVER_URL="$URM_URL-local"
SCALA_VERSION=`mvn help:evaluate -q -pl dist -Dexpression=scala.binary.version -DforceStdout`
SCALA_VERSION=`mvn help:evaluate -q -f $POM_FILE -pl dist -Dexpression=scala.binary.version -DforceStdout`
```
Collaborator

Define a standard way of detecting versions and source this across scripts. Why is this different from build.sh? Can we get it from some build info file in the DBR image?

Collaborator Author

Updated it to be similar to build.sh. I couldn't figure this out from any files in the DBR image.

Comment on lines +95 to +97
```python
# Spark 3.x versions
deps += [Artifact('org.apache.hive', 'hive-metastore-client-patched',
                  f'{spark_prefix}--patched-hive-with-glue--hive-*-patch-{spark_suffix}_deploy.jar')]
```
Collaborator


Make the indentation consistent at least in the code you are adding

Collaborator Author


Updated.


```scala
/**
 * Databricks 17.3 version where getRuntimeStatistics has compile-time access restrictions.
 * The method exists and is public at runtime, but compile-time metadata shows it as protected,
```
Collaborator


Sounds like Scala protected compiles to JVM public. Can we not then make it a Java class and avoid reflection?

Collaborator Author


Thanks for the pointer. Updated it.

```scala
def repartition(numPartitions: Int, updatedRepartitioningStatus: AdaptiveRepartitioningStatus):
    ShuffleExchangeLike = {
  val newCpuPartitioning = cpuOutputPartitioning.withNewNumPartitions(numPartitions)
  copy(gpuOutputPartitioning, child, shuffleOrigin)(newCpuPartitioning)
```
Collaborator


Do we need to pass updatedRepartitioningStatus around here somewhere?

Collaborator Author


Updated it to pass updatedRepartitioningStatus. Earlier it was always the default value.

Comment on lines +28 to +29
```scala
// Databricks 17.3: Use jackson.Serialization instead of JsonMethods
import org.json4s.jackson.Serialization
```
Collaborator


Can we shim just this aspect instead of copy-and-pasting 650 lines ?

Collaborator Author


Added a shim for this. I missed it earlier since it is in the tests module: I initially updated it just to compile, but then forgot to shim it.

```
{"spark": "358"}
{"spark": "400"}
{"spark": "401"}
{"spark": "402"}
```
Collaborator


Getting lost in the tags. Which StreamingShims will cover 410,411?

Collaborator Author


This is dead code at this point; it is handled in FileStreamSinkShims (411/FileStreamSinkShims). Added a 400db173 shim to it.

nartal1 added 5 commits March 11, 2026 18:24

nartal1 commented Mar 11, 2026

build

```shell
# and Databricks 15.4 are both based on spark version 3.5.0
BUILDVER="$BUILDVER$DBR_VER"
SPARK_VERSION_TO_INSTALL_DATABRICKS_JARS="$SPARK_VERSION_TO_INSTALL_DATABRICKS_JARS-$DBR_VER"
elif [ $DBR_VER == '17.3' ]; then
```
Collaborator


NIT: better to merge 14.3 and 17.3 together, e.g. `if [ $DBR_VER == '14.3' ] || [ $DBR_VER == '17.3' ]; then ...`
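
The suggested merge could look like this (a hedged sketch: `DBR_VER` and `BUILDVER` come from the script excerpt above, but the base value and suffix mapping here are illustrative, not the script's actual logic):

```shell
#!/usr/bin/env bash
# Illustrative only: derive a buildver suffix for both supported DBR versions
# in a single branch, instead of separate if/elif arms.
DBR_VER='17.3'
BASE_BUILDVER='400'   # assumed base for the 17.3 case; 14.3 would use a different base
if [ "$DBR_VER" == '14.3' ] || [ "$DBR_VER" == '17.3' ]; then
  # Strip the dot: 17.3 -> 173, giving e.g. buildver 400db173
  BUILDVER="${BASE_BUILDVER}db${DBR_VER//./}"
fi
echo "$BUILDVER"
```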

```shell
SCALA_VERSION=`mvn help:evaluate -q -pl dist -Dexpression=scala.binary.version -DforceStdout`
# Determine Scala version from Spark version: Spark 4.x uses Scala 2.13, earlier uses 2.12
if [[ "$BASE_SPARK_VERSION_TO_INSTALL_DATABRICKS_JARS" == 4.* ]]; then
    SCALA_VERSION="2.13"
```
Collaborator


We should also define POM_FILE here for deploy.sh, like jenkins/databricks/build.sh does.

