Skip to content

Adapt remaining SQL plugin callers to Java helpers#15034

Open
gerashegalov wants to merge 2 commits into
codex/unshim-stack-02p-sql-rapids-callersfrom
codex/unshim-stack-02q-remaining-callers
Open

Adapt remaining SQL plugin callers to Java helpers#15034
gerashegalov wants to merge 2 commits into
codex/unshim-stack-02p-sql-rapids-callersfrom
codex/unshim-stack-02q-remaining-callers

Conversation

@gerashegalov

@gerashegalov gerashegalov commented Jun 10, 2026

Copy link
Copy Markdown
Collaborator

Related to #14834.

Description

This PR is one reviewable layer in the unshim stack introduced by #15025. It updates the remaining SQL plugin callers to use the moved Java helpers. This closes out the caller migration before the common shim-helper cleanup begins.

Stack context

Testing and validation notes

  • No standalone behavior change is intended in this layer. It is covered by the full-stack packaging/build validation described in Add default common unshim packaging flow #15025 and the existing tests for the affected subsystem.
  • The full split stack was verified to be tree-equivalent to the pre-split stack top.

Checklists

Documentation

  • Updated for new or modified user-facing features or behaviors
  • No user-facing change

Testing

  • Added or modified tests to cover new code paths
  • Covered by existing tests
    (Covered by the validation notes in the PR description.)
  • Not required

Performance

  • Tests ran and results are added in the PR description
  • Issue filed with a link in the PR description
  • Not required

@gerashegalov gerashegalov force-pushed the codex/unshim-stack-02q-remaining-callers branch from b7a2605 to f6e6ac4 Compare June 10, 2026 20:49
@gerashegalov gerashegalov force-pushed the codex/unshim-stack-02p-sql-rapids-callers branch 2 times, most recently from e895b00 to f061f44 Compare June 10, 2026 21:13
@gerashegalov gerashegalov force-pushed the codex/unshim-stack-02q-remaining-callers branch from f6e6ac4 to 34804ee Compare June 10, 2026 21:13
@gerashegalov gerashegalov force-pushed the codex/unshim-stack-02p-sql-rapids-callers branch from f061f44 to 6f4f246 Compare June 10, 2026 21:32
@gerashegalov gerashegalov force-pushed the codex/unshim-stack-02q-remaining-callers branch 2 times, most recently from 19f0ac5 to bd7c6fb Compare June 10, 2026 21:36
@gerashegalov gerashegalov force-pushed the codex/unshim-stack-02p-sql-rapids-callers branch from 6f4f246 to 5f9363a Compare June 10, 2026 21:36
@gerashegalov gerashegalov force-pushed the codex/unshim-stack-02q-remaining-callers branch from bd7c6fb to 86e001e Compare June 10, 2026 22:20
@gerashegalov gerashegalov force-pushed the codex/unshim-stack-02p-sql-rapids-callers branch 2 times, most recently from 1457c07 to 64a238d Compare June 10, 2026 22:37
@gerashegalov gerashegalov force-pushed the codex/unshim-stack-02q-remaining-callers branch 2 times, most recently from 505e2e5 to 7cc1e13 Compare June 10, 2026 22:41
@gerashegalov gerashegalov force-pushed the codex/unshim-stack-02p-sql-rapids-callers branch from 1f90438 to 6011c26 Compare June 10, 2026 22:46
@gerashegalov gerashegalov force-pushed the codex/unshim-stack-02q-remaining-callers branch from 7cc1e13 to 0ae48f8 Compare June 10, 2026 22:46
@gerashegalov gerashegalov force-pushed the codex/unshim-stack-02p-sql-rapids-callers branch from 6011c26 to fea9cb1 Compare June 10, 2026 23:12
@gerashegalov gerashegalov force-pushed the codex/unshim-stack-02q-remaining-callers branch from 0ae48f8 to 4356413 Compare June 10, 2026 23:12
@gerashegalov gerashegalov force-pushed the codex/unshim-stack-02p-sql-rapids-callers branch from fea9cb1 to 7160eb5 Compare June 10, 2026 23:15
@gerashegalov gerashegalov force-pushed the codex/unshim-stack-02q-remaining-callers branch 2 times, most recently from 179e067 to 81b157f Compare June 10, 2026 23:33
@gerashegalov gerashegalov force-pushed the codex/unshim-stack-02p-sql-rapids-callers branch from 7160eb5 to 3c8769b Compare June 10, 2026 23:33
@gerashegalov gerashegalov force-pushed the codex/unshim-stack-02q-remaining-callers branch 2 times, most recently from 31fb4c7 to a9e05a6 Compare June 10, 2026 23:59
@gerashegalov gerashegalov force-pushed the codex/unshim-stack-02p-sql-rapids-callers branch 2 times, most recently from 7bbdcc0 to d35c3a9 Compare June 11, 2026 00:25
@gerashegalov gerashegalov force-pushed the codex/unshim-stack-02q-remaining-callers branch from a9e05a6 to 2db5bfe Compare June 11, 2026 00:25
@gerashegalov gerashegalov force-pushed the codex/unshim-stack-02p-sql-rapids-callers branch from d35c3a9 to cf518aa Compare June 11, 2026 00:37
@gerashegalov gerashegalov force-pushed the codex/unshim-stack-02q-remaining-callers branch from 2db5bfe to 084c315 Compare June 11, 2026 00:37
@gerashegalov gerashegalov force-pushed the codex/unshim-stack-02p-sql-rapids-callers branch from cf518aa to f714012 Compare June 11, 2026 00:51
@gerashegalov gerashegalov force-pushed the codex/unshim-stack-02q-remaining-callers branch 2 times, most recently from 7440c9d to 32c9b07 Compare June 11, 2026 01:18
@gerashegalov gerashegalov force-pushed the codex/unshim-stack-02p-sql-rapids-callers branch from 63a0477 to 4f0fb18 Compare June 11, 2026 01:32
@gerashegalov gerashegalov force-pushed the codex/unshim-stack-02q-remaining-callers branch 2 times, most recently from 3987773 to 0202d1d Compare June 11, 2026 01:43
@gerashegalov gerashegalov force-pushed the codex/unshim-stack-02p-sql-rapids-callers branch from e69ce28 to bc1ad84 Compare June 11, 2026 01:58
@gerashegalov gerashegalov force-pushed the codex/unshim-stack-02q-remaining-callers branch 3 times, most recently from cd6b7b9 to d158664 Compare June 13, 2026 12:13
@gerashegalov gerashegalov force-pushed the codex/unshim-stack-02p-sql-rapids-callers branch from 3848638 to e505feb Compare June 13, 2026 12:20
@gerashegalov gerashegalov force-pushed the codex/unshim-stack-02q-remaining-callers branch from d158664 to 11f74bf Compare June 13, 2026 12:20
@gerashegalov gerashegalov marked this pull request as ready for review June 13, 2026 12:49
@greptile-apps

greptile-apps Bot commented Jun 13, 2026

Copy link
Copy Markdown
Contributor

Greptile Summary

This PR is one layer of the unshim stack: it migrates the remaining SQL plugin callers off Spark's private Logging mixin (replaced with direct slf4j.LoggerFactory or the project's own RapidsLocalLog) and off Scala case class definitions that blocked cross-module Java interop (replaced with plain class + explicit val fields and Serializable).

  • All ProfileMsg subclasses, LoreRDD*, AggAndReplace, BoundGpuWindowFunction, ParsedBoundary, GpuUnboundedToUnboundedAggStages, and the window exec helper classes are migrated from case class to class; pattern-match sites updated to use class-based matching with field accessors.
  • DateTimeUtilsShims.currentTimestamp is inlined as a local Instant.now() computation, and the hasAliasQuoteFix shim branch in GpuAlias.sql is removed since every supported Spark version returns true for that flag.
  • ParquetCachedBatchSerializer adds the Buffer cast for ByteBuffer.position() to restore Java 8 / 9+ cross-version compatibility.

Confidence Score: 4/5

  • Safe to review in the stack context; the only actionable concern is that GpuLiteralShim is referenced but not yet defined, so this layer cannot compile standalone.
  • The bulk of the change is a mechanical, well-scoped migration — logging traits, case-class-to-class conversions, and pattern-match sites. All examined call sites are consistent with the new APIs. The one concrete problem is that GpuLiteral now extends GpuLiteralShim, a trait imported from com.nvidia.spark.rapids.shims that has no definition anywhere in the current HEAD (no Scala file, no Java file, no shim version directory). Any build that compiles this layer independently will fail. Because this is an intentional forward reference to the next stack layer (Move common shim helpers to shared sources #15052), it only blocks merging this PR before that one lands.
  • sql-plugin/src/main/scala/com/nvidia/spark/rapids/literals.scala — the unresolved GpuLiteralShim import needs attention before this layer can be built in isolation.

Important Files Changed

Filename Overview
sql-plugin/src/main/scala/com/nvidia/spark/rapids/literals.scala GpuLiteral changed to extend GpuLiteralShim (removing jsonFields override), but GpuLiteralShim is not defined anywhere in the current HEAD — forward dependency on next stack layer.
sql-plugin/src/main/scala/com/nvidia/spark/rapids/profiler.scala ProfileMsg case classes removed (moved to Java in sql-plugin-columnar); Scala callers updated to use class-based pattern matching and Java method access; logging migrated from Spark Logging mixin to slf4j directly.
sql-plugin/src/main/scala/com/nvidia/spark/rapids/window/BasicWindowCalc.scala AggAndReplace, BoundGpuWindowFunction, ParsedBoundary migrated from case class to class with explicit equals/hashCode; behaviour is consistent with prior case-class semantics for array fields.
sql-plugin/src/main/scala/com/nvidia/spark/rapids/DateUtils.scala currentTimestamp shim replaced with a local Instant.now()-based implementation; result semantics (UTC microseconds since epoch) match the shim's behaviour.
sql-plugin/src/main/scala/com/nvidia/spark/rapids/namedExpressions.scala Removed hasAliasQuoteFix shim branch from GpuAlias.sql; safe because every supported Spark version (3.2.0+) returns true for that flag.
sql-plugin/src/main/scala/com/nvidia/spark/rapids/sparkRapidsListeners.scala SparkRapidsBuildInfoEvent and SparkRapidsShuffleDiskSavingsEvent migrated from case class to class; the default parameters for the disk-savings event were removed — the single existing call site in ShuffleCleanupManager already passes all arguments explicitly.
sql-plugin/src/main/scala/com/nvidia/spark/rapids/parquet/ParquetCachedBatchSerializer.scala ByteBuffer.position() calls cast to Buffer for Java 9+ covariant-return compatibility; CloseableColumnBatchIterator migrated from case class to class.
sql-plugin/src/main/scala/com/nvidia/spark/rapids/NvtxWithMetrics.scala MetricRange primary constructor default for excludeMetric replaced by an auxiliary constructor; all existing call sites verified to be compatible.
sql-plugin/src/main/scala/com/nvidia/spark/rapids/lore/GpuLore.scala LoreRDDMeta and LoreRDDPartitionMeta migrated from case class to class with Serializable; logging migrated from Spark Logging to slf4j; all object constructions updated correctly.
sql-plugin/src/main/scala/com/nvidia/spark/rapids/window/GpuWindowExpression.scala BatchedRunningWindowFixer subclasses migrated from Spark Logging to RapidsLocalLog; AggAndReplace construction updated to use new keyword.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    subgraph Before["Before (Spark private Logging + case class)"]
        A["extends Logging\n(Spark internal)"] --> B["case class ProfileInitMsg\ncase class LoreRDDMeta\ncase class AggAndReplace\ncase class BoundGpuWindowFunction\n..."]
        B --> C["Pattern match via\nunapply / case class deconstruct"]
    end

    subgraph After["After (slf4j + Java-friendly class)"]
        D["RapidsLocalLog trait\nor inline slf4j.LoggerFactory"] --> E["class ProfileInitMsg (Java)\nclass LoreRDDMeta\nclass AggAndReplace\nclass BoundGpuWindowFunction\n..."]
        E --> F["Pattern match via\ncase msg: Foo =>  msg.field"]
    end

    subgraph Missing["Forward dependency (HEAD)"]
        G["GpuLiteral extends GpuLiteralShim"] -.->|"GpuLiteralShim not yet defined\n(expected in PR #15052)"| H["⚠ Compilation gap"]
    end

    Before --> After
Loading

Reviews (1): Last reviewed commit: "Update Spark 4.1 generated support metad..." | Re-trigger Greptile

* In order to do type conversion and checking, use GpuLiteral.create() instead of constructor.
*/
case class GpuLiteral (value: Any, dataType: DataType) extends GpuLeafExpression {
case class GpuLiteral (value: Any, dataType: DataType) extends GpuLiteralShim {

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Forward reference to undefined GpuLiteralShim

GpuLiteralShim is imported from com.nvidia.spark.rapids.shims and GpuLiteral now extends it, but a search across the entire repository (all Scala and Java sources, all shim version directories) finds no definition of this trait anywhere in the current HEAD. Building this layer in isolation — or any branch that cherry-picks or rebases onto it before PR #15052 lands — will fail with an unresolved symbol error. If the definition is intentionally deferred to the "common shim-helper cleanup" next layer, a // TODO defined in #15052 comment or a temporary stub would make the dependency explicit.

Comment on lines 53 to +57
val ONE_SECOND_MICROSECONDS = 1000000

private def currentTimestampMicros: Long = {
val instant = Instant.now()
instant.getEpochSecond * ONE_SECOND_MICROSECONDS + instant.getNano / 1000

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Plain arithmetic instead of overflow-safe multiply

Spark's own instantToMicros uses Math.multiplyExact and Math.addExact to guard against overflow. The replacement here uses plain * and +. For timestamps within a practical range the result is identical, but it silently returns a wrong (wrapped) value rather than throwing if a pathological timestamp is ever passed, while the shim it replaces would have thrown. Worth noting since this replaces a shim that was previously calling through to Spark's overflow-checked path.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant