[auto-merge] release/26.06 to main [skip ci] [bot] #14916
Merged
### Description

When AQE is enabled, transition cleanup can expose a bare GPU exchange at the root of the final adaptive plan. That is valid while Spark is preparing AQE query stages because Spark needs to see the exchange node, but it is invalid once Spark is collecting rows from the final plan. On a proprietary Spark distro this showed up as `IllegalStateException: Row-based execution should not occur for GpuColumnarExchange` for repartition collect paths.

This PR tags exchanges that are seen by `GpuQueryStagePrepOverrides` and copies that tag from CPU exchange nodes to the GPU exchange replacements. `optimizeAdaptiveTransitions` now exposes a root GPU exchange only when that tag is present, and otherwise keeps `GpuColumnarToRowExec` for final execution.

The user-visible effect is that final AQE repartition results can be collected normally. There are no new configs or documented behavior changes.

Test coverage added:
- `AdaptiveQueryExecSuite`: `Keep transition to row for final AQE repartition exchange`

Validation performed:
- `mvn package -pl tests -am -Dbuildver=358 -DwildcardSuites=com.nvidia.spark.rapids.AdaptiveQueryExecSuite -Dtests="Keep transition to row for final AQE repartition exchange"`
- Built and ran the affected `repart_test.py` subset in a proprietary Spark distro Docker image with AQE enabled and `NUM_LOCAL_EXECS=2`; 63 tests passed with 0 failures/errors
- Confirmed the pre-fix parent still reproduces the original failure without setting `NUM_LOCAL_EXECS`; 24 of 63 selected cases failed with the `GpuColumnarExchange` row-execution error

### Checklists

Documentation
- [ ] Updated for new or modified user-facing features or behaviors
- [x] No user-facing change

Testing
- [x] Added or modified tests to cover new code paths
- [ ] Covered by existing tests (Please provide the names of the existing tests in the PR description.)
- [ ] Not required

Performance
- [ ] Tests ran and results are added in the PR description
- [ ] Issue filed with a link in the PR description
- [x] Not required

---------

Signed-off-by: Gera Shegalov <gshegalov@nvidia.com>
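The tagging rule in the Description can be illustrated with a small toy model. This is only a sketch in plain Python, not the plugin's Scala code: the `Node` class, the `PREP_TAG` name, and the helpers `prep_tag`, `to_gpu_exchange`, and `optimize_root` are all hypothetical stand-ins for the behavior attributed to `GpuQueryStagePrepOverrides` and `optimizeAdaptiveTransitions`, and the real tag identifier is not given in this PR.

```python
from dataclasses import dataclass, field
from typing import Optional

# Hypothetical tag name; the PR does not state the real identifier.
PREP_TAG = "seen_by_query_stage_prep"


@dataclass
class Node:
    """Toy plan node: a name, at most one child, and a set of tags."""
    name: str
    child: Optional["Node"] = None
    tags: set = field(default_factory=set)


def prep_tag(cpu_exchange: Node) -> Node:
    """Mark an exchange as seen during AQE query-stage preparation."""
    cpu_exchange.tags.add(PREP_TAG)
    return cpu_exchange


def to_gpu_exchange(cpu_exchange: Node) -> Node:
    """Replace a CPU exchange with a GPU exchange, copying its tags."""
    return Node("GpuExchange", cpu_exchange.child, set(cpu_exchange.tags))


def optimize_root(plan: Node) -> Node:
    """Expose a root GPU exchange only if it carries the prep tag.

    Otherwise keep the columnar-to-row transition so that rows can be
    collected from the final plan.
    """
    child = plan.child
    if (
        plan.name == "ColumnarToRow"
        and child is not None
        and child.name == "GpuExchange"
        and PREP_TAG in child.tags
    ):
        return child
    return plan


scan = Node("Scan")
# Query-stage preparation: the exchange was tagged, so it is exposed.
stage_prep = Node("ColumnarToRow", to_gpu_exchange(prep_tag(Node("CpuExchange", scan))))
# Final plan: no tag, so the transition to rows is kept.
final_plan = Node("ColumnarToRow", to_gpu_exchange(Node("CpuExchange", scan)))

print(optimize_root(stage_prep).name)
print(optimize_root(final_plan).name)
```

In this model the untagged final plan keeps its row transition, which corresponds to the case the new `AdaptiveQueryExecSuite` test is described as covering.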
SUCCESS - auto-merge
auto-merge triggered by github actions on `release/26.06` to create a PR keeping `main` up-to-date. If this PR is unable to be merged due to conflicts, it will remain open until manually fixed.