Commit d5acb6b

Merge pull request #8133 from NvTimLiu/release-tmp

Merge branch 'branch-23.04' to main

2 parents: 9b37954 + 448207f

File tree: 645 files changed (+17,936 / −6,994 lines)


.github/workflows/auto-merge.yml

Lines changed: 4 additions & 4 deletions

```diff
@@ -18,7 +18,7 @@ name: auto-merge HEAD to BASE
 on:
   pull_request_target:
     branches:
-      - branch-23.02
+      - branch-23.04
     types: [closed]
 
 jobs:
@@ -29,13 +29,13 @@ jobs:
     steps:
       - uses: actions/checkout@v3
         with:
-          ref: branch-23.02 # force to fetch from latest upstream instead of PR ref
+          ref: branch-23.04 # force to fetch from latest upstream instead of PR ref
 
       - name: auto-merge job
         uses: ./.github/workflows/auto-merge
         env:
           OWNER: NVIDIA
           REPO_NAME: spark-rapids
-          HEAD: branch-23.02
-          BASE: branch-23.04
+          HEAD: branch-23.04
+          BASE: branch-23.06
           AUTOMERGE_TOKEN: ${{ secrets.AUTOMERGE_TOKEN }} # use to merge PR
```

.github/workflows/blossom-ci.yml

Lines changed: 3 additions & 3 deletions

```diff
@@ -1,4 +1,4 @@
-# Copyright (c) 2020-2022, NVIDIA CORPORATION.
+# Copyright (c) 2020-2023, NVIDIA CORPORATION.
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -96,10 +96,10 @@ jobs:
           java-version: 8
 
       # add blackduck properties https://synopsys.atlassian.net/wiki/spaces/INTDOCS/pages/631308372/Methods+for+Configuring+Analysis#Using-a-configuration-file
+      # currently hardcode projects here to avoid intermittent mvn scan failures
       - name: Setup blackduck properties
         run: |
-          PROJECTS=$(mvn -am dependency:tree | grep maven-dependency-plugin | awk '{ out="com.nvidia:"$(NF-1);print out }' | grep rapids | xargs | sed -e 's/ /,/g')
-          echo detect.maven.build.command="-pl=$PROJECTS -am" >> application.properties
+          echo detect.maven.build.command="-pl=com.nvidia:rapids-4-spark-parent,com.nvidia:rapids-4-spark-sql_2.12 -am" >> application.properties
           echo detect.maven.included.scopes=compile >> application.properties
 
       - name: Run blossom action
```
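The removed `PROJECTS` pipeline above parsed `mvn -am dependency:tree` output into a comma-separated list of Maven coordinates. A minimal sketch of that text processing, run against hypothetical sample output (real `mvn` output varies by build, so the sample lines here are illustrative only):

```shell
# Hypothetical stand-in for two lines of `mvn -am dependency:tree` output.
sample_output='[INFO] --- maven-dependency-plugin:3.1.1:tree (default-cli) @ rapids-4-spark-parent ---
[INFO] --- maven-dependency-plugin:3.1.1:tree (default-cli) @ rapids-4-spark-sql_2.12 ---'

# Same extraction the workflow used: keep plugin lines, take the
# next-to-last field (the module name), prefix the group id, keep only
# rapids modules, then join them with commas.
PROJECTS=$(echo "$sample_output" \
  | grep maven-dependency-plugin \
  | awk '{ out="com.nvidia:"$(NF-1); print out }' \
  | grep rapids | xargs | sed -e 's/ /,/g')
echo "$PROJECTS"
# -> com.nvidia:rapids-4-spark-parent,com.nvidia:rapids-4-spark-sql_2.12
```

The hardcoded replacement produces exactly this kind of list without depending on a full `mvn` scan succeeding.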

.github/workflows/mvn-verify-check.yml

Lines changed: 5 additions & 2 deletions

```diff
@@ -1,4 +1,4 @@
-# Copyright (c) 2022, NVIDIA CORPORATION.
+# Copyright (c) 2022-2023, NVIDIA CORPORATION.
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -46,7 +46,10 @@ jobs:
           . jenkins/version-def.sh
           svArrBodyNoSnapshot=$(printf ",{\"spark-version\":\"%s\",\"isSnapshot\":false}" "${SPARK_SHIM_VERSIONS_NOSNAPSHOTS_TAIL[@]}")
           svArrBodyNoSnapshot=${svArrBodyNoSnapshot:1}
-          svArrBodySnapshot=$(printf ",{\"spark-version\":\"%s\",\"isSnapshot\":true}" "${SPARK_SHIM_VERSIONS_SNAPSHOTS_ONLY[@]}")
+          # do not add empty snapshot versions
+          if [ ${#SPARK_SHIM_VERSIONS_SNAPSHOTS_ONLY[@]} -gt 0 ]; then
+            svArrBodySnapshot=$(printf ",{\"spark-version\":\"%s\",\"isSnapshot\":true}" "${SPARK_SHIM_VERSIONS_SNAPSHOTS_ONLY[@]}")
+          fi
 
           # add snapshot versions which are not in snapshot property in pom file
           svArrBodySnapshot+=$(printf ",{\"spark-version\":\"%s\",\"isSnapshot\":true}" 340)
```
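The emptiness guard added above matters because of how `printf` handles an empty argument list: the format string is still evaluated once with empty substitutions, so an empty version array would emit one bogus JSON object. A small bash sketch (variable names here are illustrative, not from the workflow):

```shell
# Non-empty array: printf repeats the format once per element.
versions=("330" "340")
body=$(printf ",{\"spark-version\":\"%s\",\"isSnapshot\":true}" "${versions[@]}")
echo "$body"
# -> ,{"spark-version":"330","isSnapshot":true},{"spark-version":"340","isSnapshot":true}

# Empty array: printf still runs the format ONCE, with %s empty --
# exactly the stray entry the `if [ ${#...[@]} -gt 0 ]` check avoids.
empty=()
bogus=$(printf ",{\"spark-version\":\"%s\",\"isSnapshot\":true}" "${empty[@]}")
echo "$bogus"
# -> ,{"spark-version":"","isSnapshot":true}
```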

CHANGELOG.md

Lines changed: 218 additions & 1 deletion (large diff not rendered by default)

CONTRIBUTING.md

Lines changed: 63 additions & 9 deletions

````diff
@@ -152,7 +152,7 @@ To this end in a pre-production build you can set the Boolean property
 
 The time saved is more significant if you are merely changing
 the `aggregator` module, or the `dist` module, or just incorporating changes from
-[spark-rapids-jni](https://github.com/NVIDIA/spark-rapids-jni/blob/branch-23.02/CONTRIBUTING.md#local-testing-of-cross-repo-contributions-cudf-spark-rapids-jni-and-spark-rapids)
+[spark-rapids-jni](https://github.com/NVIDIA/spark-rapids-jni/blob/branch-23.04/CONTRIBUTING.md#local-testing-of-cross-repo-contributions-cudf-spark-rapids-jni-and-spark-rapids)
 
 For example, to quickly repackage `rapids-4-spark` after the
 initial `./build/buildall` you can iterate by invoking
@@ -186,22 +186,38 @@ The following acronyms may appear in directory names:
 |cdh |Cloudera CDH|321cdh |Cloudera CDH Spark based on Apache Spark 3.2.1|
 
 The version-specific directory names have one of the following forms / use cases:
-- `src/main/312/scala` contains Scala source code for a single Spark version, 3.1.2 in this case
-- `src/main/312+-apache/scala` contains Scala source code for *upstream* **Apache** Spark builds,
+
+#### Version range directories
+
+The following source directory system is deprecated. See below and [shimplify.md][1]
+
+* `src/main/312/scala` contains Scala source code for a single Spark version, 3.1.2 in this case
+* `src/main/312+-apache/scala` contains Scala source code for *upstream* **Apache** Spark builds,
   only beginning with version Spark 3.1.2, and + signifies there is no upper version boundary
   among the supported versions
-- `src/main/311until320-all` contains code that applies to all shims between 3.1.1 *inclusive*,
+* `src/main/311until320-all` contains code that applies to all shims between 3.1.1 *inclusive*,
   3.2.0 *exclusive*
-- `src/main/pre320-treenode` contains shims for the Catalyst `TreeNode` class before the
+* `src/main/pre320-treenode` contains shims for the Catalyst `TreeNode` class before the
   [children trait specialization in Apache Spark 3.2.0](https://issues.apache.org/jira/browse/SPARK-34906).
-- `src/main/post320-treenode` contains shims for the Catalyst `TreeNode` class after the
+* `src/main/post320-treenode` contains shims for the Catalyst `TreeNode` class after the
   [children trait specialization in Apache Spark 3.2.0](https://issues.apache.org/jira/browse/SPARK-34906).
 
 For each Spark shim, we use Ant path patterns to compute the property
 `spark${buildver}.sources` in [sql-plugin/pom.xml](./sql-plugin/pom.xml) that is
 picked up as additional source code roots. When possible path patterns are reused using
 the conventions outlined in the pom.
 
+#### Simplified version directory structure
+
+Going forward new shim files should be added under:
+
+* `src/main/spark${buildver}`, example: `src/main/spark330db`
+* `src/test/spark${buildver}`, example: `src/test/spark340`
+
+with a special shim descriptor as a Scala/Java comment. See [shimplify.md][1]
+
+[1]: ./docs/dev/shimplify.md
+
 ### Setting up an Integrated Development Environment
 
 Our project currently uses `build-helper-maven-plugin` for shimming against conflicting definitions of superclasses
@@ -238,7 +254,12 @@ Known Issues:
 
 * There is a known issue that the test sources added via the `build-helper-maven-plugin` are not handled
   [properly](https://youtrack.jetbrains.com/issue/IDEA-100532). The workaround is to `mark` the affected folders
-  such as `tests/src/test/320+-noncdh-nondb` manually as `Test Sources Root`
+  such as
+
+  * `tests/src/test/320+-noncdh-nondb`
+  * `tests/src/test/spark340`
+
+  manually as `Test Sources Root`
 
 * There is a known issue where, even after selecting a different Maven profile in the Maven submenu,
   the source folders from a previously selected profile may remain active. As a workaround,
@@ -264,7 +285,7 @@ interested in. For example, to generate the Bloop projects for the Spark 3.2.0 d
 just for the production code run:
 
 ```shell script
-mvn install ch.epfl.scala:maven-bloop_2.13:1.4.9:bloopInstall -pl aggregator -am \
+mvn install ch.epfl.scala:bloop-maven-plugin:bloopInstall -pl aggregator -am \
     -DdownloadSources=true \
     -Dbuildver=320 \
     -DskipTests \
@@ -296,7 +317,7 @@ You can now open the spark-rapids as a
 
 Read on for VS Code Scala Metals instructions.
 
-# Bloop, Scala Metals, and Visual Studio Code
+#### Bloop, Scala Metals, and Visual Studio Code
 
 _Last tested with 1.63.0-insider (Universal) Commit: bedf867b5b02c1c800fbaf4d6ce09cefba_
 
@@ -338,6 +359,29 @@ jps -l
 72349 scala.meta.metals.Main
 ```
 
+##### Known Issues
+
+###### java.lang.RuntimeException: boom
+
+Metals background compilation process status appears to be resetting to 0% after reaching 99%
+and you see a peculiar error message [`java.lang.RuntimeException: boom`][1]. You can work around
+it by making sure Metals Server (Bloop client) and Bloop Server are both running on Java 11+.
+
+1. To this end make sure that Bloop projects are generated using Java 11+
+
+   ```bash
+   JAVA_HOME=/usr/lib/jvm/java-11-openjdk-amd64 \
+   mvn install ch.epfl.scala:bloop-maven-plugin:bloopInstall \
+     -DdownloadSources=true \
+     -Dbuildver=331 \
+     -Dskip -DskipTests -Dmaven.javadoc.skip
+   ```
+
+1. Add [`metals.javaHome`][2] to VSCode preferences to point to Java 11+.
+
+[1]: https://github.com/sourcegraph/scip-java/blob/b7d268233f1a303f66b6d9804a68f64b1e5d7032/semanticdb-javac/src/main/java/com/sourcegraph/semanticdb_javac/SemanticdbTaskListener.java#L76
+[2]: https://github.com/scalameta/metals-vscode/pull/644/files#diff-04bba6a35cad1c794cbbe677678a51de13441b7a6ee8592b7b50be1f05c6f626R132
 #### Other IDEs
 We welcome pull requests with tips how to setup your favorite IDE!
 
@@ -481,6 +525,16 @@ You can confirm that the update actually has happened by either inspecting its e
 `git diff` first or simply reexecuting `git commit` right away. The second time no file
 modification should be triggered by the copyright year update hook and the commit should succeed.
 
+There is a known issue for macOS users if they use the default version of `sed`. The copyright update
+script may fail and generate an unexpected file named `source-file-E`. As a workaround, please
+install GNU sed
+
+```bash
+brew install gnu-sed
+# and add to PATH to make it as default sed for your shell
+export PATH="/usr/local/opt/gnu-sed/libexec/gnubin:$PATH"
+```
+
 ### Pull request status checks
 A pull request should pass all status checks before merged.
 #### signoff check
````
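The stray `source-file-E` file mentioned in the CONTRIBUTING.md addition comes from a BSD/GNU `sed` difference: BSD `sed` on macOS consumes the argument after `-i` as a backup-file suffix, so an invocation like `sed -i -E ...` treats `-E` as the suffix and writes a `<file>-E` backup instead of enabling extended regexes. A minimal sketch of the same invocation under GNU `sed` (assumed available, as on Linux or via `brew install gnu-sed`), where `-i` edits in place with no backup:

```shell
# Create a scratch file with an outdated copyright line, run the kind of
# year bump the pre-commit hook performs, and read the result back.
f=$(mktemp)
printf '# Copyright (c) 2020-2022, NVIDIA CORPORATION.\n' > "$f"

# GNU sed: -i edits in place, -E enables extended regexes.
# BSD sed would instead create a backup file named "${f}-E" here.
sed -i -E 's/2022/2023/' "$f"

content=$(cat "$f")
echo "$content"
# -> # Copyright (c) 2020-2023, NVIDIA CORPORATION.
rm -f "$f"
```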
