
Commit 5cdf199

Merge pull request #373 from SurajAralihalli/main-24.02-release

[Doc] Update readme, ipynb files for 24.02 version [skip ci]

Parents: e421cb3 + a3820d2

File tree

40 files changed: +179 −95 lines

.github/workflows/auto-merge.yml

Lines changed: 5 additions & 5 deletions

```diff
@@ -1,4 +1,4 @@
-# Copyright (c) 2022-2023, NVIDIA CORPORATION.
+# Copyright (c) 2022-2024, NVIDIA CORPORATION.
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -18,7 +18,7 @@ name: auto-merge HEAD to BASE
 on:
   pull_request_target:
     branches:
-      - branch-23.12
+      - branch-24.02
     types: [closed]

 jobs:
@@ -29,14 +29,14 @@ jobs:
     steps:
       - uses: actions/checkout@v3
         with:
-          ref: branch-23.12 # force to fetch from latest upstream instead of PR ref
+          ref: branch-24.02 # force to fetch from latest upstream instead of PR ref

       - name: auto-merge job
         uses: ./.github/workflows/auto-merge
         env:
           OWNER: NVIDIA
           REPO_NAME: spark-rapids-examples
-          HEAD: branch-23.12
-          BASE: branch-24.02
+          HEAD: branch-24.02
+          BASE: branch-24.04
           AUTOMERGE_TOKEN: ${{ secrets.AUTOMERGE_TOKEN }} # use to merge PR
```
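With this bump, PRs merged into branch-24.02 are forward-merged into branch-24.04. The merge logic itself lives in the repo's local `auto-merge` composite action; as a rough, assumed sketch of what such a HEAD-to-BASE job does (not the action's actual code):

```bash
# Hedged sketch only: approximates an auto-merge HEAD -> BASE job.
# HEAD/BASE mirror the env block above; the commands are assumptions.
HEAD=branch-24.02
BASE=branch-24.04

git fetch origin "$HEAD" "$BASE"
git checkout -b merge-head-to-base "origin/$BASE"
git merge "origin/$HEAD"                 # halts on conflicts for manual resolution
git push origin merge-head-to-base:"$BASE"
```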

.github/workflows/markdown-links-check.yml

Lines changed: 1 addition & 2 deletions

```diff
@@ -1,4 +1,4 @@
-# Copyright (c) 2022, NVIDIA CORPORATION.
+# Copyright (c) 2022-2024, NVIDIA CORPORATION.
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -30,6 +30,5 @@ jobs:
         with:
           max-depth: -1
           use-verbose-mode: 'yes'
-          check-modified-files-only: 'yes'
           config-file: '.github/workflows/markdown-links-check/markdown-links-check-config.json'
           base-branch: 'main'
```

.github/workflows/markdown-links-check/markdown-links-check-config.json

Lines changed: 14 additions & 0 deletions

```diff
@@ -1,4 +1,18 @@
 {
+  "ignorePatterns": [
+    {
+      "pattern": "/docs"
+    },
+    {
+      "pattern": "/datasets"
+    },
+    {
+      "pattern": "/dockerfile"
+    },
+    {
+      "pattern": "/examples"
+    }
+  ],
   "timeout": "15s",
   "retryOn429": true,
   "retryCount": 30,
```

README.md

Lines changed: 2 additions & 2 deletions

```diff
@@ -37,7 +37,7 @@ can be built for running on GPU with RAPIDS Accelerator in this repo:
 | 3 | XGBoost | Taxi (Scala) | End-to-end ETL + XGBoost example to predict taxi trip fare amount with [NYC taxi trips data set](https://www1.nyc.gov/site/tlc/about/tlc-trip-record-data.page)
 | 4 | ML/DL | PCA End-to-End | Spark MLlib based PCA example to train and transform with a synthetic dataset
 | 5 | UDF | cuSpatial - Point in Polygon | Spark cuSpatial example for Point in Polygon function using NYC Taxi pickup location dataset
-| 6 | UDF | URL Decode | Decodes URL-encoded strings using the [Java APIs of RAPIDS cudf](https://docs.rapids.ai/api/cudf-java/stable/)
-| 7 | UDF | URL Encode | URL-encodes strings using the [Java APIs of RAPIDS cudf](https://docs.rapids.ai/api/cudf-java/stable/)
+| 6 | UDF | URL Decode | Decodes URL-encoded strings using the [Java APIs of RAPIDS cudf](https://docs.rapids.ai/api/cudf-java/legacy/)
+| 7 | UDF | URL Encode | URL-encodes strings using the [Java APIs of RAPIDS cudf](https://docs.rapids.ai/api/cudf-java/legacy/)
 | 8 | UDF | [CosineSimilarity](./examples/UDF-Examples/RAPIDS-accelerated-UDFs/src/main/java/com/nvidia/spark/rapids/udf/java/CosineSimilarity.java) | Computes the cosine similarity between two float vectors using [native code](./examples/UDF-Examples/RAPIDS-accelerated-UDFs/src/main/cpp/src)
 | 9 | UDF | [StringWordCount](./examples/UDF-Examples/RAPIDS-accelerated-UDFs/src/main/java/com/nvidia/spark/rapids/udf/hive/StringWordCount.java) | Implements a Hive simple UDF using [native code](./examples/UDF-Examples/RAPIDS-accelerated-UDFs/src/main/cpp/src) to count words in strings
```

docs/get-started/xgboost-examples/csp/databricks/databricks.md

Lines changed: 2 additions & 2 deletions

````diff
@@ -21,7 +21,7 @@ Navigate to your home directory in the UI and select **Create** > **File** from
 create an `init.sh` scripts with contents:
    ```bash
    #!/bin/bash
-   sudo wget -O /databricks/jars/rapids-4-spark_2.12-23.12.1.jar https://repo1.maven.org/maven2/com/nvidia/rapids-4-spark_2.12/23.12.1/rapids-4-spark_2.12-23.12.1.jar
+   sudo wget -O /databricks/jars/rapids-4-spark_2.12-24.02.0.jar https://repo1.maven.org/maven2/com/nvidia/rapids-4-spark_2.12/24.02.0/rapids-4-spark_2.12-24.02.0.jar
    ```
 1. Select the Databricks Runtime Version from one of the supported runtimes specified in the
    Prerequisites section.
@@ -68,7 +68,7 @@ create an `init.sh` scripts with contents:
    ```bash
    spark.rapids.sql.python.gpu.enabled true
    spark.python.daemon.module rapids.daemon_databricks
-   spark.executorEnv.PYTHONPATH /databricks/jars/rapids-4-spark_2.12-23.12.1.jar:/databricks/spark/python
+   spark.executorEnv.PYTHONPATH /databricks/jars/rapids-4-spark_2.12-24.02.0.jar:/databricks/spark/python
    ```
 Note that since python memory pool require installing the cudf library, so you need to install cudf library in
 each worker nodes `pip install cudf-cu11 --extra-index-url=https://pypi.nvidia.com` or disable python memory pool
````
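For the cudf note above, one option, an assumption about placement rather than something the doc prescribes, is to append the install to the same cluster init script so every worker gets it:

```bash
# Hypothetical addition to the init script (not in the original doc):
# install cudf so the python memory pool can be enabled on workers.
# The /databricks/python/bin/pip path is an assumption about the runtime.
sudo /databricks/python/bin/pip install cudf-cu11 --extra-index-url=https://pypi.nvidia.com
```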

docs/get-started/xgboost-examples/csp/databricks/init.sh

Lines changed: 1 addition & 1 deletion

```diff
@@ -1,7 +1,7 @@
 sudo rm -f /databricks/jars/spark--maven-trees--ml--10.x--xgboost-gpu--ml.dmlc--xgboost4j-gpu_2.12--ml.dmlc__xgboost4j-gpu_2.12__1.5.2.jar
 sudo rm -f /databricks/jars/spark--maven-trees--ml--10.x--xgboost-gpu--ml.dmlc--xgboost4j-spark-gpu_2.12--ml.dmlc__xgboost4j-spark-gpu_2.12__1.5.2.jar

-sudo wget -O /databricks/jars/rapids-4-spark_2.12-23.12.1.jar https://repo1.maven.org/maven2/com/nvidia/rapids-4-spark_2.12/23.12.1/rapids-4-spark_2.12-23.12.1.jar
+sudo wget -O /databricks/jars/rapids-4-spark_2.12-24.02.0.jar https://repo1.maven.org/maven2/com/nvidia/rapids-4-spark_2.12/24.02.0/rapids-4-spark_2.12-24.02.0.jar
 sudo wget -O /databricks/jars/xgboost4j-gpu_2.12-1.7.1.jar https://repo1.maven.org/maven2/ml/dmlc/xgboost4j-gpu_2.12/1.7.1/xgboost4j-gpu_2.12-1.7.1.jar
 sudo wget -O /databricks/jars/xgboost4j-spark-gpu_2.12-1.7.1.jar https://repo1.maven.org/maven2/ml/dmlc/xgboost4j-spark-gpu_2.12/1.7.1/xgboost4j-spark-gpu_2.12-1.7.1.jar
 ls -ltr
```
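A quick post-run sanity check (my addition, not part of init.sh) to confirm the swapped jars landed where Databricks loads them:

```bash
# Verify the replacement jars are present; names match the wget targets above.
ls -l /databricks/jars/ | grep -E 'rapids-4-spark|xgboost4j'
```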

docs/get-started/xgboost-examples/on-prem-cluster/kubernetes-scala.md

Lines changed: 1 addition & 1 deletion

```diff
@@ -40,7 +40,7 @@ export SPARK_DOCKER_IMAGE=<gpu spark docker image repo and name>
 export SPARK_DOCKER_TAG=<spark docker image tag>

 pushd ${SPARK_HOME}
-wget https://github.com/NVIDIA/spark-rapids-examples/raw/branch-23.12/dockerfile/Dockerfile
+wget https://github.com/NVIDIA/spark-rapids-examples/raw/branch-24.02/dockerfile/Dockerfile

 # Optionally install additional jars into ${SPARK_HOME}/jars/
```
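With the branch-24.02 Dockerfile fetched into `${SPARK_HOME}`, the usual next step is to build and push the image using the variables exported above; a hedged sketch, since build arguments can vary by Dockerfile:

```bash
# Build the GPU Spark image from the fetched Dockerfile and publish it.
# SPARK_DOCKER_IMAGE and SPARK_DOCKER_TAG come from the exports above.
docker build -t "${SPARK_DOCKER_IMAGE}:${SPARK_DOCKER_TAG}" -f Dockerfile .
docker push "${SPARK_DOCKER_IMAGE}:${SPARK_DOCKER_TAG}"
```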

docs/get-started/xgboost-examples/prepare-package-data/preparation-python.md

Lines changed: 1 addition & 1 deletion

```diff
@@ -5,7 +5,7 @@ For simplicity export the location to these jars. All examples assume the packag
 ### Download the jars

 Download the RAPIDS Accelerator for Apache Spark plugin jar
-  * [RAPIDS Spark Package](https://repo1.maven.org/maven2/com/nvidia/rapids-4-spark_2.12/23.12.1/rapids-4-spark_2.12-23.12.1.jar)
+  * [RAPIDS Spark Package](https://repo1.maven.org/maven2/com/nvidia/rapids-4-spark_2.12/24.02.0/rapids-4-spark_2.12-24.02.0.jar)

 ### Build XGBoost Python Examples
```
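The page's "export the location to these jars" step could look like the following; the variable name `SPARK_RAPIDS_PLUGIN_JAR` is a placeholder of mine, not from the doc:

```bash
# Download the 24.02.0 plugin jar and record its location for later steps.
wget https://repo1.maven.org/maven2/com/nvidia/rapids-4-spark_2.12/24.02.0/rapids-4-spark_2.12-24.02.0.jar
export SPARK_RAPIDS_PLUGIN_JAR="$(pwd)/rapids-4-spark_2.12-24.02.0.jar"
```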

docs/get-started/xgboost-examples/prepare-package-data/preparation-scala.md

Lines changed: 1 addition & 1 deletion

```diff
@@ -5,7 +5,7 @@ For simplicity export the location to these jars. All examples assume the packag
 ### Download the jars

 1. Download the RAPIDS Accelerator for Apache Spark plugin jar
-   * [RAPIDS Spark Package](https://repo1.maven.org/maven2/com/nvidia/rapids-4-spark_2.12/23.12.1/rapids-4-spark_2.12-23.12.1.jar)
+   * [RAPIDS Spark Package](https://repo1.maven.org/maven2/com/nvidia/rapids-4-spark_2.12/24.02.0/rapids-4-spark_2.12-24.02.0.jar)

 ### Build XGBoost Scala Examples
```

examples/ML+DL-Examples/Spark-DL/criteo_train/README.md

Lines changed: 1 addition & 1 deletion

```diff
@@ -7,7 +7,7 @@ _Please note: The following demo is dedicated for DGX-2 machine(with V100 GPUs).
 ## Dataset

 The dataset used here is from Criteo clicklog dataset.
-It's preprocessed by [DLRM](https://github.com/NVIDIA/DeepLearningExamples/tree/master/TensorFlow2/Recommendation/DLRM/preproc)
+It's preprocessed by [DLRM](https://github.com/NVIDIA/DeepLearningExamples/tree/master/TensorFlow2/Recommendation/DLRM_and_DCNv2/preproc)
 ETL job on Spark. We also provide a small size sample data in sample_data folder.
 All 40 columns(1 label + 39 features) are already numeric.
```
