From 84efb4e2f2c4a66aaf62007cca09698e6505261a Mon Sep 17 00:00:00 2001
From: liyuan <84758614+nvliyuan@users.noreply.github.com>
Date: Sat, 10 Aug 2024 15:31:49 +0800
Subject: [PATCH] [DOC] update doc for 2408 release [skip CI] (#11296)

* update doc for 2408 release

Signed-off-by: liyuan

* update doc for 2408 release

Signed-off-by: liyuan

* update doc for 2408 release

Signed-off-by: liyuan

* update doc for 2408 release

Signed-off-by: liyuan

* remove the spark4 and db14.3 shim

Signed-off-by: liyuan

* Update docs/download.md

Co-authored-by: Jason Lowe

* add improving get_json performance

Signed-off-by: liyuan

---------

Signed-off-by: liyuan
Co-authored-by: Jason Lowe
---
 docs/archive.md  | 84 ++++++++++++++++++++++++++++++++++++++++++++++++
 docs/download.md | 30 +++++++++--------
 2 files changed, 100 insertions(+), 14 deletions(-)

diff --git a/docs/archive.md b/docs/archive.md
index 1fc6cdd05e2..de5cbe077b3 100644
--- a/docs/archive.md
+++ b/docs/archive.md
@@ -5,6 +5,90 @@ nav_order: 15
 ---
 Below are archived releases for RAPIDS Accelerator for Apache Spark.
 
+## Release v24.06.1
+### Hardware Requirements:
+
+The plugin is tested on the following architectures:
+
+    GPU Models: NVIDIA V100, T4, A10/A100, L4 and H100 GPUs
+
+### Software Requirements:
+
+    OS: Ubuntu 20.04, Ubuntu 22.04, CentOS 7, or Rocky Linux 8
+
+    NVIDIA Driver*: R470+
+
+    Runtime:
+        Scala 2.12, 2.13
+        Python, Java Virtual Machine (JVM) compatible with your spark-version.
+
+        * Check the Spark documentation for Python and Java version compatibility with your specific
+        Spark version. For instance, visit `https://spark.apache.org/docs/3.4.1` for Spark 3.4.1.
+
+    Supported Spark versions:
+        Apache Spark 3.2.0, 3.2.1, 3.2.2, 3.2.3, 3.2.4
+        Apache Spark 3.3.0, 3.3.1, 3.3.2, 3.3.3, 3.3.4
+        Apache Spark 3.4.0, 3.4.1, 3.4.2, 3.4.3
+        Apache Spark 3.5.0, 3.5.1
+
+    Supported Databricks runtime versions for Azure and AWS:
+        Databricks 11.3 ML LTS (GPU, Scala 2.12, Spark 3.3.0)
+        Databricks 12.2 ML LTS (GPU, Scala 2.12, Spark 3.3.2)
+        Databricks 13.3 ML LTS (GPU, Scala 2.12, Spark 3.4.1)
+
+    Supported Dataproc versions (Debian/Ubuntu):
+        GCP Dataproc 2.0
+        GCP Dataproc 2.1
+
+    Supported Dataproc Serverless versions:
+        Spark runtime 1.1 LTS
+        Spark runtime 2.0
+        Spark runtime 2.1
+        Spark runtime 2.2
+
+*Some hardware may have a minimum driver version greater than R470. Check the GPU spec sheet
+for your hardware's minimum driver version.
+
+*For Cloudera and EMR support, please refer to the
+[Distributions](https://docs.nvidia.com/spark-rapids/user-guide/latest/faq.html#which-distributions-are-supported) section of the FAQ.
+
+### RAPIDS Accelerator's Support Policy for Apache Spark
+The RAPIDS Accelerator maintains support for Apache Spark versions available for download from [Apache Spark](https://spark.apache.org/downloads.html)
+
+### Download RAPIDS Accelerator for Apache Spark v24.06.1
+
+| Processor | Scala Version | Download Jar | Download Signature |
+|-----------|---------------|--------------|--------------------|
+| x86_64 | Scala 2.12 | [RAPIDS Accelerator v24.06.1](https://repo1.maven.org/maven2/com/nvidia/rapids-4-spark_2.12/24.06.1/rapids-4-spark_2.12-24.06.1.jar) | [Signature](https://repo1.maven.org/maven2/com/nvidia/rapids-4-spark_2.12/24.06.1/rapids-4-spark_2.12-24.06.1.jar.asc) |
+| x86_64 | Scala 2.13 | [RAPIDS Accelerator v24.06.1](https://repo1.maven.org/maven2/com/nvidia/rapids-4-spark_2.13/24.06.1/rapids-4-spark_2.13-24.06.1.jar) | [Signature](https://repo1.maven.org/maven2/com/nvidia/rapids-4-spark_2.13/24.06.1/rapids-4-spark_2.13-24.06.1.jar.asc) |
+| arm64 | Scala 2.12 | [RAPIDS Accelerator v24.06.1](https://repo1.maven.org/maven2/com/nvidia/rapids-4-spark_2.12/24.06.1/rapids-4-spark_2.12-24.06.1-cuda11-arm64.jar) | [Signature](https://repo1.maven.org/maven2/com/nvidia/rapids-4-spark_2.12/24.06.1/rapids-4-spark_2.12-24.06.1-cuda11-arm64.jar.asc) |
+| arm64 | Scala 2.13 | [RAPIDS Accelerator v24.06.1](https://repo1.maven.org/maven2/com/nvidia/rapids-4-spark_2.13/24.06.1/rapids-4-spark_2.13-24.06.1-cuda11-arm64.jar) | [Signature](https://repo1.maven.org/maven2/com/nvidia/rapids-4-spark_2.13/24.06.1/rapids-4-spark_2.13-24.06.1-cuda11-arm64.jar.asc) |
+
+This package is built against CUDA 11.8. It is tested on V100, T4, A10, A100, L4 and H100 GPUs with
+CUDA 11.8 through CUDA 12.0.
+
+### Verify signature
+* Download the [PUB_KEY](https://keys.openpgp.org/search?q=sw-spark@nvidia.com).
+* Import the public key: `gpg --import PUB_KEY`
+* Verify the signature for Scala 2.12 jar:
+    `gpg --verify rapids-4-spark_2.12-24.06.1.jar.asc rapids-4-spark_2.12-24.06.1.jar`
+* Verify the signature for Scala 2.13 jar:
+    `gpg --verify rapids-4-spark_2.13-24.06.1.jar.asc rapids-4-spark_2.13-24.06.1.jar`
+
+The output of signature verify:
+
+    gpg: Good signature from "NVIDIA Spark (For the signature of spark-rapids release jars) "
+
+### Release Notes
+* Improve support for Unity Catalog on Databricks
+* Added support for parse_url PATH
+* Added support for array_filter
+* Added support for Spark 3.4.3
+* For updates on RAPIDS Accelerator Tools, please visit [this link](https://github.com/NVIDIA/spark-rapids-tools/releases)
+
+For a detailed list of changes, please refer to the
+[CHANGELOG](https://github.com/NVIDIA/spark-rapids/blob/main/CHANGELOG.md).
+
 ## Release v24.06.0
 ### Hardware Requirements:
 
diff --git a/docs/download.md b/docs/download.md
index a12ca8ba914..257428d4485 100644
--- a/docs/download.md
+++ b/docs/download.md
@@ -18,7 +18,7 @@ cuDF jar, that is either preinstalled in the Spark classpath on all nodes or sub
 that uses the RAPIDS Accelerator For Apache Spark. See the [getting-started
 guide](https://docs.nvidia.com/spark-rapids/user-guide/latest/getting-started/overview.html) for more details.
 
-## Release v24.06.1
+## Release v24.08.0
 ### Hardware Requirements:
 
 The plugin is tested on the following architectures:
@@ -49,9 +49,9 @@ The plugin is tested on the following architectures:
         Databricks 12.2 ML LTS (GPU, Scala 2.12, Spark 3.3.2)
         Databricks 13.3 ML LTS (GPU, Scala 2.12, Spark 3.4.1)
 
-    Supported Dataproc versions (Debian/Ubuntu):
-        GCP Dataproc 2.0
+    Supported Dataproc versions (Debian/Ubuntu/Rocky):
         GCP Dataproc 2.1
+        GCP Dataproc 2.2
 
     Supported Dataproc Serverless versions:
         Spark runtime 1.1 LTS
@@ -68,14 +68,14 @@ for your hardware's minimum driver version.
 ### RAPIDS Accelerator's Support Policy for Apache Spark
 The RAPIDS Accelerator maintains support for Apache Spark versions available for download from [Apache Spark](https://spark.apache.org/downloads.html)
 
-### Download RAPIDS Accelerator for Apache Spark v24.06.1
+### Download RAPIDS Accelerator for Apache Spark v24.08.0
 
 | Processor | Scala Version | Download Jar | Download Signature |
 |-----------|---------------|--------------|--------------------|
-| x86_64 | Scala 2.12 | [RAPIDS Accelerator v24.06.1](https://repo1.maven.org/maven2/com/nvidia/rapids-4-spark_2.12/24.06.1/rapids-4-spark_2.12-24.06.1.jar) | [Signature](https://repo1.maven.org/maven2/com/nvidia/rapids-4-spark_2.12/24.06.1/rapids-4-spark_2.12-24.06.1.jar.asc) |
-| x86_64 | Scala 2.13 | [RAPIDS Accelerator v24.06.1](https://repo1.maven.org/maven2/com/nvidia/rapids-4-spark_2.13/24.06.1/rapids-4-spark_2.13-24.06.1.jar) | [Signature](https://repo1.maven.org/maven2/com/nvidia/rapids-4-spark_2.13/24.06.1/rapids-4-spark_2.13-24.06.1.jar.asc) |
-| arm64 | Scala 2.12 | [RAPIDS Accelerator v24.06.1](https://repo1.maven.org/maven2/com/nvidia/rapids-4-spark_2.12/24.06.1/rapids-4-spark_2.12-24.06.1-cuda11-arm64.jar) | [Signature](https://repo1.maven.org/maven2/com/nvidia/rapids-4-spark_2.12/24.06.1/rapids-4-spark_2.12-24.06.1-cuda11-arm64.jar.asc) |
-| arm64 | Scala 2.13 | [RAPIDS Accelerator v24.06.1](https://repo1.maven.org/maven2/com/nvidia/rapids-4-spark_2.13/24.06.1/rapids-4-spark_2.13-24.06.1-cuda11-arm64.jar) | [Signature](https://repo1.maven.org/maven2/com/nvidia/rapids-4-spark_2.13/24.06.1/rapids-4-spark_2.13-24.06.1-cuda11-arm64.jar.asc) |
+| x86_64 | Scala 2.12 | [RAPIDS Accelerator v24.08.0](https://repo1.maven.org/maven2/com/nvidia/rapids-4-spark_2.12/24.08.0/rapids-4-spark_2.12-24.08.0.jar) | [Signature](https://repo1.maven.org/maven2/com/nvidia/rapids-4-spark_2.12/24.08.0/rapids-4-spark_2.12-24.08.0.jar.asc) |
+| x86_64 | Scala 2.13 | [RAPIDS Accelerator v24.08.0](https://repo1.maven.org/maven2/com/nvidia/rapids-4-spark_2.13/24.08.0/rapids-4-spark_2.13-24.08.0.jar) | [Signature](https://repo1.maven.org/maven2/com/nvidia/rapids-4-spark_2.13/24.08.0/rapids-4-spark_2.13-24.08.0.jar.asc) |
+| arm64 | Scala 2.12 | [RAPIDS Accelerator v24.08.0](https://repo1.maven.org/maven2/com/nvidia/rapids-4-spark_2.12/24.08.0/rapids-4-spark_2.12-24.08.0-cuda11-arm64.jar) | [Signature](https://repo1.maven.org/maven2/com/nvidia/rapids-4-spark_2.12/24.08.0/rapids-4-spark_2.12-24.08.0-cuda11-arm64.jar.asc) |
+| arm64 | Scala 2.13 | [RAPIDS Accelerator v24.08.0](https://repo1.maven.org/maven2/com/nvidia/rapids-4-spark_2.13/24.08.0/rapids-4-spark_2.13-24.08.0-cuda11-arm64.jar) | [Signature](https://repo1.maven.org/maven2/com/nvidia/rapids-4-spark_2.13/24.08.0/rapids-4-spark_2.13-24.08.0-cuda11-arm64.jar.asc) |
 
 This package is built against CUDA 11.8. It is tested on V100, T4, A10, A100, L4 and H100 GPUs with
 CUDA 11.8 through CUDA 12.0.
@@ -84,19 +84,21 @@ CUDA 11.8 through CUDA 12.0.
 * Download the [PUB_KEY](https://keys.openpgp.org/search?q=sw-spark@nvidia.com).
 * Import the public key: `gpg --import PUB_KEY`
 * Verify the signature for Scala 2.12 jar:
-    `gpg --verify rapids-4-spark_2.12-24.06.1.jar.asc rapids-4-spark_2.12-24.06.1.jar`
+    `gpg --verify rapids-4-spark_2.12-24.08.0.jar.asc rapids-4-spark_2.12-24.08.0.jar`
 * Verify the signature for Scala 2.13 jar:
-    `gpg --verify rapids-4-spark_2.13-24.06.1.jar.asc rapids-4-spark_2.13-24.06.1.jar`
+    `gpg --verify rapids-4-spark_2.13-24.08.0.jar.asc rapids-4-spark_2.13-24.08.0.jar`
 
 The output of signature verify:
 
     gpg: Good signature from "NVIDIA Spark (For the signature of spark-rapids release jars) "
 
 ### Release Notes
-* Improve support for Unity Catalog on Databricks
-* Added support for parse_url PATH
-* Added support for array_filter
-* Added support for Spark 3.4.3
+* Support timezones with daylight savings shifts
+* Improve metrics in Spark UI
+* Refactor Parquet decode microkernels and support load balancing RLE runs
+* Improve get_json performance
+* Support dynamic scan filtering
+* Improve UCX shuffle
 * For updates on RAPIDS Accelerator Tools, please visit [this link](https://github.com/NVIDIA/spark-rapids-tools/releases)
 
 For a detailed list of changes, please refer to the