Commit

Constrain versions of PyTorch and CI artifacts in CI Runs, upgrade to dgl 2.4 (#4690)

We were pulling the wrong packages because the PyTorch version constraint wasn't tight enough. Hopefully these sorts of issues will be resolved in the `cugraph-gnn` repository going forward, where we can pin a specific PyTorch version for testing.

Authors:
  - Alex Barghi (https://github.com/alexbarghi-nv)
  - James Lamb (https://github.com/jameslamb)

Approvers:
  - Ray Douglass (https://github.com/raydouglass)
  - https://github.com/jakirkham
  - Brad Rees (https://github.com/BradReesWork)
  - Rick Ratzel (https://github.com/rlratzel)

URL: #4690
alexbarghi-nv authored Oct 7, 2024
1 parent 5fad435 commit 3789b70
Showing 16 changed files with 72 additions and 68 deletions.
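The commit message attributes the bad installs to a PyTorch constraint that was too loose, and the diff tightens it to `pytorch>=2.3,<2.4`. As a rough illustration of what that range admits, here is a minimal sketch. `version_in_range` is a hypothetical helper (it is not part of this commit or of conda), it assumes a `sort` that supports `-V` (GNU coreutils), and it only approximates conda's version ordering for plain numeric releases.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not from the commit: succeeds when LO <= VERSION < HI.
# Assumes `sort -V` (GNU coreutils); handles plain numeric versions only.
version_in_range() {
  local version="$1" lo="$2" hi="$3"
  # Lower bound is inclusive: LO must sort first (or tie) against VERSION.
  [ "$(printf '%s\n%s\n' "$lo" "$version" | sort -V | head -n1)" = "$lo" ] || return 1
  # Upper bound is exclusive: VERSION must differ from HI and sort before it.
  [ "$version" != "$hi" ] || return 1
  [ "$(printf '%s\n%s\n' "$version" "$hi" | sort -V | head -n1)" = "$version" ]
}

# Report which candidate versions fall inside >=2.3,<2.4.
for candidate in 2.2.2 2.3.0 2.3.1 2.4.0; do
  if version_in_range "$candidate" 2.3 2.4; then
    echo "pytorch ${candidate}: allowed"
  else
    echo "pytorch ${candidate}: rejected"
  fi
done
```

An unconstrained `pytorch` spec has no such upper bound, so the solver is free to pick a newer release than the DGL build was compiled against.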
34 changes: 17 additions & 17 deletions ci/build_docs.sh
@@ -6,6 +6,10 @@ set -euo pipefail
rapids-logger "Create test conda environment"
. /opt/conda/etc/profile.d/conda.sh

+ export RAPIDS_VERSION="$(rapids-version)"
+ export RAPIDS_VERSION_MAJOR_MINOR="$(rapids-version-major-minor)"
+ export RAPIDS_VERSION_NUMBER="$RAPIDS_VERSION_MAJOR_MINOR"

rapids-dependency-file-generator \
--output conda \
--file-key docs \
@@ -22,35 +26,31 @@ PYTHON_CHANNEL=$(rapids-download-conda-from-s3 python)

if [[ "${RAPIDS_CUDA_VERSION}" == "11.8.0" ]]; then
CONDA_CUDA_VERSION="11.8"
- DGL_CHANNEL="dglteam/label/cu118"
+ DGL_CHANNEL="dglteam/label/th23_cu118"
else
CONDA_CUDA_VERSION="12.1"
- DGL_CHANNEL="dglteam/label/cu121"
+ DGL_CHANNEL="dglteam/label/th23_cu121"
fi

rapids-mamba-retry install \
--channel "${CPP_CHANNEL}" \
--channel "${PYTHON_CHANNEL}" \
--channel conda-forge \
--channel pyg \
--channel nvidia \
--channel "${DGL_CHANNEL}" \
- libcugraph \
- pylibcugraph \
- cugraph \
- cugraph-pyg \
- cugraph-dgl \
- cugraph-service-server \
- cugraph-service-client \
- libcugraph_etl \
- pylibcugraphops \
- pylibwholegraph \
- pytorch \
+ "libcugraph=${RAPIDS_VERSION_MAJOR_MINOR}.*" \
+ "pylibcugraph=${RAPIDS_VERSION_MAJOR_MINOR}.*" \
+ "cugraph=${RAPIDS_VERSION_MAJOR_MINOR}.*" \
+ "cugraph-pyg=${RAPIDS_VERSION_MAJOR_MINOR}.*" \
+ "cugraph-dgl=${RAPIDS_VERSION_MAJOR_MINOR}.*" \
+ "cugraph-service-server=${RAPIDS_VERSION_MAJOR_MINOR}.*" \
+ "cugraph-service-client=${RAPIDS_VERSION_MAJOR_MINOR}.*" \
+ "libcugraph_etl=${RAPIDS_VERSION_MAJOR_MINOR}.*" \
+ "pylibcugraphops=${RAPIDS_VERSION_MAJOR_MINOR}.*" \
+ "pylibwholegraph=${RAPIDS_VERSION_MAJOR_MINOR}.*" \
+ "pytorch>=2.3,<2.4" \
"cuda-version=${CONDA_CUDA_VERSION}"

- export RAPIDS_VERSION="$(rapids-version)"
- export RAPIDS_VERSION_MAJOR_MINOR="$(rapids-version-major-minor)"
- export RAPIDS_VERSION_NUMBER="$RAPIDS_VERSION_MAJOR_MINOR"
export RAPIDS_DOCS_DIR="$(mktemp -d)"

for PROJECT in libcugraphops libwholegraph; do
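The `RAPIDS_CUDA_VERSION` branch in this file now selects a DGL channel label that encodes both the CUDA version and the PyTorch series (`th23`). A minimal sketch of the same mapping as a function follows; `dgl_channel_for_cuda` is a hypothetical name used for illustration and does not appear in the repository.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper around the if/else in ci/build_docs.sh.
# Prints the DGL channel label for a given RAPIDS_CUDA_VERSION value.
dgl_channel_for_cuda() {
  local cuda_version="$1"
  if [[ "${cuda_version}" == "11.8.0" ]]; then
    echo "dglteam/label/th23_cu118"
  else
    # Mirrors the script: every other CUDA version uses the cu121 build.
    echo "dglteam/label/th23_cu121"
  fi
}

dgl_channel_for_cuda "11.8.0"
dgl_channel_for_cuda "12.5.1"
```

Keeping the label in one place like this would mean a future PyTorch bump touches a single string per CUDA version, though the commit itself keeps the inline `if`/`else`.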
3 changes: 1 addition & 2 deletions ci/build_python.sh
@@ -61,7 +61,6 @@ if [[ ${RAPIDS_CUDA_MAJOR} == "11" ]]; then
--no-test \
--channel "${CPP_CHANNEL}" \
--channel "${RAPIDS_CONDA_BLD_OUTPUT_DIR}" \
- --channel pyg \
--channel pytorch \
--channel pytorch-nightly \
conda/recipes/cugraph-pyg
@@ -71,7 +70,7 @@ if [[ ${RAPIDS_CUDA_MAJOR} == "11" ]]; then
--no-test \
--channel "${CPP_CHANNEL}" \
--channel "${RAPIDS_CONDA_BLD_OUTPUT_DIR}" \
- --channel dglteam \
+ --channel dglteam/label/th23_cu118 \
--channel pytorch \
--channel pytorch-nightly \
conda/recipes/cugraph-dgl
6 changes: 5 additions & 1 deletion ci/test_cpp.sh
@@ -8,6 +8,8 @@ cd "$(dirname "$(realpath "${BASH_SOURCE[0]}")")"/../

. /opt/conda/etc/profile.d/conda.sh

+ RAPIDS_VERSION_MAJOR_MINOR="$(rapids-version-major-minor)"

rapids-logger "Generate C++ testing dependencies"
rapids-dependency-file-generator \
--output conda \
@@ -30,7 +32,9 @@ rapids-print-env

rapids-mamba-retry install \
--channel "${CPP_CHANNEL}" \
- libcugraph libcugraph_etl libcugraph-tests
+ "libcugraph=${RAPIDS_VERSION_MAJOR_MINOR}.*" \
+ "libcugraph_etl=${RAPIDS_VERSION_MAJOR_MINOR}.*" \
+ "libcugraph-tests=${RAPIDS_VERSION_MAJOR_MINOR}.*"

rapids-logger "Check GPU usage"
nvidia-smi
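The recurring change in these CI scripts is to replace bare package names with `name=MAJOR.MINOR.*` specs so that only artifacts from the current RAPIDS release series can be selected. The pattern can be sketched as below; `pin_to_release` is a hypothetical helper, not something the commit adds, and the literal `24.10` stands in for the value the scripts read from `rapids-version-major-minor`.

```shell
#!/usr/bin/env bash
# Hypothetical helper: turn package names into "name=MAJOR.MINOR.*" specs,
# one per line, matching the form used in the rapids-mamba-retry calls.
pin_to_release() {
  local major_minor="$1"
  shift
  local pkg
  for pkg in "$@"; do
    printf '%s=%s.*\n' "${pkg}" "${major_minor}"
  done
}

# Example with a literal version; the CI scripts obtain it from
# rapids-version-major-minor instead.
pin_to_release "24.10" libcugraph libcugraph_etl libcugraph-tests
```

The commit writes each spec out by hand, which keeps the install commands easy to read in the diff at the cost of some repetition.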
6 changes: 5 additions & 1 deletion ci/test_notebooks.sh
@@ -5,6 +5,8 @@ set -Eeuo pipefail

. /opt/conda/etc/profile.d/conda.sh

+ RAPIDS_VERSION_MAJOR_MINOR="$(rapids-version-major-minor)"

rapids-logger "Generate notebook testing dependencies"
rapids-dependency-file-generator \
--output conda \
@@ -27,7 +29,9 @@ PYTHON_CHANNEL=$(rapids-download-conda-from-s3 python)
rapids-mamba-retry install \
--channel "${CPP_CHANNEL}" \
--channel "${PYTHON_CHANNEL}" \
- libcugraph pylibcugraph cugraph
+ "libcugraph=${RAPIDS_VERSION_MAJOR_MINOR}.*" \
+ "pylibcugraph=${RAPIDS_VERSION_MAJOR_MINOR}.*" \
+ "cugraph=${RAPIDS_VERSION_MAJOR_MINOR}.*"

NBTEST="$(realpath "$(dirname "$0")/utils/nbtest.sh")"
NOTEBOOK_LIST="$(realpath "$(dirname "$0")/notebook_list.py")"
39 changes: 17 additions & 22 deletions ci/test_python.sh
@@ -8,6 +8,8 @@ cd "$(dirname "$(realpath "${BASH_SOURCE[0]}")")"/../

. /opt/conda/etc/profile.d/conda.sh

+ RAPIDS_VERSION_MAJOR_MINOR="$(rapids-version-major-minor)"

rapids-logger "Generate Python testing dependencies"
rapids-dependency-file-generator \
--output conda \
@@ -34,12 +36,12 @@ rapids-print-env
rapids-mamba-retry install \
--channel "${CPP_CHANNEL}" \
--channel "${PYTHON_CHANNEL}" \
- libcugraph \
- pylibcugraph \
- cugraph \
- nx-cugraph \
- cugraph-service-server \
- cugraph-service-client
+ "libcugraph=${RAPIDS_VERSION_MAJOR_MINOR}.*" \
+ "pylibcugraph=${RAPIDS_VERSION_MAJOR_MINOR}.*" \
+ "cugraph=${RAPIDS_VERSION_MAJOR_MINOR}.*" \
+ "nx-cugraph=${RAPIDS_VERSION_MAJOR_MINOR}.*" \
+ "cugraph-service-server=${RAPIDS_VERSION_MAJOR_MINOR}.*" \
+ "cugraph-service-client=${RAPIDS_VERSION_MAJOR_MINOR}.*"

rapids-logger "Check GPU usage"
nvidia-smi
@@ -151,14 +153,13 @@ if [[ "${RAPIDS_CUDA_VERSION}" == "11.8.0" ]]; then
--channel "${CPP_CHANNEL}" \
--channel "${PYTHON_CHANNEL}" \
--channel conda-forge \
- --channel dglteam/label/cu118 \
+ --channel dglteam/label/th23_cu118 \
--channel nvidia \
- libcugraph \
- pylibcugraph \
- pylibcugraphops \
- cugraph \
- cugraph-dgl \
- 'dgl>=1.1.0.cu*,<=2.0.0.cu*' \
+ "libcugraph=${RAPIDS_VERSION_MAJOR_MINOR}.*" \
+ "pylibcugraph=${RAPIDS_VERSION_MAJOR_MINOR}.*" \
+ "pylibcugraphops=${RAPIDS_VERSION_MAJOR_MINOR}.*" \
+ "cugraph=${RAPIDS_VERSION_MAJOR_MINOR}.*" \
+ "cugraph-dgl=${RAPIDS_VERSION_MAJOR_MINOR}.*" \
'pytorch>=2.3,<2.4' \
'cuda-version=11.8'

@@ -208,16 +209,10 @@ if [[ "${RAPIDS_CUDA_VERSION}" == "11.8.0" ]]; then
rapids-mamba-retry install \
--channel "${CPP_CHANNEL}" \
--channel "${PYTHON_CHANNEL}" \
- --channel pyg \
- "cugraph-pyg" \
+ "cugraph-pyg=${RAPIDS_VERSION_MAJOR_MINOR}.*" \
+ "pytorch>=2.3,<2.4" \
"ogb"

- pip install \
- pyg_lib \
- torch_scatter \
- torch_sparse \
- -f ${PYG_URL}

rapids-print-env

rapids-logger "pytest cugraph_pyg (single GPU)"
@@ -253,7 +248,7 @@ if [[ "${RAPIDS_CUDA_VERSION}" == "11.8.0" ]]; then
--channel "${PYTHON_CHANNEL}" \
--channel conda-forge \
--channel nvidia \
- cugraph-equivariant
+ "cugraph-equivariant=${RAPIDS_VERSION_MAJOR_MINOR}.*"
pip install e3nn==0.5.1

rapids-print-env
4 changes: 2 additions & 2 deletions ci/test_wheel_cugraph-dgl.sh
@@ -30,10 +30,10 @@ else
PYTORCH_CUDA_VER=$PKG_CUDA_VER
fi
PYTORCH_URL="https://download.pytorch.org/whl/cu${PYTORCH_CUDA_VER}"
- DGL_URL="https://data.dgl.ai/wheels/cu${PYTORCH_CUDA_VER}/repo.html"
+ DGL_URL="https://data.dgl.ai/wheels/torch-2.3/cu${PYTORCH_CUDA_VER}/repo.html"

rapids-logger "Installing PyTorch and DGL"
rapids-retry python -m pip install torch==2.3.0 --index-url ${PYTORCH_URL}
- rapids-retry python -m pip install dgl==2.0.0 --find-links ${DGL_URL}
+ rapids-retry python -m pip install dgl==2.4.0 --find-links ${DGL_URL}

python -m pytest python/cugraph-dgl/tests
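The `DGL_URL` change in this file adds a PyTorch series segment (`torch-2.3`) to the DGL wheel index path, so the find-links page matches the pinned `torch==2.3.0`. A small sketch of how that URL is composed follows; `dgl_wheel_url` is a hypothetical helper for illustration, and the only URL layout it assumes is the one visible in the diff.

```shell
#!/usr/bin/env bash
# Hypothetical helper: build the DGL find-links URL used in the wheel test.
dgl_wheel_url() {
  local torch_series="$1"   # PyTorch major.minor, e.g. 2.3
  local cuda_tag="$2"       # CUDA tag without the "cu" prefix, e.g. 118 or 121
  echo "https://data.dgl.ai/wheels/torch-${torch_series}/cu${cuda_tag}/repo.html"
}

dgl_wheel_url 2.3 118
```

Deriving both the `torch` pin and the URL segment from one variable would keep the two from drifting apart; the script as committed hard-codes `2.3` in each place.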
3 changes: 1 addition & 2 deletions conda/environments/all_cuda-118_arch-x86_64.yaml
@@ -4,8 +4,7 @@ channels:
- rapidsai
- rapidsai-nightly
- dask/label/dev
- - pyg
- - dglteam/label/cu118
+ - dglteam/label/th23_cu118
- conda-forge
- nvidia
dependencies:
3 changes: 1 addition & 2 deletions conda/environments/all_cuda-125_arch-x86_64.yaml
@@ -4,8 +4,7 @@ channels:
- rapidsai
- rapidsai-nightly
- dask/label/dev
- - pyg
- - dglteam/label/cu118
+ - dglteam/label/th23_cu118
- conda-forge
- nvidia
dependencies:
2 changes: 1 addition & 1 deletion conda/recipes/cugraph-dgl/meta.yaml
@@ -25,7 +25,7 @@ requirements:
- setuptools>=61.0.0
run:
- cugraph ={{ version }}
- - dgl >=1.1.0.cu*
+ - dgl >=2.4.0.th23.cu*
- numba >=0.57
- numpy >=1.23,<3.0a0
- pylibcugraphops ={{ minor_version }}
2 changes: 1 addition & 1 deletion conda/recipes/cugraph-pyg/meta.yaml
@@ -36,7 +36,7 @@ requirements:
- cugraph ={{ version }}
- pylibcugraphops ={{ minor_version }}
- tensordict >=0.1.2
- - pyg >=2.5,<2.6
+ - pytorch_geometric >=2.5,<2.6

tests:
imports:
7 changes: 3 additions & 4 deletions dependencies.yaml
@@ -323,8 +323,7 @@ channels:
- rapidsai
- rapidsai-nightly
- dask/label/dev
- - pyg
- - dglteam/label/cu118
+ - dglteam/label/th23_cu118
- conda-forge
- nvidia
dependencies:
@@ -700,7 +699,7 @@ dependencies:
- &pytorch_conda pytorch>=2.3,<2.4.0a0
- pytorch-cuda==11.8
- &tensordict tensordict>=0.1.2
- - dgl>=1.1.0.cu*
+ - dgl>=2.4.0.cu*
cugraph_pyg_dev:
common:
- output_types: [conda]
@@ -709,7 +708,7 @@ dependencies:
- *pytorch_conda
- pytorch-cuda==11.8
- *tensordict
- - pyg>=2.5,<2.6
+ - pytorch_geometric>=2.5,<2.6

depends_on_pytorch:
common:
9 changes: 6 additions & 3 deletions docs/cugraph/source/graph_support/DGL_support.md
@@ -8,9 +8,12 @@

Install and update cugraph-dgl and the required dependencies using the command:

```
conda install mamba -n base -c conda-forge
mamba install cugraph-dgl -c rapidsai-nightly -c rapidsai -c pytorch -c conda-forge -c nvidia -c dglteam
```shell
# CUDA 11
conda install -c rapidsai -c pytorch -c conda-forge -c nvidia -c dglteam/label/th23_cu118 cugraph-dgl

# CUDA 12
conda install -c rapidsai -c pytorch -c conda-forge -c nvidia -c dglteam/label/th23_cu121 cugraph-dgl
```

## Build from Source
3 changes: 2 additions & 1 deletion docs/cugraph/source/wholegraph/installation/container.md
@@ -24,6 +24,7 @@ RUN pip3 install Cython setuputils3 scikit-build nanobind pytest-forked pytest
To run GNN applications, you may also need cuGraphOps, DGL and/or PyG libraries to run the GNN layers.
You may refer to [DGL](https://www.dgl.ai/pages/start.html) or [PyG](https://pytorch-geometric.readthedocs.io/en/latest/notes/installation.html)
For example, to install DGL, you may need to add:

```dockerfile
- RUN pip3 install dgl -f https://data.dgl.ai/wheels/cu118/repo.html
+ RUN pip3 install dgl -f https://data.dgl.ai/wheels/torch-2.3/cu118/repo.html
```
9 changes: 6 additions & 3 deletions python/cugraph-dgl/README.md
@@ -8,9 +8,12 @@

Install and update cugraph-dgl and the required dependencies using the command:

```
conda install mamba -n base -c conda-forge
mamba install cugraph-dgl -c rapidsai-nightly -c rapidsai -c pytorch -c conda-forge -c nvidia -c dglteam
```shell
# CUDA 11
conda install -c rapidsai -c pytorch -c conda-forge -c nvidia -c dglteam/label/th23_cu118 cugraph-dgl

# CUDA 12
conda install -c rapidsai -c pytorch -c conda-forge -c nvidia -c dglteam/label/th23_cu121 cugraph-dgl
```

## Build from Source
5 changes: 2 additions & 3 deletions python/cugraph-dgl/conda/cugraph_dgl_dev_cuda-118.yaml
@@ -4,13 +4,12 @@ channels:
- rapidsai
- rapidsai-nightly
- dask/label/dev
- - pyg
- - dglteam/label/cu118
+ - dglteam/label/th23_cu118
- conda-forge
- nvidia
dependencies:
- cugraph==24.10.*,>=0.0.0a0
- - dgl>=1.1.0.cu*
+ - dgl>=2.4.0.cu*
- pandas
- pre-commit
- pylibcugraphops==24.10.*,>=0.0.0a0
5 changes: 2 additions & 3 deletions python/cugraph-pyg/conda/cugraph_pyg_dev_cuda-118.yaml
@@ -4,22 +4,21 @@ channels:
- rapidsai
- rapidsai-nightly
- dask/label/dev
- - pyg
- - dglteam/label/cu118
+ - dglteam/label/th23_cu118
- conda-forge
- nvidia
dependencies:
- cugraph==24.10.*,>=0.0.0a0
- pandas
- pre-commit
- - pyg>=2.5,<2.6
- pylibcugraphops==24.10.*,>=0.0.0a0
- pytest
- pytest-benchmark
- pytest-cov
- pytest-xdist
- pytorch-cuda==11.8
- pytorch>=2.3,<2.4.0a0
+ - pytorch_geometric>=2.5,<2.6
- scipy
- tensordict>=0.1.2
name: cugraph_pyg_dev_cuda-118
