Merged
149 changes: 110 additions & 39 deletions gpu/README.md
@@ -10,13 +10,13 @@ versions.
There are three types of hermetic toolkit configurations:

1) Recommended: [Repository rules use redistributions loaded from NVIDIA repositories](#supported-hermetic-cuda-cudnn-nvshmem-versions).

For full CUDA toolkit hermeticity, use CUDA User Mode Driver libraries loaded from NVIDIA repositories
by setting `--@cuda_driver//:include_cuda_umd_libs=true` (see [instructions](#configure-hermetic-cuda-user-mode-driver)).


2) [Repository rules use redistributions loaded from custom remote locations or
local files](#2-custom-cudacudnnnvshmem-archives-and-nccl-wheels).
local files](#2-custom-cudacudnnnvshmem-archives-and-nccl-wheels).

This option is recommended for testing custom/unreleased redistributions, or
redistributions previously loaded locally.
@@ -141,12 +141,12 @@ is specified in [third_party/gpus/cuda/hermetic/cuda_redist_versions.bzl](https:
build:cuda --repo_env TF_NEED_CUDA=1
build:cuda --@rules_ml_toolchain//common:enable_cuda
```

To use the Clang compiler for CUDA targets, set
`--@local_config_cuda//:cuda_compiler=clang`. For the NVCC compiler, set
`--@local_config_cuda//:cuda_compiler=nvcc` and the `TF_NVCC_CLANG` environment
variable.

```
build:build_cuda_with_clang --@local_config_cuda//:cuda_compiler=clang

@@ -222,12 +222,12 @@ UMD version should be compatible with KMD and CUDA Runtime versions.


- Supported Kernel Mode Driver and User Mode Driver version combinations:

Driver versions combination | Is supported
-------- | --------
-------- | --------
KMD > UMD | -
KMD <= UMD | +

- UMD and CUDA Runtime versions compatibility is described in
[NVIDIA documentation](https://docs.nvidia.com/cuda/cuda-toolkit-release-notes/index.html#id6).
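
The KMD/UMD rule in the table above can be sketched as a small check. This is an illustrative Python helper, not part of `rules_ml_toolchain`: the function names and the dotted version-string format are assumptions, and the version numbers are made up for the example.

```python
def parse_version(version: str) -> tuple:
    """Turn a dotted version string such as "570.86.15" into a tuple of ints."""
    return tuple(int(part) for part in version.split("."))


def is_driver_combination_supported(kmd_version: str, umd_version: str) -> bool:
    """Apply the table above: a combination is supported only when KMD <= UMD."""
    return parse_version(kmd_version) <= parse_version(umd_version)


if __name__ == "__main__":
    # Hypothetical version numbers, used only to exercise the comparison.
    print(is_driver_combination_supported("550.54.14", "570.86.15"))
    print(is_driver_combination_supported("570.86.15", "550.54.14"))
```

The check covers only the KMD/UMD row of the compatibility story; UMD and CUDA Runtime compatibility still has to be looked up in the NVIDIA documentation linked above.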

@@ -258,11 +258,11 @@ UMD version should be compatible with KMD and CUDA Runtime versions.
```

2. To select a specific version of hermetic NCCL, set the
`HERMETIC_NCCL_VERFSION` environment variable. Use only supported versions.
`HERMETIC_NCCL_VERSION` environment variable. Use only supported versions.
You may set the environment
variables directly in your shell or in the `.bazelrc` file as shown below:
```
build:cuda --repo_env=HERMETIC_NCCL_VERFSION="2.27.7"
build:cuda --repo_env=HERMETIC_NCCL_VERSION="2.27.7"
```

3. To select a specific version of hermetic NVSHMEM, set the
@@ -329,19 +329,23 @@ The JSON files contain paths to individual redistributions for different OS
architectures.

1. Create `cuda_redist.json` and/or `cudnn_redist.json` and/or
`nvshmem_redist.json` files.
`nvshmem_redist.json` files.

`cuda_redist.json` should follow the format below:

```
{
"cuda_cccl": {
"linux-x86_64": {
"relative_path": "cuda_cccl-linux-x86_64-12.4.99-archive.tar.xz",
},
"linux-sbsa": {
"relative_path": "cuda_cccl-linux-sbsa-12.4.99-archive.tar.xz",
}
"linux-x86_64": {
"full_path": "https://github.com/NVIDIA/cccl/archive/0d328e06c9fc78a216ec70df4917f7230a9c77e3.tar.gz",
"sha256": "c45dddfcebfc2d719e0c4cc6a874a4b50a751b90daba139699d3fc11708cf0ef",
"strip_prefix": "cccl-0d328e06c9fc78a216ec70df4917f7230a9c77e3"
},
"linux-sbsa": {
"full_path": "https://github.com/NVIDIA/cccl/archive/0d328e06c9fc78a216ec70df4917f7230a9c77e3.tar.gz",
"sha256": "c45dddfcebfc2d719e0c4cc6a874a4b50a751b90daba139699d3fc11708cf0ef",
"strip_prefix": "cccl-0d328e06c9fc78a216ec70df4917f7230a9c77e3"
}
}
}
```
@@ -384,8 +388,10 @@ architectures.
}
```

The `relative_path` field can be replaced with `full_path` for the full URLs
and absolute local paths starting with `file:///`.
Note that `sha256` and `strip_prefix` are optional.

`full_path` should be used for the full URLs and absolute local paths
starting with `file:///`.
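
One way to avoid hand-editing mistakes is to generate the JSON file with a script. The sketch below is a hypothetical Python helper that is not part of `rules_ml_toolchain`: `sha256_of_file`, `make_entry` and `build_redist_json` are made-up names, and the key names simply mirror the example above. It leaves out `sha256` and `strip_prefix` when they are not supplied, and `json.dumps` guarantees strictly valid JSON output.

```python
import hashlib
import json


def sha256_of_file(path):
    """Return the hex SHA-256 digest of a local archive, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as archive:
        for chunk in iter(lambda: archive.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def make_entry(full_path, sha256=None, strip_prefix=None):
    """Build one per-architecture entry; optional keys are omitted when unset."""
    entry = {"full_path": full_path}
    if sha256 is not None:
        entry["sha256"] = sha256
    if strip_prefix is not None:
        entry["strip_prefix"] = strip_prefix
    return entry


def build_redist_json(redistributions):
    """Serialize the redistribution dictionary as strictly valid JSON."""
    return json.dumps(redistributions, indent=4)


if __name__ == "__main__":
    # Values copied from the cuda_cccl example above.
    commit = "0d328e06c9fc78a216ec70df4917f7230a9c77e3"
    entry = make_entry(
        full_path="https://github.com/NVIDIA/cccl/archive/%s.tar.gz" % commit,
        sha256="c45dddfcebfc2d719e0c4cc6a874a4b50a751b90daba139699d3fc11708cf0ef",
        strip_prefix="cccl-%s" % commit,
    )
    redistributions = {"cuda_cccl": {"linux-x86_64": entry, "linux-sbsa": entry}}
    print(build_redist_json(redistributions))
```

For a local archive, `sha256_of_file` can supply the digest and `full_path` would be the `file:///` URL of the same file.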

2. In the downstream project dependent on `rules_ml_toolchain`, update the
hermetic CUDA JSON repository call in the `WORKSPACE` file. Both web links and
@@ -449,12 +455,16 @@ dependencies in Google ML projects.
```
_CUSTOM_CUDA_REDISTRIBUTIONS = {
"cuda_cccl": {
"linux-x86_64": {
"relative_path": "cuda_cccl-linux-x86_64-12.4.99-archive.tar.xz",
},
"linux-sbsa": {
"relative_path": "cuda_cccl-linux-sbsa-12.4.99-archive.tar.xz",
}
"linux-x86_64": {
"full_path": "https://github.com/NVIDIA/cccl/archive/0d328e06c9fc78a216ec70df4917f7230a9c77e3.tar.gz",
"sha256": "c45dddfcebfc2d719e0c4cc6a874a4b50a751b90daba139699d3fc11708cf0ef",
"strip_prefix": "cccl-0d328e06c9fc78a216ec70df4917f7230a9c77e3",
},
"linux-sbsa": {
"full_path": "https://github.com/NVIDIA/cccl/archive/0d328e06c9fc78a216ec70df4917f7230a9c77e3.tar.gz",
"sha256": "c45dddfcebfc2d719e0c4cc6a874a4b50a751b90daba139699d3fc11708cf0ef",
"strip_prefix": "cccl-0d328e06c9fc78a216ec70df4917f7230a9c77e3",
},
},
}
```
@@ -497,14 +507,27 @@ dependencies in Google ML projects.
}
```

The `relative_path` field can be replaced with `full_path` for the full URLs
and absolute local paths starting with `file:///`.
Note that `sha256` and `strip_prefix` are optional.

`full_path` should be used for the full URLs and absolute local paths
starting with `file:///`.

2. In the same `WORKSPACE` file, pass the created dictionaries to the repository
rule. If the dictionaries contain relative paths to distributions, the path
rule.

If the dictionaries contain relative paths to distributions, the path
prefix should be updated in `cuda_redist_init_repositories()`,
`cudnn_redist_init_repository()` and `nvshmem_redist_init_repository()`
calls.

BUILD templates can be customized when the custom redistributions have a
different folder structure than the default ones.
Note that `source_dirs` is mandatory; it is used for the scenarios described
[here](https://github.com/google-ml-infra/rules_ml_toolchain/blob/main/gpu/README.md#3-local-toolkit-installations-used-as-sources-for-hermetic-repositories).

If the templates for those scenarios differ, provide them in
`version_to_template` under the `local` key.
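
Before wiring a custom template dictionary into the repository rule, it can help to sanity-check its shape. The following is a minimal Python sketch, not part of `rules_ml_toolchain`: `check_build_templates` is a hypothetical helper, and it assumes that `source_dirs` lives under the `local` key and that `version_to_template` is present at the top level, as in the `_CCCL_BUILD_TEMPLATES` example in this section.

```python
def check_build_templates(templates):
    """Return a list of problems found in a build-templates dictionary.

    Assumed layout (taken from the README example, not from the rule's code):
    {redist_name: {"version_to_template": {...}, "local": {"source_dirs": [...]}}}
    """
    problems = []
    for name, spec in templates.items():
        if "version_to_template" not in spec:
            problems.append("%s: missing 'version_to_template'" % name)
        # The README states that source_dirs is mandatory.
        local = spec.get("local", {})
        if not local.get("source_dirs"):
            problems.append("%s: 'local' must define 'source_dirs'" % name)
    return problems


if __name__ == "__main__":
    templates = {
        "cuda_cccl": {
            "repo_name": "cuda_cccl",
            "version_to_template": {
                "12": "@rules_ml_toolchain//third_party/gpus/cuda/hermetic:cuda_cccl_github.BUILD.tpl",
            },
            "local": {
                "source_dirs": ["include", "lib"],
                "version_to_template": {
                    "12": "@rules_ml_toolchain//third_party/gpus/cuda/hermetic:cuda_cccl.BUILD.tpl",
                },
            },
        },
    }
    print(check_build_templates(templates))
```

An empty list means the dictionary matches the assumed layout; it does not prove that the referenced template labels exist.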

```
register_toolchains("@rules_ml_toolchain//cc:linux_x86_64_linux_x86_64_cuda")
register_toolchains("@rules_ml_toolchain//cc:linux_aarch64_linux_aarch64_cuda")
@@ -520,9 +543,30 @@ dependencies in Google ML projects.
"cuda_redist_init_repositories",
"cudnn_redist_init_repository",
)

_CCCL_BUILD_TEMPLATES = {
"cuda_cccl": {
"repo_name": "cuda_cccl",
"version_to_template": {
"13": "@rules_ml_toolchain//third_party/gpus/cuda/hermetic:cuda_cccl_github.BUILD.tpl",
"12": "@rules_ml_toolchain//third_party/gpus/cuda/hermetic:cuda_cccl_github.BUILD.tpl",
"11": "@rules_ml_toolchain//third_party/gpus/cuda/hermetic:cuda_cccl_github.BUILD.tpl",
},
"local": {
"source_dirs": ["include", "lib"],
"version_to_template": {
"13": "@rules_ml_toolchain//third_party/gpus/cuda/hermetic:cuda_cccl.BUILD.tpl",
"12": "@rules_ml_toolchain//third_party/gpus/cuda/hermetic:cuda_cccl.BUILD.tpl",
"11": "@rules_ml_toolchain//third_party/gpus/cuda/hermetic:cuda_cccl.BUILD.tpl",
},
},
},
}

cuda_redist_init_repositories(
cuda_redistributions = _CUSTOM_CUDA_REDISTRIBUTIONS,
cuda_redist_path_prefix = "file:///home/usr/Downloads/dists/",
redist_versions_to_build_templates = _CCCL_BUILD_TEMPLATES,
)
cudnn_redist_init_repository(
cudnn_redistributions = _CUSTOM_CUDNN_REDISTRIBUTIONS,
@@ -590,13 +634,17 @@ _CUDNN_JSON_DICT = {

_CUDA_DIST_DICT = {
"cuda_cccl": {
"linux-x86_64": {
"relative_path": "cuda_cccl-linux-x86_64-12.4.99-archive.tar.xz",
},
"linux-sbsa": {
"relative_path": "cuda_cccl-linux-sbsa-12.4.99-archive.tar.xz",
},
},
"linux-x86_64": {
"full_path": "https://github.com/NVIDIA/cccl/archive/0d328e06c9fc78a216ec70df4917f7230a9c77e3.tar.gz",
"sha256": "c45dddfcebfc2d719e0c4cc6a874a4b50a751b90daba139699d3fc11708cf0ef",
"strip_prefix": "cccl-0d328e06c9fc78a216ec70df4917f7230a9c77e3",
},
"linux-sbsa": {
"full_path": "https://github.com/NVIDIA/cccl/archive/0d328e06c9fc78a216ec70df4917f7230a9c77e3.tar.gz",
"sha256": "c45dddfcebfc2d719e0c4cc6a874a4b50a751b90daba139699d3fc11708cf0ef",
"strip_prefix": "cccl-0d328e06c9fc78a216ec70df4917f7230a9c77e3",
},
},
"libcusolver": {
"linux-x86_64": {
"full_path": "file:///usr/Downloads/dists/libcusolver-linux-x86_64-11.6.0.99-archive.tar.xz",
@@ -607,6 +655,25 @@ _CUDA_DIST_DICT = {
},
}

_CCCL_BUILD_TEMPLATES = {
"cuda_cccl": {
"repo_name": "cuda_cccl",
"version_to_template": {
"13": "@rules_ml_toolchain//third_party/gpus/cuda/hermetic:cuda_cccl_github.BUILD.tpl",
"12": "@rules_ml_toolchain//third_party/gpus/cuda/hermetic:cuda_cccl_github.BUILD.tpl",
"11": "@rules_ml_toolchain//third_party/gpus/cuda/hermetic:cuda_cccl_github.BUILD.tpl",
},
"local": {
"source_dirs": ["include", "lib"],
"version_to_template": {
"13": "@rules_ml_toolchain//third_party/gpus/cuda/hermetic:cuda_cccl.BUILD.tpl",
"12": "@rules_ml_toolchain//third_party/gpus/cuda/hermetic:cuda_cccl.BUILD.tpl",
"11": "@rules_ml_toolchain//third_party/gpus/cuda/hermetic:cuda_cccl.BUILD.tpl",
},
},
},
}

_CUDNN_DIST_DICT = {
"cudnn": {
"linux-x86_64": {
@@ -655,9 +722,14 @@ load(
"cuda_redist_init_repositories",
"cudnn_redist_init_repository",
)
cudnn_redist_init_repositories(
load(
"@rules_ml_toolchain//third_party/gpus/cuda/hermetic:cuda_redist_versions.bzl",
"REDIST_VERSIONS_TO_BUILD_TEMPLATES",
)
cuda_redist_init_repositories(
cuda_redistributions = CUDA_REDISTRIBUTIONS | _CUDA_DIST_DICT,
cuda_redist_path_prefix = "file:///usr/Downloads/dists/",
redist_versions_to_build_templates = REDIST_VERSIONS_TO_BUILD_TEMPLATES | _CCCL_BUILD_TEMPLATES,
)
cudnn_redist_init_repository(
cudnn_redistributions = CUDNN_REDISTRIBUTIONS | _CUDNN_DIST_DICT,
@@ -748,5 +820,4 @@ The structure of the folders inside NVSHMEM dir should be the following:
include/
lib/
bin/
```

```
10 changes: 6 additions & 4 deletions gpu/cuda/cuda_redist_init_repositories.bzl
@@ -20,7 +20,6 @@ load(
"cuda_redist_init_repositories_wrapper",
"cudnn_redist_init_repository_wrapper",
)

load(
"//third_party/gpus/cuda/hermetic:cuda_redist_versions.bzl",
"CUDA_REDIST_PATH_PREFIX",
@@ -39,14 +38,17 @@ def cudnn_redist_init_repository(
cudnn_redistributions,
cudnn_redist_path_prefix,
mirrored_tar_cudnn_redist_path_prefix,
redist_versions_to_build_templates)
redist_versions_to_build_templates,
)

def cuda_redist_init_repositories(
cuda_redistributions,
cuda_redist_path_prefix = CUDA_REDIST_PATH_PREFIX,
mirrored_tar_cuda_redist_path_prefix = MIRRORED_TAR_CUDA_REDIST_PATH_PREFIX,
redist_versions_to_build_templates = REDIST_VERSIONS_TO_BUILD_TEMPLATES):
cuda_redist_init_repositories_wrapper(cuda_redistributions,
cuda_redist_init_repositories_wrapper(
cuda_redistributions,
cuda_redist_path_prefix,
mirrored_tar_cuda_redist_path_prefix,
redist_versions_to_build_templates)
redist_versions_to_build_templates,
)
5 changes: 4 additions & 1 deletion gpu/nccl/nccl_redist_init_repository.bzl
@@ -28,4 +28,7 @@ load(
def nccl_redist_init_repository(
cuda_nccl_wheels = CUDA_NCCL_WHEELS,
redist_versions_to_build_templates = REDIST_VERSIONS_TO_BUILD_TEMPLATES):
nccl_redist_init_repository_wrapper(cuda_nccl_wheels, redist_versions_to_build_templates)
nccl_redist_init_repository_wrapper(
cuda_nccl_wheels,
redist_versions_to_build_templates,
)
1 change: 1 addition & 0 deletions third_party/extensions/cuda_redist_init.bzl
@@ -33,6 +33,7 @@ def _cuda_redist_init_ext_impl(mctx):
cudnn_redistributions = CUDNN_REDISTRIBUTIONS,
)

# TODO(ybaturina): add missing features from workspace mode
cuda_redist_init_ext = module_extension(
implementation = _cuda_redist_init_ext_impl,
)
82 changes: 82 additions & 0 deletions third_party/gpus/cuda/hermetic/cuda_cccl_github.BUILD.tpl
@@ -0,0 +1,82 @@
licenses(["restricted"]) # NVIDIA proprietary license

filegroup(
name = "header_list",
srcs = [":thrust_header_list", ":nv_header_list", ":cuda_header_list", ":cub_header_list"],
visibility = ["@local_config_cuda//cuda:__pkg__"],
)

cc_library(
name = "headers",
deps = [":thrust_headers", ":nv_headers", ":cuda_headers", ":cub_headers"],
visibility = ["@local_config_cuda//cuda:__pkg__"],
)

filegroup(
name = "thrust_header_list",
srcs = glob([
%{comment}"thrust/thrust/**",
]),
visibility = ["@local_config_cuda//cuda:__pkg__"],
)

cc_library(
name = "thrust_headers",
hdrs = [":thrust_header_list"],
include_prefix = "third_party/gpus/cuda/include",
includes = ["thrust"],
strip_include_prefix = "thrust",
visibility = ["@local_config_cuda//cuda:__pkg__"],
)

filegroup(
name = "cuda_header_list",
srcs = glob([
%{comment}"libcudacxx/include/cuda/**",
]),
visibility = ["@local_config_cuda//cuda:__pkg__"],
)

cc_library(
name = "cuda_headers",
hdrs = [":cuda_header_list"],
include_prefix = "third_party/gpus/cuda/include",
includes = ["libcudacxx/include"],
strip_include_prefix = "libcudacxx/include",
visibility = ["@local_config_cuda//cuda:__pkg__"],
)

filegroup(
name = "nv_header_list",
srcs = glob([
%{comment}"libcudacxx/include/nv/**",
]),
visibility = ["@local_config_cuda//cuda:__pkg__"],
)

cc_library(
name = "nv_headers",
hdrs = [":nv_header_list"],
include_prefix = "third_party/gpus/cuda/include",
includes = ["libcudacxx/include/nv"],
strip_include_prefix = "libcudacxx/include",
visibility = ["@local_config_cuda//cuda:__pkg__"],
)

filegroup(
name = "cub_header_list",
srcs = glob([
%{comment}"cub/cub/**",
]),
visibility = ["@local_config_cuda//cuda:__pkg__"],
)

cc_library(
name = "cub_headers",
hdrs = [":cub_header_list"],
include_prefix = "third_party/gpus/cuda/include",
includes = ["cub"],
strip_include_prefix = "cub",
visibility = ["@local_config_cuda//cuda:__pkg__"],
)
