Hermetic CUDA/CUDNN/NCCL/NVSHMEM use specific downloadable redistribution versions instead of the user’s locally installed packages. Bazel will download CUDA, CUDNN, NCCL and NVSHMEM redistributions, and then use libraries and tools as dependencies in various Bazel targets. This enables more reproducible builds for Google ML projects and supported CUDA versions.
There are three types of hermetic toolkit configurations:
- Recommended: Repository rules use redistributions loaded from NVIDIA repositories.
- Repository rules use redistributions loaded from custom remote locations or local files.
  This option is recommended for testing custom/unreleased redistributions, or redistributions previously loaded locally.
- Not recommended: Repository rules use locally-installed toolkits.
The supported CUDA versions are specified in the CUDA_REDIST_JSON_DICT dictionary in third_party/gpus/cuda/hermetic/cuda_redist_versions.bzl.
The supported CUDNN versions are specified in the CUDNN_REDIST_JSON_DICT dictionary in third_party/gpus/cuda/hermetic/cuda_redist_versions.bzl.
The supported NVSHMEM versions are specified in the NVSHMEM_REDIST_JSON_DICT dictionary in third_party/gpus/cuda/hermetic/cuda_redist_versions.bzl.
The .bazelrc files of individual projects have HERMETIC_CUDA_VERSION,
HERMETIC_CUDNN_VERSION, HERMETIC_NVSHMEM_VERSION environment variables set
to the versions used by default when --config=cuda is specified in Bazel
command options.
The HERMETIC_CUDA_VERSION, HERMETIC_CUDNN_VERSION and HERMETIC_NVSHMEM_VERSION
environment variables should consist of the major, minor and
patch redistribution version, e.g. 12.8.0.
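The version format described above can be checked before a value is passed to Bazel. The following is a minimal sketch; the helper `is_full_redist_version` is hypothetical and is not part of any repository rule.

```python
import re

# Hypothetical helper, not part of rules_ml_toolchain: accepts only strings
# with exactly three numeric components, such as "12.8.0".
_FULL_VERSION = re.compile(r"\d+\.\d+\.\d+")


def is_full_redist_version(version: str) -> bool:
    """Return True if `version` looks like major.minor.patch."""
    return _FULL_VERSION.fullmatch(version) is not None


print(is_full_redist_version("12.8.0"))  # full major.minor.patch version
print(is_full_redist_version("12.8"))    # patch component is missing
```

A check like this only validates the shape of the string; whether the version is actually supported is still determined by the dictionaries in cuda_redist_versions.bzl.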
Three ways to set the environment variables for Bazel commands:
# Add an entry to your `.bazelrc` file
build:cuda --repo_env=HERMETIC_CUDA_VERSION="12.8.0"
build:cuda --repo_env=HERMETIC_CUDNN_VERSION="9.8.0"
build:cuda --repo_env=HERMETIC_NVSHMEM_VERSION="3.2.5"
# OR pass it directly to your specific build command
bazel build --config=cuda <target> \
--repo_env=HERMETIC_CUDA_VERSION="12.8.0" \
--repo_env=HERMETIC_CUDNN_VERSION="9.8.0" \
--repo_env=HERMETIC_NVSHMEM_VERSION="3.2.5"
# If .bazelrc doesn't have corresponding entries and the environment variables
# are not passed to bazel command, you can set them globally in your shell:
export HERMETIC_CUDA_VERSION="12.8.0"
export HERMETIC_CUDNN_VERSION="9.8.0"
export HERMETIC_NVSHMEM_VERSION="3.2.5"
If HERMETIC_CUDA_VERSION and HERMETIC_CUDNN_VERSION are not present, the
hermetic CUDA/CUDNN repository rules will look up the values of the
TF_CUDA_VERSION and TF_CUDNN_VERSION environment variables. This is done for
backward compatibility with the non-hermetic CUDA/CUDNN repository rules.
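The lookup order described above can be pictured as follows. This is only a sketch of the fallback logic in Python; the real repository rules are written in Starlark and may treat empty or malformed values differently. The function name `resolve_version` is hypothetical.

```python
import os
from typing import Mapping, Optional


def resolve_version(
    env: Mapping[str, str], hermetic_name: str, legacy_name: str
) -> Optional[str]:
    """Prefer the hermetic variable; fall back to the legacy TF_* variable."""
    # Assumption: an empty value is treated the same as an unset variable.
    return env.get(hermetic_name) or env.get(legacy_name)


cuda_version = resolve_version(
    os.environ, "HERMETIC_CUDA_VERSION", "TF_CUDA_VERSION"
)
print(cuda_version)
```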
The mapping between the CUDA version and the NCCL distribution version to be downloaded is specified in third_party/gpus/cuda/hermetic/cuda_redist_versions.bzl.
- In the downstream project dependent on rules_ml_toolchain, add the following lines to the WORKSPACE file:

      register_toolchains("@rules_ml_toolchain//cc:linux_x86_64_linux_x86_64_cuda")
      register_toolchains("@rules_ml_toolchain//cc:linux_aarch64_linux_aarch64_cuda")
      load(
          "@rules_ml_toolchain//gpu/cuda:cuda_json_init_repository.bzl",
          "cuda_json_init_repository",
      )
      cuda_json_init_repository()
      load(
          "@cuda_redist_json//:distributions.bzl",
          "CUDA_REDISTRIBUTIONS",
          "CUDNN_REDISTRIBUTIONS",
      )
      load(
          "@rules_ml_toolchain//gpu/cuda:cuda_redist_init_repositories.bzl",
          "cuda_redist_init_repositories",
          "cudnn_redist_init_repository",
      )
      cuda_redist_init_repositories(
          cuda_redistributions = CUDA_REDISTRIBUTIONS,
      )
      cudnn_redist_init_repository(
          cudnn_redistributions = CUDNN_REDISTRIBUTIONS,
      )
      load(
          "@rules_ml_toolchain//gpu/cuda:cuda_configure.bzl",
          "cuda_configure",
      )
      cuda_configure(name = "local_config_cuda")
      load(
          "@rules_ml_toolchain//gpu/nccl:nccl_redist_init_repository.bzl",
          "nccl_redist_init_repository",
      )
      nccl_redist_init_repository()
      load(
          "@rules_ml_toolchain//gpu/nccl:nccl_configure.bzl",
          "nccl_configure",
      )
      nccl_configure(name = "local_config_nccl")

- To enable CUDA, set the TF_NEED_CUDA environment variable and enable the flag --@rules_ml_toolchain//common:enable_cuda:

      build:cuda --repo_env TF_NEED_CUDA=1
      build:cuda --@rules_ml_toolchain//common:enable_cuda

  To use the Clang compiler for CUDA targets, set --@local_config_cuda//:cuda_compiler=clang. For the NVCC compiler, set --@local_config_cuda//:cuda_compiler=nvcc and the TF_NVCC_CLANG environment variable.

      build:build_cuda_with_clang --@local_config_cuda//:cuda_compiler=clang
      build:build_cuda_with_nvcc --action_env=TF_NVCC_CLANG="1"
      build:build_cuda_with_nvcc --@local_config_cuda//:cuda_compiler=nvcc

- To select specific versions of hermetic CUDA and CUDNN, set the HERMETIC_CUDA_VERSION and HERMETIC_CUDNN_VERSION environment variables respectively. Use only supported versions. You also need to specify the CUDA compute capabilities in HERMETIC_CUDA_COMPUTE_CAPABILITIES, which define the hardware features and supported instructions for the GPU architecture.

  You may set the environment variables directly in your shell or in the .bazelrc file as shown below:

      build:cuda --repo_env=HERMETIC_CUDA_VERSION="12.8.0"
      build:cuda --repo_env=HERMETIC_CUDNN_VERSION="9.8.0"
      build:cuda --repo_env=HERMETIC_CUDA_COMPUTE_CAPABILITIES="sm_50,sm_60,sm_70,sm_80,compute_90"

- To enable hermetic CUDA and NVSHMEM during test execution, or when running a binary via bazel, make sure to add the --@local_config_cuda//cuda:include_cuda_libs=true flag to your bazel command. It is recommended to turn this flag on in all cases except when you release a binary or a wheel. You can provide it either directly in a shell or in .bazelrc:

      build:cuda --@local_config_cuda//cuda:include_cuda_libs=true

  The flag is needed to make sure that CUDA dependencies are properly provided to test executables. The flag is false by default to avoid unwanted coupling of Google-released Python wheels to CUDA binaries.
- To enforce CUDA forward compatibility mode, add the --@cuda_driver//:enable_forward_compatibility=true flag to your bazel command. You can provide it either directly in a shell or in .bazelrc:

      test:cuda --@cuda_driver//:enable_forward_compatibility=true

  The default flag value is false.

  When CUDA forward compatibility mode is disabled, Bazel targets will use the User Mode and Kernel Mode Drivers pre-installed on the system.

  When CUDA forward compatibility mode is enabled, Bazel targets will use the User Mode Driver from the CUDA driver redistribution downloaded into the Bazel cache and the Kernel Mode Driver pre-installed on the system. This allows enabling new CUDA Toolkit features while using an older Kernel Mode Driver.

  Forward compatibility mode should be enforced only when it is appropriate; see the NVIDIA documentation for details.
- In the downstream project dependent on rules_ml_toolchain, add the following lines to the WORKSPACE file:

      load(
          "@rules_ml_toolchain//gpu/nvshmem:nvshmem_json_init_repository.bzl",
          "nvshmem_json_init_repository",
      )
      nvshmem_json_init_repository()
      load(
          "@nvshmem_redist_json//:distributions.bzl",
          "NVSHMEM_REDISTRIBUTIONS",
      )
      load(
          "@rules_ml_toolchain//gpu/nvshmem:nvshmem_redist_init_repository.bzl",
          "nvshmem_redist_init_repository",
      )
      nvshmem_redist_init_repository(
          nvshmem_redistributions = NVSHMEM_REDISTRIBUTIONS,
      )

- To select a specific version of hermetic NVSHMEM, set the HERMETIC_NVSHMEM_VERSION environment variable. Use only supported versions. You may set the environment variable directly in your shell or in the .bazelrc file as shown below:

      build:cuda --repo_env=HERMETIC_NVSHMEM_VERSION="3.2.5"
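The HERMETIC_CUDA_COMPUTE_CAPABILITIES value used in the steps above is a comma-separated list, so a typo in one entry is easy to miss. The sketch below splits such a value and rejects entries that do not look like the ones in the example. The accepted pattern (`sm_` or `compute_` followed by digits and an optional lowercase letter) is an assumption made for this illustration, not a rule taken from the repository rules, and `parse_compute_capabilities` is a hypothetical name.

```python
import re

# Assumption for this sketch: entries look like "sm_80" or "compute_90",
# optionally followed by one lowercase letter.
_CAPABILITY = re.compile(r"(sm|compute)_\d+[a-z]?")


def parse_compute_capabilities(value: str) -> list:
    """Split a comma-separated capability list and reject malformed entries."""
    capabilities = [item.strip() for item in value.split(",") if item.strip()]
    for capability in capabilities:
        if _CAPABILITY.fullmatch(capability) is None:
            raise ValueError(f"unexpected compute capability: {capability!r}")
    return capabilities


print(parse_compute_capabilities("sm_50,sm_60,sm_70,sm_80,compute_90"))
```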
- Create and submit a pull request with updated CUDA_REDIST_JSON_DICT, CUDNN_REDIST_JSON_DICT, NVSHMEM_REDIST_JSON_DICT dictionaries in third_party/gpus/cuda/hermetic/cuda_redist_versions.bzl.

  Update CUDA_NCCL_WHEELS in third_party/gpus/cuda/hermetic/cuda_redist_versions.bzl if needed.

  Update REDIST_VERSIONS_TO_BUILD_TEMPLATES in third_party/gpus/cuda/hermetic/cuda_redist_versions.bzl if needed.

  Update PTX_VERSION_DICT in third_party/gpus/cuda/hermetic/cuda_redist_versions.bzl if needed.

- For each Google ML project, create a separate pull request with updated HERMETIC_CUDA_VERSION, HERMETIC_CUDNN_VERSION, HERMETIC_NVSHMEM_VERSION in the .bazelrc file.

  The PR presubmit job executions will launch bazel tests and download hermetic CUDA/CUDNN/NVSHMEM distributions. Verify that the presubmit jobs passed before submitting the PR.
- To save time, some build/test configurations use mirrored .tar redistributions. The json file with information about the mirrored .tar redistributions is uploaded some time after CUDA_REDIST_JSON_DICT, CUDNN_REDIST_JSON_DICT, NVSHMEM_REDIST_JSON_DICT are updated. One can download these files using wget "https://storage.googleapis.com/mirror.tensorflow.org/developer.download.nvidia.com/compute/cuda/redist/redistrib_<cuda_version>_tar.json" for CUDA, wget "https://storage.googleapis.com/mirror.tensorflow.org/developer.download.nvidia.com/compute/cudnn/redist/redistrib_<cudnn_version>_tar.json" for CUDNN and wget "https://developer.download.nvidia.com/compute/nvshmem/redist/redistrib_<nvshmem_version>_tar.json" for NVSHMEM. After that, create and submit a pull request with updated MIRRORED_TARS_CUDA_REDIST_JSON_DICT, MIRRORED_TARS_CUDNN_REDIST_JSON_DICT, MIRRORED_TARS_NVSHMEM_REDIST_JSON_DICT dictionaries in third_party/gpus/cuda/hermetic/cuda_redist_versions.bzl.
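The three download URLs above differ only in the toolkit name and version, so they can be generated instead of typed by hand. A minimal sketch follows; the URL templates are copied from the wget commands above, while the helper name `mirrored_tar_json_urls` is hypothetical.

```python
# URL prefixes copied from the wget commands in the text above.
_MIRROR = (
    "https://storage.googleapis.com/mirror.tensorflow.org/"
    "developer.download.nvidia.com/compute"
)
_NVIDIA = "https://developer.download.nvidia.com/compute"


def mirrored_tar_json_urls(
    cuda_version: str, cudnn_version: str, nvshmem_version: str
) -> dict:
    """Build the JSON URLs for the mirrored .tar redistributions."""
    return {
        "cuda": f"{_MIRROR}/cuda/redist/redistrib_{cuda_version}_tar.json",
        "cudnn": f"{_MIRROR}/cudnn/redist/redistrib_{cudnn_version}_tar.json",
        "nvshmem": f"{_NVIDIA}/nvshmem/redist/redistrib_{nvshmem_version}_tar.json",
    }


for name, url in mirrored_tar_json_urls("12.8.0", "9.8.0", "3.2.5").items():
    print(name, url)
```

The sketch only formats the URLs; whether a given file has already been uploaded still has to be checked by downloading it.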
There are three options that allow usage of custom distributions.
The first option, custom redistribution JSON files, allows using custom distributions for all CUDA/CUDNN/NVSHMEM dependencies in Google ML projects.
The JSON files contain paths to individual redistributions for different OS architectures.
- Create cuda_redist.json and/or cudnn_redist.json and/or nvshmem_redist.json files.

  cuda_redist.json should follow the format below:

      {
        "cuda_cccl": {
          "linux-x86_64": {
            "relative_path": "cuda_cccl-linux-x86_64-12.4.99-archive.tar.xz"
          },
          "linux-sbsa": {
            "relative_path": "cuda_cccl-linux-sbsa-12.4.99-archive.tar.xz"
          }
        }
      }

  cudnn_redist.json should follow the format below:

      {
        "cudnn": {
          "linux-x86_64": {
            "cuda12": {
              "relative_path": "cudnn/linux-x86_64/cudnn-linux-x86_64-9.0.0.312_cuda12-archive.tar.xz"
            }
          },
          "linux-sbsa": {
            "cuda12": {
              "relative_path": "cudnn/linux-sbsa/cudnn-linux-sbsa-9.0.0.312_cuda12-archive.tar.xz"
            }
          }
        }
      }

  nvshmem_redist.json should follow the format below:

      {
        "libnvshmem": {
          "linux-x86_64": {
            "cuda12": {
              "relative_path": "libnvshmem/linux-x86_64/libnvshmem-linux-x86_64-3.2.5_cuda12-archive.tar.xz"
            }
          },
          "linux-sbsa": {
            "cuda12": {
              "relative_path": "libnvshmem/linux-sbsa/libnvshmem-linux-sbsa-3.2.5_cuda12-archive.tar.xz"
            }
          }
        }
      }

  The relative_path field can be replaced with full_path for full URLs and absolute local paths starting with file:///.

- In the downstream project dependent on rules_ml_toolchain, update the hermetic CUDA JSON repository call in the WORKSPACE file. Both web links and local file paths are allowed. Example:

      _CUDA_JSON_DICT = {
          "12.4.0": [
              "file:///home/user/Downloads/redistrib_12.4.0_updated.json",
          ],
      }
      _CUDNN_JSON_DICT = {
          "9.0.0": [
              "https://developer.download.nvidia.com/compute/cudnn/redist/redistrib_9.0.0.json",
          ],
      }
      cuda_json_init_repository(
          cuda_json_dict = _CUDA_JSON_DICT,
          cudnn_json_dict = _CUDNN_JSON_DICT,
      )
      _NVSHMEM_JSON_DICT = {
          "3.2.5": [
              "file:///home/user/Downloads/redistrib_3.2.5.json",
          ],
      }
      nvshmem_json_init_repository(
          nvshmem_json_dict = _NVSHMEM_JSON_DICT,
      )

  If the JSON files contain relative paths to distributions, the path prefix should be updated in the cuda_redist_init_repositories(), cudnn_redist_init_repository(), nvshmem_redist_init_repository() calls. Example:

      cuda_redist_init_repositories(
          cuda_redistributions = CUDA_REDISTRIBUTIONS,
          cuda_redist_path_prefix = "file:///usr/Downloads/dists/",
      )
      nvshmem_redist_init_repository(
          nvshmem_redistributions = NVSHMEM_REDISTRIBUTIONS,
          nvshmem_redist_path_prefix = "file:///usr/Downloads/dists/",
      )
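The relationship between relative_path, full_path and the path prefix described in the steps above can be sketched as follows. This assumes the prefix is simply prepended to relative_path, which is an assumption about the repository rules rather than something stated in this document. The function name `resolve_dist_url` and the sample paths are hypothetical.

```python
import json


def resolve_dist_url(entry: dict, path_prefix: str) -> str:
    """Return the download location for one platform entry."""
    # A full_path entry already holds a complete URL or file:/// path.
    if "full_path" in entry:
        return entry["full_path"]
    # Assumption: a relative_path is appended to the configured prefix.
    return path_prefix + entry["relative_path"]


custom_json = """
{
  "cuda_cccl": {
    "linux-x86_64": {
      "relative_path": "cuda_cccl-linux-x86_64-12.4.99-archive.tar.xz"
    },
    "linux-sbsa": {
      "full_path": "file:///home/user/dists/cuda_cccl-linux-sbsa-12.4.99-archive.tar.xz"
    }
  }
}
"""

redistributions = json.loads(custom_json)
for platform, entry in redistributions["cuda_cccl"].items():
    print(platform, resolve_dist_url(entry, "file:///usr/Downloads/dists/"))
```

Parsing the file with a strict JSON parser first, as done here, also catches syntax problems before Bazel reads the file.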
The second option, custom distribution dictionaries, allows using custom distributions for some CUDA/CUDNN/NVSHMEM dependencies in Google ML projects.
- In the downstream project dependent on rules_ml_toolchain, create dictionaries with distribution paths. The dictionary with CUDA distributions should follow the format below:

      _CUSTOM_CUDA_REDISTRIBUTIONS = {
          "cuda_cccl": {
              "linux-x86_64": {
                  "relative_path": "cuda_cccl-linux-x86_64-12.4.99-archive.tar.xz",
              },
              "linux-sbsa": {
                  "relative_path": "cuda_cccl-linux-sbsa-12.4.99-archive.tar.xz",
              },
          },
      }

  The dictionary with CUDNN distributions should follow the format below:

      _CUSTOM_CUDNN_REDISTRIBUTIONS = {
          "cudnn": {
              "linux-x86_64": {
                  "cuda12": {
                      "relative_path": "cudnn/linux-x86_64/cudnn-linux-x86_64-9.0.0.312_cuda12-archive.tar.xz",
                  },
              },
              "linux-sbsa": {
                  "cuda12": {
                      "relative_path": "cudnn/linux-sbsa/cudnn-linux-sbsa-9.0.0.312_cuda12-archive.tar.xz",
                  },
              },
          },
      }

  The dictionary with NVSHMEM distributions should follow the format below:

      _CUSTOM_NVSHMEM_REDISTRIBUTIONS = {
          "libnvshmem": {
              "linux-x86_64": {
                  "cuda12": {
                      "relative_path": "libnvshmem/linux-x86_64/libnvshmem-linux-x86_64-3.2.5_cuda12-archive.tar.xz",
                  },
              },
              "linux-sbsa": {
                  "cuda12": {
                      "relative_path": "libnvshmem/linux-sbsa/libnvshmem-linux-sbsa-3.2.5_cuda12-archive.tar.xz",
                  },
              },
          },
      }

  The relative_path field can be replaced with full_path for full URLs and absolute local paths starting with file:///.

- In the same WORKSPACE file, pass the created dictionaries to the repository rule. If the dictionaries contain relative paths to distributions, the path prefix should be updated in the cuda_redist_init_repositories(), cudnn_redist_init_repository() and nvshmem_redist_init_repository() calls.

      register_toolchains("@rules_ml_toolchain//cc:linux_x86_64_linux_x86_64_cuda")
      register_toolchains("@rules_ml_toolchain//cc:linux_aarch64_linux_aarch64_cuda")
      load(
          "@rules_ml_toolchain//gpu/cuda:cuda_json_init_repository.bzl",
          "cuda_json_init_repository",
      )
      cuda_json_init_repository()
      load(
          "@rules_ml_toolchain//gpu/cuda:cuda_redist_init_repositories.bzl",
          "cuda_redist_init_repositories",
          "cudnn_redist_init_repository",
      )
      cuda_redist_init_repositories(
          cuda_redistributions = _CUSTOM_CUDA_REDISTRIBUTIONS,
          cuda_redist_path_prefix = "file:///home/usr/Downloads/dists/",
      )
      cudnn_redist_init_repository(
          cudnn_redistributions = _CUSTOM_CUDNN_REDISTRIBUTIONS,
          cudnn_redist_path_prefix = "file:///home/usr/Downloads/dists/cudnn/",
      )
      load(
          "@rules_ml_toolchain//gpu/cuda:cuda_configure.bzl",
          "cuda_configure",
      )
      cuda_configure(name = "local_config_cuda")
      load(
          "@rules_ml_toolchain//gpu/nccl:nccl_redist_init_repository.bzl",
          "nccl_redist_init_repository",
      )
      nccl_redist_init_repository()
      load(
          "@rules_ml_toolchain//gpu/nccl:nccl_configure.bzl",
          "nccl_configure",
      )
      nccl_configure(name = "local_config_nccl")
      load(
          "@rules_ml_toolchain//gpu/nvshmem:nvshmem_json_init_repository.bzl",
          "nvshmem_json_init_repository",
      )
      nvshmem_json_init_repository()
      load(
          "@rules_ml_toolchain//gpu/nvshmem:nvshmem_redist_init_repository.bzl",
          "nvshmem_redist_init_repository",
      )
      nvshmem_redist_init_repository(
          nvshmem_redistributions = _CUSTOM_NVSHMEM_REDISTRIBUTIONS,
          nvshmem_redist_path_prefix = "file:///home/usr/Downloads/dists/nvshmem/",
      )
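Because the custom dictionaries above are written by hand, it can help to confirm that every archive entry carries either relative_path or full_path. The sketch below walks both layouts shown above: entries directly under the platform (as for CUDA) and entries nested one level deeper under a key such as cuda12 (as for CUDNN and NVSHMEM). Both function names are hypothetical and the check is not performed by the repository rules themselves.

```python
def iter_archive_entries(redistributions: dict):
    """Yield (name, platform, cuda_key, entry) for every archive entry."""
    for name, platforms in redistributions.items():
        for platform, value in platforms.items():
            nested = {k: v for k, v in value.items() if isinstance(v, dict)}
            if nested:
                # Layout with an extra level, e.g. {"cuda12": {...}}.
                for cuda_key, entry in nested.items():
                    yield name, platform, cuda_key, entry
            else:
                # Layout with the path stored directly under the platform.
                yield name, platform, None, value


def check_redistributions(redistributions: dict) -> list:
    """Return a list of problems; the list is empty when every entry has a path."""
    problems = []
    for name, platform, cuda_key, entry in iter_archive_entries(redistributions):
        if "relative_path" not in entry and "full_path" not in entry:
            problems.append(
                f"{name}/{platform}/{cuda_key}: no relative_path or full_path"
            )
    return problems


custom = {
    "cuda_cccl": {
        "linux-x86_64": {
            "relative_path": "cuda_cccl-linux-x86_64-12.4.99-archive.tar.xz",
        },
    },
    "cudnn": {"linux-sbsa": {"cuda12": {}}},  # deliberately incomplete entry
}
print(check_redistributions(custom))
```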
The third option combines the two options above. In the example below,
CUDA_REDIST_JSON_DICT is merged with custom JSON data in _CUDA_JSON_DICT, and
CUDNN_REDIST_JSON_DICT is merged with _CUDNN_JSON_DICT.
The distributions data in _CUDA_DIST_DICT overrides the content of the resulting
CUDA JSON file, and the distributions data in _CUDNN_DIST_DICT overrides the
content of the resulting CUDNN JSON file. The NCCL wheels data is merged from
CUDA_NCCL_WHEELS and _NCCL_WHEEL_DICT.
_CUDA_JSON_DICT = {
"12.4.0": [
"file:///usr/Downloads/redistrib_12.4.0_updated.json",
],
}
_CUDNN_JSON_DICT = {
"9.0.0": [
"https://developer.download.nvidia.com/compute/cudnn/redist/redistrib_9.0.0.json",
],
}
_CUDA_DIST_DICT = {
"cuda_cccl": {
"linux-x86_64": {
"relative_path": "cuda_cccl-linux-x86_64-12.4.99-archive.tar.xz",
},
"linux-sbsa": {
"relative_path": "cuda_cccl-linux-sbsa-12.4.99-archive.tar.xz",
},
},
"libcusolver": {
"linux-x86_64": {
"full_path": "file:///usr/Downloads/dists/libcusolver-linux-x86_64-11.6.0.99-archive.tar.xz",
},
"linux-sbsa": {
"relative_path": "libcusolver-linux-sbsa-11.6.0.99-archive.tar.xz",
},
},
}
_CUDNN_DIST_DICT = {
"cudnn": {
"linux-x86_64": {
"cuda12": {
"relative_path": "cudnn-linux-x86_64-9.0.0.312_cuda12-archive.tar.xz",
},
},
"linux-sbsa": {
"cuda12": {
"relative_path": "cudnn-linux-sbsa-9.0.0.312_cuda12-archive.tar.xz",
},
},
},
}
_NCCL_WHEEL_DICT = {
"12.4.0": {
"x86_64-unknown-linux-gnu": {
"url": "https://files.pythonhosted.org/packages/38/00/d0d4e48aef772ad5aebcf70b73028f88db6e5640b36c38e90445b7a57c45/nvidia_nccl_cu12-2.19.3-py3-none-manylinux1_x86_64.whl",
},
},
}
load(
"@rules_ml_toolchain//gpu/cuda:cuda_redist_versions.bzl",
"CUDA_REDIST_PATH_PREFIX",
"CUDA_NCCL_WHEELS",
"CUDA_REDIST_JSON_DICT",
"CUDNN_REDIST_PATH_PREFIX",
"CUDNN_REDIST_JSON_DICT",
)
cuda_json_init_repository(
cuda_json_dict = CUDA_REDIST_JSON_DICT | _CUDA_JSON_DICT,
cudnn_json_dict = CUDNN_REDIST_JSON_DICT | _CUDNN_JSON_DICT,
)
load(
"@cuda_redist_json//:distributions.bzl",
"CUDA_REDISTRIBUTIONS",
"CUDNN_REDISTRIBUTIONS",
)
load(
"@rules_ml_toolchain//gpu/cuda:cuda_redist_init_repositories.bzl",
"cuda_redist_init_repositories",
"cudnn_redist_init_repository",
)
cuda_redist_init_repositories(
cuda_redistributions = CUDA_REDISTRIBUTIONS | _CUDA_DIST_DICT,
cuda_redist_path_prefix = "file:///usr/Downloads/dists/",
)
cudnn_redist_init_repository(
cudnn_redistributions = CUDNN_REDISTRIBUTIONS | _CUDNN_DIST_DICT,
cudnn_redist_path_prefix = "file:///usr/Downloads/dists/cudnn/"
)
load(
"@rules_ml_toolchain//gpu/nccl:nccl_redist_init_repository.bzl",
"nccl_redist_init_repository",
)
nccl_redist_init_repository(
cuda_nccl_wheels = CUDA_NCCL_WHEELS | _NCCL_WHEEL_DICT,
)
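The `|` operator used in the example above merges dictionaries at the top level only. The sketch below illustrates this in Python (3.9 or later), on the assumption that the Starlark dict union behaves the same way here; the dictionary contents are shortened stand-ins, not real redistribution data.

```python
# Shortened stand-ins for the dictionaries in the example above.
CUDA_REDISTRIBUTIONS = {
    "cuda_cccl": {
        "linux-x86_64": {"relative_path": "default-x86_64.tar.xz"},
        "linux-sbsa": {"relative_path": "default-sbsa.tar.xz"},
    },
    "cuda_nvcc": {
        "linux-x86_64": {"relative_path": "nvcc-x86_64.tar.xz"},
    },
}
_CUDA_DIST_DICT = {
    "cuda_cccl": {
        "linux-x86_64": {"relative_path": "custom-x86_64.tar.xz"},
    },
}

# The right-hand operand wins for every key that appears in both dictionaries.
merged = CUDA_REDISTRIBUTIONS | _CUDA_DIST_DICT

print(sorted(merged))               # ['cuda_cccl', 'cuda_nvcc']
print(sorted(merged["cuda_cccl"]))  # ['linux-x86_64']
```

Under this assumption the whole cuda_cccl entry is replaced rather than merged per platform, which is why a custom entry should list every platform that is needed, as _CUDA_DIST_DICT does in the example above.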
3) Local toolkit installations used as sources for hermetic repositories
Warning
This feature exists solely to cover the use case in which the same person develops both XLA/JAX and CUDA binaries, which is specific to NVIDIA teams. Everyone else, who does not build custom NVIDIA binaries, should not use this feature at all.
You can use the local CUDA/CUDNN/NCCL/NVSHMEM paths as a source of redistributions. The following additional environment variables are required:
LOCAL_CUDA_PATH
LOCAL_CUDNN_PATH
LOCAL_NCCL_PATH
LOCAL_NVSHMEM_PATH
Example:
# Add an entry to your `.bazelrc` file
build:cuda --repo_env=LOCAL_CUDA_PATH="/foo/bar/nvidia/cuda"
build:cuda --repo_env=LOCAL_CUDNN_PATH="/foo/bar/nvidia/cudnn"
build:cuda --repo_env=LOCAL_NCCL_PATH="/foo/bar/nvidia/nccl"
build:cuda --repo_env=LOCAL_NVSHMEM_PATH="/foo/bar/nvidia/nvshmem"
# OR pass it directly to your specific build command
bazel build --config=cuda <target> \
--repo_env=LOCAL_CUDA_PATH="/foo/bar/nvidia/cuda" \
--repo_env=LOCAL_CUDNN_PATH="/foo/bar/nvidia/cudnn" \
--repo_env=LOCAL_NCCL_PATH="/foo/bar/nvidia/nccl" \
--repo_env=LOCAL_NVSHMEM_PATH="/foo/bar/nvidia/nvshmem"
# If .bazelrc doesn't have corresponding entries and the environment variables
# are not passed to bazel command, you can set them globally in your shell:
export LOCAL_CUDA_PATH="/foo/bar/nvidia/cuda"
export LOCAL_CUDNN_PATH="/foo/bar/nvidia/cudnn"
export LOCAL_NCCL_PATH="/foo/bar/nvidia/nccl"
export LOCAL_NVSHMEM_PATH="/foo/bar/nvidia/nvshmem"
The structure of the folders inside the CUDA dir should be the following (as if the archived redistributions were unpacked into one place):
<LOCAL_CUDA_PATH>/
    include/
    bin/
    lib/
    nvvm/
The structure of the folders inside the CUDNN dir should be the following:
<LOCAL_CUDNN_PATH>/
    include/
    lib/
The structure of the folders inside the NCCL dir should be the following:
<LOCAL_NCCL_PATH>/
    include/
    lib/
The structure of the folders inside the NVSHMEM dir should be the following:
<LOCAL_NVSHMEM_PATH>/
    include/
    lib/
    bin/
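The directory layouts listed above can be checked before starting a build. The following is a minimal sketch; the table of expected subdirectories is copied from the layouts above, while the helper `missing_subdirs` is hypothetical and not part of the repository rules.

```python
import os
from pathlib import Path

# Expected subdirectories, copied from the layouts listed above.
EXPECTED_SUBDIRS = {
    "LOCAL_CUDA_PATH": ("include", "bin", "lib", "nvvm"),
    "LOCAL_CUDNN_PATH": ("include", "lib"),
    "LOCAL_NCCL_PATH": ("include", "lib"),
    "LOCAL_NVSHMEM_PATH": ("include", "lib", "bin"),
}


def missing_subdirs(variable: str, root: str) -> list:
    """Return the expected subdirectories that do not exist under `root`."""
    return [
        name
        for name in EXPECTED_SUBDIRS[variable]
        if not (Path(root) / name).is_dir()
    ]


# Check only the variables that are set in the current shell environment.
for variable in EXPECTED_SUBDIRS:
    root = os.environ.get(variable)
    if root:
        print(variable, "missing:", missing_subdirs(variable, root))
```

Note that this reads the variables from the shell environment only; values passed through --repo_env in .bazelrc or on the command line are not visible to it.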