fix(triton): Update default image to 25.02 for TensorFlow backend support #1327

rpancham wants to merge 2 commits into opendatahub-io:main
Conversation
Updates the default Triton Inference Server image from 24.10-py3 to 25.02-py3 to restore TensorFlow and Keras model serving functionality.

Problem: Triton 26.x removed the TensorFlow backend, causing model loading failures with the error: "unable to find backend library for backend 'tensorflow'".

Root Cause: NVIDIA deprecated the TensorFlow backend in Triton 25.03 and removed it completely in 26.x. The backend library is physically absent from the container.

Solution: Use Triton 25.02, the last official release that includes the TensorFlow backend.

Changes:
- Updated the TRITON_IMAGE constant from 24.10-py3 to 25.02-py3
- Added a comprehensive README documenting version compatibility
- Documented migration paths for users who need Triton 26.x

Testing: Integration tests are required in CI/CD to verify that all 7 model types deploy successfully with the updated image.

References:
- NVIDIA Triton Release Notes: https://docs.nvidia.com/deeplearning/triton-inference-server/release-notes/
- TensorFlow Backend Deprecation: https://docs.nvidia.com/deeplearning/triton-inference-server/release-notes/rel-25-03.html
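The version boundary described above (backend present through 25.02, deprecated in 25.03, absent from 26.x) can be sketched as a small gate. This is a hypothetical helper for illustration, not part of the PR's code:

```python
def tensorflow_backend_available(tag: str) -> bool:
    """Return True if a Triton image tag like '25.02-py3' still bundles the
    TensorFlow backend (25.02 is the last release that includes it)."""
    version = tag.split("-")[0]          # "25.02-py3" -> "25.02"
    year, month = (int(part) for part in version.split("."))
    return (year, month) <= (25, 2)

print(tensorflow_backend_available("24.10-py3"))  # old default, backend present
print(tensorflow_backend_available("25.02-py3"))  # new default, backend present
print(tensorflow_backend_available("26.02-py3"))  # backend removed
```

A check like this could guard test parametrization so TensorFlow model cases are skipped automatically if the image is ever bumped past 25.02 again.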
📝 Walkthrough

Added a new README documenting the Triton Inference Server runtime tests (scope, supported frameworks/backends, execution commands, config assumptions, infra requirements, troubleshooting, maintenance). Also bumped the default Triton container image constant from 24.10-py3 to 25.02-py3.
Estimated code review effort: 🎯 1 (Trivial) | ⏱️ ~5 minutes
🚥 Pre-merge checks: ✅ 2 passed
Actionable comments posted: 4
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@tests/model_serving/model_runtime/triton/constant.py`:
- Line 37: The TRITON_IMAGE constant uses a mutable tag (25.02-py3), which
risks supply-chain drift; replace the tag
"nvcr.io/nvidia/tritonserver:25.02-py3" with the corresponding immutable image
reference pinned by digest (sha256) fetched from NVIDIA NGC release artifacts
(e.g., locate the official digest for 25.02-py3 on NGC docs or the image
registry and use "nvcr.io/nvidia/tritonserver@sha256:<DIGEST>"); update the
TRITON_IMAGE constant to that digest-pinned string and add a brief comment
referencing the NGC release page used to obtain the digest to ensure CI
reproducibility.
In `@tests/model_serving/model_runtime/triton/README.md`:
- Around line 23-176: The README has markdownlint MD022/MD031 violations due to
missing blank lines around headings and fenced code blocks (e.g., headings
"Current Default Version", "TensorFlow Backend Availability", "Using Triton 26.x
with TensorFlow Models", the fenced bash/python/pytest blocks, and the "Default
Image" code fence showing TRITON_IMAGE). Fix by ensuring there is exactly one
blank line before and after each top-level heading and before/after every fenced
code block (```bash, ```python, ```), and add blank lines surrounding the table
sections as needed so all headings and code fences are separated by blank lines
consistent with markdownlint rules.
- Around line 155-159: Update the fenced code block in the README.md that shows
the TensorFlow backend error so it declares a language identifier (e.g., change
the opening ``` to ```text) — locate the block containing the "E0326
10:10:41.293702 1 model_lifecycle.cc:654" / "unable to find backend library for
backend 'tensorflow'" error and add the language token to the opening fence to
satisfy markdownlint MD040 and improve rendering.
- Around line 46-49: The README currently instructs to run "git checkout r26.02"
against the tensorflow_backend repo which fails because that tag doesn't exist;
update the instructions to instead reference the official NGC container or the
build helper: advise using the prebuilt image
"nvcr.io/nvidia/tritonserver:26.02-py3" for reproducible builds, or if building
from source instruct users to run the repository's build.py with the repo tag
flag (e.g., "--repo-tag=backend:r26.02") so CMake FetchContent pulls the correct
backend commit rather than doing a direct "git checkout r26.02".
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository YAML (base), Central YAML (inherited), Organization UI (inherited)
Review profile: CHILL
Plan: Pro
Run ID: bed5f521-755e-4bd5-8f9a-f608f0cb9a8f
📒 Files selected for processing (2)
- tests/model_serving/model_runtime/triton/README.md
- tests/model_serving/model_runtime/triton/constant.py
```python
# Updated to 25.02 - last Triton release with TensorFlow backend included by default
# TensorFlow backend was deprecated in 25.03 and removed in 26.x+
# See: https://docs.nvidia.com/deeplearning/triton-inference-server/release-notes/
TRITON_IMAGE: str = "nvcr.io/nvidia/tritonserver:25.02-py3"
```
Pin Triton image by digest to prevent mutable-tag supply-chain drift (CWE-494).
Using :25.02-py3 alone allows upstream retagging, which can silently change tested binaries and invalidate reproducibility/security assumptions.
Suggested hardening
```diff
-TRITON_IMAGE: str = "nvcr.io/nvidia/tritonserver:25.02-py3"
+# Keep human-readable tag in comment, but enforce immutable digest.
+# Example digest shown below; replace with the verified 25.02-py3 digest from NGC.
+TRITON_IMAGE: str = "nvcr.io/nvidia/tritonserver@sha256:<verified-25.02-py3-digest>"
```

Find the official immutable digest for `nvcr.io/nvidia/tritonserver:25.02-py3` on NVIDIA NGC docs/release artifacts, and confirm digest pinning guidance for CI reproducibility.
As per coding guidelines: "REVIEW PRIORITIES: 1. Security vulnerabilities (provide severity, exploit scenario, and remediation code)".
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@tests/model_serving/model_runtime/triton/constant.py` at line 37, The
TRITON_IMAGE constant uses a mutable tag (25.02-py3), which risks supply-chain
drift; replace the tag "nvcr.io/nvidia/tritonserver:25.02-py3" with the
corresponding immutable image reference pinned by digest (sha256) fetched from
NVIDIA NGC release artifacts (e.g., locate the official digest for 25.02-py3 on
NGC docs or the image registry and use
"nvcr.io/nvidia/tritonserver@sha256:<DIGEST>"); update the TRITON_IMAGE constant
to that digest-pinned string and add a brief comment referencing the NGC release
page used to obtain the digest to ensure CI reproducibility.
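The digest-pinned format the prompt above asks for can be validated mechanically. This is a hypothetical sketch (the 64-character digest used below is a placeholder, not a real NGC digest):

```python
import re

# Immutable reference shape: <registry>/<repo>@sha256:<64 lowercase hex chars>
DIGEST_RE = re.compile(r"^[\w./-]+@sha256:[0-9a-f]{64}$")

def is_digest_pinned(image_ref: str) -> bool:
    """True only for digest-pinned references; mutable tags fail the check."""
    return bool(DIGEST_RE.match(image_ref))

print(is_digest_pinned("nvcr.io/nvidia/tritonserver:25.02-py3"))           # mutable tag
print(is_digest_pinned("nvcr.io/nvidia/tritonserver@sha256:" + "a" * 64))  # pinned
```

A check like this could run as a unit test over the constants module to keep future image bumps digest-pinned.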
```bash
git clone https://github.com/triton-inference-server/tensorflow_backend
cd tensorflow_backend
git checkout r26.02  # Match your Triton version
# Follow build instructions in repository README
```
🧩 Analysis chain
🌐 Web query:
Does the triton-inference-server/tensorflow_backend project provide immutable commit references or signed tags for the r26.02 release, and what is the recommended secure checkout method?
💡 Result:
No, the triton-inference-server/tensorflow_backend project does not provide immutable commit references or signed tags specifically for the r26.02 release. The GitHub repository has no published releases, and searches did not identify a branch or tag named "r26.02". The main Triton server repository uses branches like r26.02 (e.g., git clone -b r26.02 https://github.com/triton-inference-server/server.git), but the tensorflow_backend repo defaults to "main" during builds without a specific r26.02 reference. The recommended secure checkout method for Triton components, including backends, is to use the official NGC Docker containers (e.g., nvcr.io/nvidia/tritonserver:26.02-py3), which include pre-built, verified backends corresponding to the release. For source builds, use the build.py script with --repo-tag=backend:r26.02 (or equivalent), which fetches the appropriate commit/branch from the backend repos during the CMake FetchContent process. Verify container integrity using NGC's signature verification tools. No explicit mention of GPG-signed tags or commits for backends was found; security relies on NGC containers and GitHub's integrity.
Citations:
- 1: https://github.com/triton-inference-server/tensorflow_backend/releases
- 2: https://github.com/triton-inference-server/tensorflow_backend
- 3: https://github.com/ROCm/triton-inference-server-tensorflow_backend
- 4: https://docs.nvidia.com/deeplearning/triton-inference-server/release-notes/rel-26-02.html
- 5: https://github.com/triton-inference-server/server
- 6: https://docs.nvidia.com/deeplearning/triton-inference-server/user-guide/docs/tensorflow_backend/README.html
- 7: https://docs.nvidia.com/deeplearning/triton-inference-server/archives/triton-inference-server-2600/user-guide/docs/customization_guide/build.html
- 8: Error in Build of Triton server using --no-container-build triton-inference-server/server#3213
- 9: https://linkedin.com/in/ngocson2vn
Use NGC Docker containers or build.py instead of manual git checkout—r26.02 tag does not exist in tensorflow_backend repository.
The documented git checkout r26.02 will fail because that tag/branch is not present in triton-inference-server/tensorflow_backend. For reproducible, verified builds (CWE-494), use the official NGC pre-built container (nvcr.io/nvidia/tritonserver:26.02-py3). If building from source is required, use the build.py script with --repo-tag=backend:r26.02, which fetches the correct backend commit via CMake FetchContent. Direct git checkout is neither secure nor functional for this repository.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@tests/model_serving/model_runtime/triton/README.md` around lines 46 - 49, The
README currently instructs to run "git checkout r26.02" against the
tensorflow_backend repo which fails because that tag doesn't exist; update the
instructions to instead reference the official NGC container or the build
helper: advise using the prebuilt image "nvcr.io/nvidia/tritonserver:26.02-py3"
for reproducible builds, or if building from source instruct users to run the
repository's build.py with the repo tag flag (e.g., "--repo-tag=backend:r26.02")
so CMake FetchContent pulls the correct backend commit rather than doing a
direct "git checkout r26.02".
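The recommended `build.py` invocation above can be composed as follows. This is a hypothetical helper; the `--repo-tag`/`--backend` flag forms follow the review text and should be verified against the triton-inference-server/server repository before use:

```python
def build_py_args(triton_version: str, backends: list[str]) -> list[str]:
    """Compose a build.py command line pinning backends to a release branch."""
    ref = f"r{triton_version}"                       # e.g. "r26.02"
    args = ["./build.py", f"--repo-tag=backend:{ref}"]
    for backend in backends:
        args.append(f"--backend={backend}:{ref}")    # pin each backend to the same ref
    return args

print(build_py_args("26.02", ["tensorflow"]))
```

This keeps CMake FetchContent resolving the backend commit that matches the server release, instead of a manual checkout of a tag that may not exist.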
```
E0326 10:10:41.293702 1 model_lifecycle.cc:654] "failed to load 'model_name' version 1:
Invalid argument: unable to find backend library for backend 'tensorflow',
try specifying runtime on the model configuration."
```
Add a language identifier to the troubleshooting code fence (MD040).
The fenced block for the TensorFlow backend error should declare a language (e.g., text) to satisfy markdownlint and improve rendering.
Suggested fix
````diff
-```
+```text
 E0326 10:10:41.293702 1 model_lifecycle.cc:654] "failed to load 'model_name' version 1:
 Invalid argument: unable to find backend library for backend 'tensorflow',
 try specifying runtime on the model configuration."
````
🧰 Tools
🪛 markdownlint-cli2 (0.22.0)
[warning] 155-155: Fenced code blocks should be surrounded by blank lines
(MD031, blanks-around-fences)
[warning] 155-155: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@tests/model_serving/model_runtime/triton/README.md` around lines 155 - 159,
Update the fenced code block in the README.md that shows the TensorFlow backend
error so it declares a language identifier (e.g., change the opening ``` to
```text) — locate the block containing the "E0326 10:10:41.293702 1
model_lifecycle.cc:654" / "unable to find backend library for backend
'tensorflow'" error and add the language token to the opening fence to satisfy
markdownlint MD040 and improve rendering.
Actionable comments posted: 1
♻️ Duplicate comments (2)
tests/model_serving/model_runtime/triton/README.md (2)
46-52: ⚠️ Potential issue | 🔴 Critical: Non-existent git tag will cause build failure (duplicate).

Line 50 instructs `git checkout r26.02`, which does not exist in the tensorflow_backend repository. This issue was already flagged in previous review comments (lines 46-49).

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/model_serving/model_runtime/triton/README.md` around lines 46 - 52, The README instructs checking out a non-existent git tag ("git checkout r26.02") in the tensorflow_backend repo; replace this with the correct tag or branch that matches your Triton version (or instruct users to run "git fetch --tags" and choose the matching release) so the tensorflow_backend clone can build; update the line referencing git checkout r26.02 to use the proper tag/branch name (or advise using the repository's main/release branch) and mention verifying the Triton version-to-backend tag mapping for the tensorflow_backend project.
171-176: ⚠️ Potential issue | 🟡 Minor: Add language identifier to code fence at line 171 (MD040).

The code block is missing a language identifier. There are additional unfenced code blocks in this file at lines 88, 98, 106, 118, and 175.
Fix
````diff
-```
+```text
 E0326 10:10:41.293702 1 model_lifecycle.cc:654] "failed to load 'model_name' version 1:
 Invalid argument: unable to find backend library for backend 'tensorflow',
 try specifying runtime on the model configuration."
````

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.
In `@tests/model_serving/model_runtime/triton/README.md` around lines 171 - 176,
add a language identifier (e.g., "text") to the fenced code block that contains
the error message starting with E0326...model_lifecycle.cc:654, and also update
the other unfenced code blocks (around lines 88, 98, 106, 118, and 175) so they
use a fenced code block with a language tag (use `text`) to satisfy MD040;
ensure each opening fence is ```text and the corresponding closing fence
remains ``` so rendering and linting are fixed.
Inline comments:
In `@tests/model_serving/model_runtime/triton/README.md`:
- Line 25: Replace the mutable image tag `nvcr.io/nvidia/tritonserver:25.02-py3`
  with an immutable digest-pinned reference and add maintenance notes documenting
  vulnerability scanning; run `docker manifest inspect nvcr.io/nvidia/tritonserver:25.02-py3`
  to fetch the current `sha256:<digest>` and update the README entries to
  `nvcr.io/nvidia/tritonserver@sha256:<digest>`, and add a "Security Verification"
  section (or extend it) with the `trivy image nvcr.io/nvidia/tritonserver@sha256:<digest>`
  scan command and guidance to review CVE reports before bumping versions so the
  file no longer references mutable tags.
Duplicate comments:
In `@tests/model_serving/model_runtime/triton/README.md`:
- Around line 46-52: The README instructs checking out a non-existent git tag
  ("git checkout r26.02") in the tensorflow_backend repo; replace this with the
  correct tag or branch that matches your Triton version (or instruct users to
  run "git fetch --tags" and choose the matching release) so the
  tensorflow_backend clone can build; update the line referencing git checkout
  r26.02 to use the proper tag/branch name (or advise using the repository's
  main/release branch) and mention verifying the Triton version-to-backend tag
  mapping for the tensorflow_backend project.
- Around line 171-176: Add a language identifier (e.g., "text") to the fenced
  code block that contains the error message starting with
  E0326...model_lifecycle.cc:654 and also update the other unfenced code blocks
  (around lines 88, 98, 106, 118, and 175) so they use a fenced code block with
  a language tag (use `text`) to satisfy MD040; ensure each opening fence is
  ```text and the corresponding closing fence remains ``` so rendering and
  linting are fixed.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration
Configuration used: Repository YAML (base), Central YAML (inherited), Organization UI (inherited)
Review profile: CHILL
Plan: Pro
Run ID: 2369638e-b0c5-4991-a849-0ab615f06d96

📥 Commits
Reviewing files that changed from the base of the PR and between e02c26f2913c47d0b3c5e94334daab1b52265519 and 7ee543b95bfa0adeac58068b9645725a12968f58.

📒 Files selected for processing (1)
- tests/model_serving/model_runtime/triton/README.md
### Current Default Version

- **Image**: `nvcr.io/nvidia/tritonserver:25.02-py3`
Pin container images by digest and document vulnerability scanning (CWE-1104).
Mutable tags can be overwritten, enabling supply chain attacks. NGC containers support digest references for immutable verification.
🔒 Recommended approach
- Fetch the digest for the current tag:

```bash
docker manifest inspect nvcr.io/nvidia/tritonserver:25.02-py3 | jq -r '.manifests[0].digest'
```

- Update references to use digest:

```diff
-**Image**: `nvcr.io/nvidia/tritonserver:25.02-py3`
+**Image**: `nvcr.io/nvidia/tritonserver:25.02-py3@sha256:<digest>`
```

- Document vulnerability scanning in maintenance notes:

```markdown
## Security Verification
- Scan images with `trivy image nvcr.io/nvidia/tritonserver@sha256:<digest>`
- Review CVE reports before updating default version
```

Also applies to: 105-105, 117-117
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@tests/model_serving/model_runtime/triton/README.md` at line 25, Replace the
mutable image tag `nvcr.io/nvidia/tritonserver:25.02-py3` with an immutable
digest-pinned reference and add maintenance notes documenting vulnerability
scanning; run `docker manifest inspect nvcr.io/nvidia/tritonserver:25.02-py3` to
fetch the current `sha256:<digest>` and update the README entries to
`nvcr.io/nvidia/tritonserver@sha256:<digest>`, and add a "Security Verification"
section (or extend it) with the `trivy image
nvcr.io/nvidia/tritonserver@sha256:<digest>` scan command and guidance to review
CVE reports before bumping versions so the file no longer references mutable
tags.
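The `docker manifest inspect | jq` step recommended above can also be done offline on captured output. This sketch uses a fabricated sample manifest (not real NGC metadata) purely to illustrate the extraction:

```python
import json

# Fabricated minimal stand-in for `docker manifest inspect` JSON output.
sample_manifest = json.dumps({
    "manifests": [
        {"digest": "sha256:" + "ab" * 32,
         "platform": {"architecture": "amd64", "os": "linux"}},
    ]
})

def first_digest(manifest_json: str) -> str:
    """Mirror the `jq -r '.manifests[0].digest'` step in Python."""
    return json.loads(manifest_json)["manifests"][0]["digest"]

print("nvcr.io/nvidia/tritonserver@" + first_digest(sample_manifest))
```

Note that multi-arch images carry one digest per platform entry, so selecting `.manifests[0]` implicitly picks a platform; pinning the manifest-list digest instead may be preferable for CI.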
dbasunag left a comment:

Please address the dco and pre-commit failures. PRs can't be merged without those passing.
Addresses : https://redhat.atlassian.net/browse/RHOAIENG-52564