fix(triton): Update default image to 25.02 for TensorFlow backend sup…#1327

Open
rpancham wants to merge 2 commits into opendatahub-io:main from rpancham:bugfix/triton-26-tensorflow-backend-missing
Conversation

@rpancham (Contributor) commented Mar 31, 2026

Addresses : https://redhat.atlassian.net/browse/RHOAIENG-52564

Updates the default Triton Inference Server image from 24.10-py3 to 25.02-py3 to restore TensorFlow and Keras model serving functionality.

Problem:
Triton 26.x removed the TensorFlow backend, causing model loading failures with error: 'unable to find backend library for backend tensorflow'

Root Cause:
NVIDIA deprecated the TensorFlow backend in Triton 25.03 and removed it completely in 26.x. The backend library is physically absent from the container.

Solution:
Use Triton 25.02, the last official release with TensorFlow backend included.

Changes:

  • Updated TRITON_IMAGE constant from 24.10-py3 to 25.02-py3
  • Added comprehensive README documenting version compatibility
  • Documented migration paths for users needing Triton 26.x


Summary by CodeRabbit

  • Documentation

    • Added comprehensive Triton Inference Server test-suite guide covering supported frameworks/backends, validation scope (deployment & inference), test execution examples, default runtime/configuration, infrastructure requirements, troubleshooting tips, and guidance for TensorFlow/Keras compatibility and migration paths.
  • Chores

    • Updated default Triton Inference Server container image to the newer 25.02 version.

fix(triton): Update default image to 25.02 for TensorFlow backend support

Updates the default Triton Inference Server image from 24.10-py3 to 25.02-py3
to restore TensorFlow and Keras model serving functionality.

Problem:
Triton 26.x removed the TensorFlow backend, causing model loading failures
with error: 'unable to find backend library for backend tensorflow'

Root Cause:
NVIDIA deprecated the TensorFlow backend in Triton 25.03 and removed it
completely in 26.x. The backend library is physically absent from the container.

Solution:
Use Triton 25.02, the last official release with TensorFlow backend included.

Changes:
- Updated TRITON_IMAGE constant from 24.10-py3 to 25.02-py3
- Added comprehensive README documenting version compatibility
- Documented migration paths for users needing Triton 26.x

Testing:
Integration tests required in CI/CD to verify all 7 model types deploy
successfully with the updated image.
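The CI requirement above could be sketched as a small harness. The framework list and helper below are illustrative placeholders, not the repository's actual fixtures:

```python
# Illustrative list of the seven model types the integration tests cover;
# the repository's actual parametrization may differ.
FRAMEWORKS = ["onnx", "tensorflow", "keras", "pytorch", "tensorrt", "python", "dali"]

TRITON_IMAGE = "nvcr.io/nvidia/tritonserver:25.02-py3"


def deploy_and_check(framework: str, image: str) -> bool:
    """Placeholder for the real deployment helper.

    A real implementation would create an InferenceService with the given
    runtime image and poll its status until Ready or timeout; here we simply
    report success for the pinned default image.
    """
    return image.endswith("25.02-py3")


def run_suite() -> dict:
    """Exercise every framework against the default image."""
    return {fw: deploy_and_check(fw, TRITON_IMAGE) for fw in FRAMEWORKS}
```

In a real run, each entry would correspond to one deploy-and-infer test case against the cluster.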

References:
- NVIDIA Triton Release Notes: https://docs.nvidia.com/deeplearning/triton-inference-server/release-notes/
- TensorFlow Backend Deprecation: https://docs.nvidia.com/deeplearning/triton-inference-server/release-notes/rel-25-03.html
@github-actions

The following are automatically added/executed:

  • PR size label.
  • Run pre-commit
  • Run tox
  • Add PR author as the PR assignee
  • Build image based on the PR

Available user actions:

  • To mark a PR as WIP, add /wip in a comment. To remove it, comment /wip cancel on the PR.
  • To block merging of a PR, add /hold in a comment. To unblock merging, comment /hold cancel.
  • To mark a PR as approved, add /lgtm in a comment. To remove, add /lgtm cancel.
    The lgtm label is removed on each new commit push.
  • To mark a PR as verified, comment /verified on the PR; to un-verify, comment /verified cancel.
    The verified label is removed on each new commit push.
  • To cherry-pick a merged PR, comment /cherry-pick <target_branch_name> on the PR. If <target_branch_name> is valid
    and the current PR is merged, a cherry-picked PR will be created and linked to the current PR.
  • To build and push an image to quay, add /build-push-pr-image in a comment. This creates an image tagged
    pr-<pr_number> in the quay repository. The tag is deleted when the PR is merged or closed.
Supported labels

{'/hold', '/cherry-pick', '/build-push-pr-image', '/lgtm', '/verified', '/wip'}

@coderabbitai

coderabbitai bot commented Mar 31, 2026

📝 Walkthrough

Added a new README documenting the Triton Inference Server runtime tests (scope, supported frameworks/backends, execution commands, config assumptions, infra requirements, troubleshooting, maintenance). Also bumped the default Triton container image constant from nvcr.io/nvidia/tritonserver:24.10-py3 to nvcr.io/nvidia/tritonserver:25.02-py3. Security note: validate and scan the upgraded image for known vulnerabilities (CWE-1104: Use of Unmaintained Third-Party Components) and prefer pinned digests.

Changes

Cohort / File(s) — Summary

Triton Test Documentation — tests/model_serving/model_runtime/triton/README.md:
New documentation describing test scope, supported model frameworks/backends, default Triton image/version, pytest examples, runtime configuration assumptions (model store path, ports, resource limits), infra/permission needs (Kubernetes/OpenShift, GPU for DALI, S3), test structure (fixtures, test cases, JSON inputs), troubleshooting snippets (TensorFlow backend missing, model-loading/GPU failures), and maintenance/update instructions.

Version Constants — tests/model_serving/model_runtime/triton/constant.py:
Updated TRITON_IMAGE constant from nvcr.io/nvidia/tritonserver:24.10-py3 to nvcr.io/nvidia/tritonserver:25.02-py3; added inline comments about the backend deprecation/removal timeline.

Estimated code review effort

🎯 1 (Trivial) | ⏱️ ~5 minutes

🚥 Pre-merge checks | ✅ 2
✅ Passed checks (2 passed)
Title check: ✅ Passed — The title clearly summarizes the main change: updating the Triton image to 25.02 for TensorFlow backend support, which is the primary objective of the PR.
Description check: ✅ Passed — The description provides comprehensive context including JIRA reference, problem statement, root cause analysis, and solution details. However, it does not include the required testing checklist sections from the template.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.



@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 4

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@tests/model_serving/model_runtime/triton/constant.py`:
- Line 37: The TRITON_IMAGE constant uses a mutable tag (TRITON_IMAGE) which
risks supply-chain drift; replace the tag
"nvcr.io/nvidia/tritonserver:25.02-py3" with the corresponding immutable image
reference pinned by digest (sha256) fetched from NVIDIA NGC release artifacts
(e.g., locate the official digest for 25.02-py3 on NGC docs or the image
registry and use "nvcr.io/nvidia/tritonserver@sha256:<DIGEST>"); update the
TRITON_IMAGE constant to that digest-pinned string and add a brief comment
referencing the NGC release page used to obtain the digest to ensure CI
reproducibility.

In `@tests/model_serving/model_runtime/triton/README.md`:
- Around line 23-176: The README has markdownlint MD022/MD031 violations due to
missing blank lines around headings and fenced code blocks (e.g., headings
"Current Default Version", "TensorFlow Backend Availability", "Using Triton 26.x
with TensorFlow Models", the fenced bash/python/pytest blocks, and the "Default
Image" code fence showing TRITON_IMAGE). Fix by ensuring there is exactly one
blank line before and after each top-level heading and before/after every fenced
code block (```bash, ```python, ```), and add blank lines surrounding the table
sections as needed so all headings and code fences are separated by blank lines
consistent with markdownlint rules.
- Around line 155-159: Update the fenced code block in the README.md that shows
the TensorFlow backend error so it declares a language identifier (e.g., change
the opening ``` to ```text) — locate the block containing the "E0326
10:10:41.293702 1 model_lifecycle.cc:654" / "unable to find backend library for
backend 'tensorflow'" error and add the language token to the opening fence to
satisfy markdownlint MD040 and improve rendering.
- Around line 46-49: The README currently instructs to run "git checkout r26.02"
against the tensorflow_backend repo which fails because that tag doesn't exist;
update the instructions to instead reference the official NGC container or the
build helper: advise using the prebuilt image
"nvcr.io/nvidia/tritonserver:26.02-py3" for reproducible builds, or if building
from source instruct users to run the repository's build.py with the repo tag
flag (e.g., "--repo-tag=backend:r26.02") so CMake FetchContent pulls the correct
backend commit rather than doing a direct "git checkout r26.02".
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository YAML (base), Central YAML (inherited), Organization UI (inherited)

Review profile: CHILL

Plan: Pro

Run ID: bed5f521-755e-4bd5-8f9a-f608f0cb9a8f

📥 Commits

Reviewing files that changed from the base of the PR and between 727e0f5 and e02c26f.

📒 Files selected for processing (2)
  • tests/model_serving/model_runtime/triton/README.md
  • tests/model_serving/model_runtime/triton/constant.py

```python
# Updated to 25.02 - last Triton release with TensorFlow backend included by default
# TensorFlow backend was deprecated in 25.03 and removed in 26.x+
# See: https://docs.nvidia.com/deeplearning/triton-inference-server/release-notes/
TRITON_IMAGE: str = "nvcr.io/nvidia/tritonserver:25.02-py3"
```


⚠️ Potential issue | 🟠 Major

Pin Triton image by digest to prevent mutable-tag supply-chain drift (CWE-494).

Using :25.02-py3 alone allows upstream retagging, which can silently change tested binaries and invalidate reproducibility/security assumptions.

Suggested hardening

```diff
-TRITON_IMAGE: str = "nvcr.io/nvidia/tritonserver:25.02-py3"
+# Keep human-readable tag in comment, but enforce immutable digest.
+# Example digest shown below; replace with the verified 25.02-py3 digest from NGC.
+TRITON_IMAGE: str = "nvcr.io/nvidia/tritonserver@sha256:<verified-25.02-py3-digest>"
```
Find the official immutable digest for `nvcr.io/nvidia/tritonserver:25.02-py3` on NVIDIA NGC docs/release artifacts, and confirm digest pinning guidance for CI reproducibility.

As per coding guidelines, "**: REVIEW PRIORITIES: 1. Security vulnerabilities (provide severity, exploit scenario, and remediation code)".

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/model_serving/model_runtime/triton/constant.py` at line 37, The
TRITON_IMAGE constant uses a mutable tag (TRITON_IMAGE) which risks supply-chain
drift; replace the tag "nvcr.io/nvidia/tritonserver:25.02-py3" with the
corresponding immutable image reference pinned by digest (sha256) fetched from
NVIDIA NGC release artifacts (e.g., locate the official digest for 25.02-py3 on
NGC docs or the image registry and use
"nvcr.io/nvidia/tritonserver@sha256:<DIGEST>"); update the TRITON_IMAGE constant
to that digest-pinned string and add a brief comment referencing the NGC release
page used to obtain the digest to ensure CI reproducibility.
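The digest-pinning rule the bot asks for can also be checked mechanically, e.g. in a pre-commit hook. A minimal sketch — the regex and function name are illustrative, not part of this repo:

```python
import re

# An immutable reference has the shape repo[:tag]@sha256:<64 hex chars>;
# a bare tag like :25.02-py3 is mutable and can be retagged upstream.
_DIGEST_RE = re.compile(r"^[\w.\-/]+(?::[\w.\-]+)?@sha256:[0-9a-f]{64}$")


def is_digest_pinned(image_ref: str) -> bool:
    """Return True only when the image reference is pinned by sha256 digest."""
    return bool(_DIGEST_RE.match(image_ref))
```

For example, `is_digest_pinned("nvcr.io/nvidia/tritonserver:25.02-py3")` is False, while the same repository with an `@sha256:` suffix passes.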

Comment on lines +46 to +49
```shell
git clone https://github.com/triton-inference-server/tensorflow_backend
cd tensorflow_backend
git checkout r26.02  # Match your Triton version
# Follow build instructions in repository README
```


⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🌐 Web query:

Does the triton-inference-server/tensorflow_backend project provide immutable commit references or signed tags for the r26.02 release, and what is the recommended secure checkout method?

💡 Result:

No, the triton-inference-server/tensorflow_backend project does not provide immutable commit references or signed tags specifically for the r26.02 release. The GitHub repository has no published releases, and searches did not identify a branch or tag named "r26.02". The main Triton server repository uses branches like r26.02 (e.g., git clone -b r26.02 https://github.com/triton-inference-server/server.git), but the tensorflow_backend repo defaults to "main" during builds without a specific r26.02 reference. The recommended secure checkout method for Triton components, including backends, is to use the official NGC Docker containers (e.g., nvcr.io/nvidia/tritonserver:26.02-py3), which include pre-built, verified backends corresponding to the release. For source builds, use the build.py script with --repo-tag=backend:r26.02 (or equivalent), which fetches the appropriate commit/branch from the backend repos during the CMake FetchContent process. Verify container integrity using NGC's signature verification tools. No explicit mention of GPG-signed tags or commits for backends was found; security relies on NGC containers and GitHub's integrity.


Use NGC Docker containers or build.py instead of manual git checkout—r26.02 tag does not exist in tensorflow_backend repository.

The documented git checkout r26.02 will fail because that tag/branch is not present in triton-inference-server/tensorflow_backend. For reproducible, verified builds (CWE-494), use the official NGC pre-built container (nvcr.io/nvidia/tritonserver:26.02-py3). If building from source is required, use the build.py script with --repo-tag=backend:r26.02, which fetches the correct backend commit via CMake FetchContent. Direct git checkout is neither secure nor functional for this repository.
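The guidance above condenses into a small helper. The 25.03/26.x cutoffs come from the release notes cited in this PR; the function names and command strings are illustrative:

```python
def tensorflow_backend_included(triton_version: str) -> bool:
    """True if the given Triton release still ships the TensorFlow backend.

    Per the NVIDIA release notes, the backend was deprecated in 25.03 and
    removed in 26.x, so 25.02 is the last release that includes it by default.
    """
    major, minor = (int(part) for part in triton_version.split(".")[:2])
    return (major, minor) < (25, 3)


def recommended_source(triton_version: str) -> str:
    """Suggest how to obtain a TensorFlow-capable backend for a release."""
    if tensorflow_backend_included(triton_version):
        # The prebuilt NGC image already bundles the backend.
        return f"nvcr.io/nvidia/tritonserver:{triton_version}-py3"
    # For 25.03+ the backend must be built separately; build.py fetches the
    # matching commit via --repo-tag, since a direct git checkout will fail.
    return f"python build.py --repo-tag=backend:r{triton_version}"
```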

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/model_serving/model_runtime/triton/README.md` around lines 46 - 49, The
README currently instructs to run "git checkout r26.02" against the
tensorflow_backend repo which fails because that tag doesn't exist; update the
instructions to instead reference the official NGC container or the build
helper: advise using the prebuilt image "nvcr.io/nvidia/tritonserver:26.02-py3"
for reproducible builds, or if building from source instruct users to run the
repository's build.py with the repo tag flag (e.g., "--repo-tag=backend:r26.02")
so CMake FetchContent pulls the correct backend commit rather than doing a
direct "git checkout r26.02".

Comment on lines +155 to +159
```
E0326 10:10:41.293702 1 model_lifecycle.cc:654] "failed to load 'model_name' version 1:
Invalid argument: unable to find backend library for backend 'tensorflow',
try specifying runtime on the model configuration."
```


⚠️ Potential issue | 🟡 Minor

Add a language identifier to the troubleshooting code fence (MD040).

The fenced block for the TensorFlow backend error should declare a language (e.g., text) to satisfy markdownlint and improve rendering.

Suggested fix

````diff
-```
+```text
 E0326 10:10:41.293702 1 model_lifecycle.cc:654] "failed to load 'model_name' version 1:
 Invalid argument: unable to find backend library for backend 'tensorflow',
 try specifying runtime on the model configuration."
````
🧰 Tools
🪛 markdownlint-cli2 (0.22.0)

[warning] 155-155: Fenced code blocks should be surrounded by blank lines

(MD031, blanks-around-fences)


[warning] 155-155: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/model_serving/model_runtime/triton/README.md` around lines 155 - 159,
Update the fenced code block in the README.md that shows the TensorFlow backend
error so it declares a language identifier (e.g., change the opening ``` to
```text) — locate the block containing the "E0326 10:10:41.293702 1
model_lifecycle.cc:654" / "unable to find backend library for backend
'tensorflow'" error and add the language token to the opening fence to satisfy
markdownlint MD040 and improve rendering.


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

♻️ Duplicate comments (2)
tests/model_serving/model_runtime/triton/README.md (2)

46-52: ⚠️ Potential issue | 🔴 Critical

Non-existent git tag will cause build failure (duplicate).

Line 50 instructs git checkout r26.02, which does not exist in the tensorflow_backend repository. This issue was already flagged in previous review comments (lines 46-49).

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/model_serving/model_runtime/triton/README.md` around lines 46 - 52, The
README instructs checking out a non-existent git tag ("git checkout r26.02") in
the tensorflow_backend repo; replace this with the correct tag or branch that
matches your Triton version (or instruct users to run "git fetch --tags" and
choose the matching release) so the tensorflow_backend clone can build; update
the line referencing git checkout r26.02 to use the proper tag/branch name (or
advise using the repository's main/release branch) and mention verifying the
Triton version-to-backend tag mapping for the tensorflow_backend project.

171-176: ⚠️ Potential issue | 🟡 Minor

Add language identifier to code fence at line 171 (MD040).

The code block is missing a language identifier. There are additional unfenced code blocks in this file at lines 88, 98, 106, 118, and 175.

Fix

````diff
-```
+```text
 E0326 10:10:41.293702 1 model_lifecycle.cc:654] "failed to load 'model_name' version 1:
 Invalid argument: unable to find backend library for backend 'tensorflow',
 try specifying runtime on the model configuration."
````

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In @tests/model_serving/model_runtime/triton/README.md around lines 171 - 176,
Add a language identifier (e.g., "text") to the fenced code block that contains
the error message starting with E0326...model_lifecycle.cc:654 and also update
the other unfenced code blocks referenced (the blocks around lines with content
at 88, 98, 106, 118, and 175) so they use a fenced code block with a language
tag (use text) to satisfy MD040; ensure each opening fence is ```text and the
corresponding closing fence remains ``` so rendering and linting are fixed.


🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In @tests/model_serving/model_runtime/triton/README.md:

  • Line 25: Replace the mutable image tag nvcr.io/nvidia/tritonserver:25.02-py3
    with an immutable digest-pinned reference and add maintenance notes documenting
    vulnerability scanning; run docker manifest inspect nvcr.io/nvidia/tritonserver:25.02-py3 to fetch the current sha256:<digest>
    and update the README entries to nvcr.io/nvidia/tritonserver@sha256:<digest>,
    and add a "Security Verification" section (or extend it) with the trivy image nvcr.io/nvidia/tritonserver@sha256:<digest> scan command and guidance to review
    CVE reports before bumping versions so the file no longer references mutable
    tags.

Duplicate comments:
In @tests/model_serving/model_runtime/triton/README.md:

  • Around line 46-52: The README instructs checking out a non-existent git tag
    ("git checkout r26.02") in the tensorflow_backend repo; replace this with the
    correct tag or branch that matches your Triton version (or instruct users to run
    "git fetch --tags" and choose the matching release) so the tensorflow_backend
    clone can build; update the line referencing git checkout r26.02 to use the
    proper tag/branch name (or advise using the repository's main/release branch)
    and mention verifying the Triton version-to-backend tag mapping for the
    tensorflow_backend project.
  • Around line 171-176: Add a language identifier (e.g., "text") to the fenced
    code block that contains the error message starting with
    E0326...model_lifecycle.cc:654 and also update the other unfenced code blocks
    referenced (the blocks around lines with content at 88, 98, 106, 118, and 175)
    so they use a fenced code block with a language tag (use text) to satisfy MD040; ensure each opening fence is ```text and the corresponding closing fence
    remains ``` so rendering and linting are fixed.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository YAML (base), Central YAML (inherited), Organization UI (inherited)

Review profile: CHILL

Plan: Pro

Run ID: 2369638e-b0c5-4991-a849-0ab615f06d96

📥 Commits

Reviewing files that changed from the base of the PR and between e02c26f2913c47d0b3c5e94334daab1b52265519 and 7ee543b95bfa0adeac58068b9645725a12968f58.

📒 Files selected for processing (1)

  • tests/model_serving/model_runtime/triton/README.md



### Current Default Version

- **Image**: `nvcr.io/nvidia/tritonserver:25.02-py3`


⚠️ Potential issue | 🟠 Major

Pin container images by digest and document vulnerability scanning (CWE-1104).

Mutable tags can be overwritten, enabling supply chain attacks. NGC containers support digest references for immutable verification.

🔒 Recommended approach

1. Fetch the digest for the current tag:
   docker manifest inspect nvcr.io/nvidia/tritonserver:25.02-py3 | jq -r '.manifests[0].digest'
2. Update references to use the digest:
   -**Image**: `nvcr.io/nvidia/tritonserver:25.02-py3`
   +**Image**: `nvcr.io/nvidia/tritonserver:25.02-py3@sha256:<digest>`
3. Document vulnerability scanning in maintenance notes:
   ## Security Verification
   - Scan images with `trivy image nvcr.io/nvidia/tritonserver@sha256:<digest>`
   - Review CVE reports before updating default version

Also applies to: 105-105, 117-117

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/model_serving/model_runtime/triton/README.md` at line 25, Replace the
mutable image tag `nvcr.io/nvidia/tritonserver:25.02-py3` with an immutable
digest-pinned reference and add maintenance notes documenting vulnerability
scanning; run `docker manifest inspect nvcr.io/nvidia/tritonserver:25.02-py3` to
fetch the current `sha256:<digest>` and update the README entries to
`nvcr.io/nvidia/tritonserver@sha256:<digest>`, and add a "Security Verification"
section (or extend it) with the `trivy image
nvcr.io/nvidia/tritonserver@sha256:<digest>` scan command and guidance to review
CVE reports before bumping versions so the file no longer references mutable
tags.

Collaborator

@dbasunag dbasunag left a comment


Please address the dco and pre-commit failures. PRs can't be merged without those passing.


3 participants