chore(automl,autorag): refresh embedded pipeline YAMLs from upstream by chrjones-rh · Pull Request #7307 · opendatahub-io/odh-dashboard

chrjones-rh · 2026-04-17T20:07:30Z

https://issues.redhat.com/browse/RHOAIENG-58435

Description

Update compiled pipeline YAMLs from red-hat-data-services/pipelines-components rhoai-3.4 branch, matching pipelines-components#5.

This resolves a blocking issue with AutoML and AutoRAG run execution on disconnected (air-gapped) clusters where the embedded pipeline definitions referenced container images that were not available in the mirrored registry.

Files updated

packages/automl/bff/internal/pipelines/autogluon_tabular_training_pipeline/pipeline.yaml
packages/automl/bff/internal/pipelines/autogluon_timeseries_training_pipeline/pipeline.yaml
packages/autorag/bff/internal/pipelines/documents_rag_optimization_pipeline/pipeline.yaml

How Has This Been Tested?

Verified all three YAMLs match the upstream PR merge commit (a71ed55) byte-for-byte
No code changes — YAML-only update
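
A byte-for-byte check like the one described above could be scripted as follows. This is a minimal sketch, not the author's actual procedure: the helper names `sha256_of` and `files_match` are hypothetical, and it assumes both the checked-in file and a copy of the upstream file are already on local disk.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the hex SHA-256 digest of a file's raw bytes."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def files_match(local: Path, upstream: Path) -> bool:
    """True only if both files contain identical bytes (compared via SHA-256)."""
    return sha256_of(local) == sha256_of(upstream)
```

Comparing digests rather than parsed YAML keeps the check sensitive to whitespace and ordering changes, which is what "byte-for-byte" implies.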

Test Impact

No tests added — pipeline YAML content is validated at runtime by the Kubeflow Pipelines server.

Request review criteria:

Self checklist (all need to be checked):

The developer has manually tested the changes and verified that the changes work
Testing instructions have been added in the PR body (for PRs involving changes that are not immediately obvious).
The developer has added tests or explained why testing cannot be added
The code follows our Best Practices

If you have UI changes:

Included any necessary screenshots or gifs if it was a UI change.
Included tags to the UX team if it was a UI/UX change.

N/A -- YAML-only change, no UI impact.

After the PR is posted & before it merges:

The developer has tested their solution on a cluster by using the image produced by the PR to main

Summary by CodeRabbit

Chores
- Updated container image versions for AutoGluon tabular training, AutoGluon timeseries training, and RAG optimization pipelines.
- Refreshed embedded pipeline component archives for improved stability and performance.

Update compiled pipeline YAMLs from red-hat-data-services/pipelines-components rhoai-3.4 branch (matching pipelines-components#5). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

coderabbitai · 2026-04-17T20:07:49Z

Walkthrough

Three KFP pipeline YAML files were modified to replace embedded KFP component archives (the __KFP_EMBEDDED_ARCHIVE_B64 base64 payloads) and to update container image digests for multiple executors. Files changed: autogluon tabular training pipeline, autogluon timeseries training pipeline, and autorag RAG optimization pipeline. No pipeline wiring, task definitions, parameters, resource settings, or public API signatures were altered.
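
To see which image digests a pipeline file pins, the `image:` lines can be collected with a regular expression. The sketch below is illustrative only: `pinned_images` is a hypothetical helper, and it assumes each executor image appears on its own `image: <repo>@sha256:<digest>` line, as in the diff hunk quoted later in this thread.

```python
import re
from collections import defaultdict

# Matches lines such as "image: registry.example/repo@sha256:<64 hex chars>".
IMAGE_RE = re.compile(
    r"^\s*image:\s*(\S+?)@sha256:([0-9a-f]{64})\s*$", re.MULTILINE
)


def pinned_images(yaml_text: str) -> dict[str, set[str]]:
    """Map each image repository to the set of sha256 digests pinned for it."""
    found: dict[str, set[str]] = defaultdict(set)
    for repo, digest in IMAGE_RE.findall(yaml_text):
        found[repo].add(digest)
    return dict(found)
```

A repository that maps to more than one digest within a single file would suggest that some executors were not refreshed together.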

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Security and verification notes

Verify embedded archive integrity: decode each __KFP_EMBEDDED_ARCHIVE_B64 payload and inspect contents for unexpected binaries, scripts, or credential material. Check hashes and provenance. (Relevant: CWE-494 — Download of Code Without Integrity Check.)
Validate image digests and provenance: confirm each updated registry digest maps to an intended, signed image release and review registry metadata and image manifests. Scan images for known vulnerabilities before deployment.
Ensure cryptographic verification is present: there are no signatures or attestations in the diff; add or verify image and component signature checks (e.g., Notary, TUF, or Cosign) to prevent tampering. (Relevant: CWE-347 — Improper Verification of Cryptographic Signature.)
CI/source control audit: ensure the archive regeneration step is reproducible and logged in CI so changes to embedded payloads are auditable (supply-chain control point).
Actionable remediation steps:
- Decode and review each embedded archive, validate file list and checksums.
- Run container image vulnerability scans and record results; block images with critical/high CVEs.
- Implement or verify use of signed images and signed component archives; enforce verification at runtime.
- Add CI checks to prevent accidental embedding of secrets or unexpected binaries.
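
The "decode and review" step could be sketched in Python as below. This is an assumption-laden illustration rather than CodeRabbit's or the upstream project's tooling: `audit_embedded_archive` is a hypothetical helper, and it assumes the payload is a base64-encoded tar archive (optionally compressed), which is what the `__KFP_EMBEDDED_ARCHIVE_B64` name and the `extractall` call discussed in this thread suggest.

```python
import base64
import io
import tarfile
from pathlib import PurePosixPath


def audit_embedded_archive(b64_payload: str) -> list[str]:
    """Decode a base64 tarball and list suspicious members without extracting it.

    An empty list means nothing was flagged by these particular checks.
    """
    raw = base64.b64decode(b64_payload)
    findings: list[str] = []
    # "r:*" lets tarfile auto-detect the compression, if any.
    with tarfile.open(fileobj=io.BytesIO(raw), mode="r:*") as tar:
        for member in tar.getmembers():
            name = PurePosixPath(member.name)
            if name.is_absolute():
                findings.append(f"absolute path: {member.name}")
            if ".." in name.parts:
                findings.append(f"parent-directory component: {member.name}")
            if member.issym() or member.islnk():
                findings.append(f"link entry: {member.name}")
            if member.isdev():
                findings.append(f"device or FIFO entry: {member.name}")
    return findings
```

Because the archive is only listed, never extracted, this kind of review can run in CI without writing anything to disk.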

🚥 Pre-merge checks | ✅ 4

✅ Passed checks (4 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title accurately summarizes the main change: updating embedded pipeline YAMLs from upstream, affecting automl and autorag packages.
Description check	✅ Passed	The description includes issue reference, detailed explanation of changes, testing verification, and completed self-checklist addressing the template requirements comprehensively.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.


nickmazzi · 2026-04-17T20:27:51Z

/lgtm
/approve

nickmazzi · 2026-04-17T20:39:12Z

/approve cancel

chrjones-rh · 2026-04-17T20:40:02Z

Switching to draft until we have successful run results for all pipelines.

codecov · 2026-04-21T20:40:02Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 63.91%. Comparing base (1ce3f26) to head (08c83ff).
⚠️ Report is 11 commits behind head on main.

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #7307      +/-   ##
==========================================
- Coverage   65.04%   63.91%   -1.14%     
==========================================
  Files        2458     2513      +55     
  Lines       76354    77939    +1585     
  Branches    19257    19818     +561     
==========================================
+ Hits        49668    49812     +144     
- Misses      26686    28127    +1441
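
The headline percentages can be reproduced from the Hits and Lines rows. The sketch below assumes coverage is simply hits divided by lines, which is consistent with the table (Hits plus Misses equals Lines in both columns); `coverage_pct` is a hypothetical helper, not part of Codecov.

```python
def coverage_pct(hits: int, lines: int) -> float:
    """Line coverage as a percentage of covered lines over total lines."""
    return 100.0 * hits / lines


# Figures taken from the Hits and Lines rows of the report above.
base = coverage_pct(49668, 76354)  # main
head = coverage_pct(49812, 77939)  # PR #7307
delta = head - base

print(f"base={base:.4f}% head={head:.4f}% delta={delta:+.4f}%")
```

Working from the unrounded ratios gives a drop of roughly 1.14 points, which matches the reported delta even though subtracting the two displayed percentages gives 1.13.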

see 80 files with indirect coverage changes


coderabbitai

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

packages/autorag/bff/internal/pipelines/documents_rag_optimization_pipeline/pipeline.yaml (1)
1029-1040: ⚠️ Potential issue | 🟠 Major

CWE-22: Unfiltered tarfile.extractall() on embedded archives without filter= argument enables path traversal and symlink attacks.

The embedded base64 tarballs in these KFP pipeline components are decoded and extracted via __kfp_tar.extractall(path=__KFP_EMBEDDED_ASSET_DIR) with no filter= parameter. Per PEP 706, Python 3.12 and 3.13 emit a DeprecationWarning when no filter is given and keep the permissive fully_trusted behavior; Python 3.14 changes the default to filter='data'. More critically, on interpreters that still use the permissive default, any path-traversal, symlink, or device-file entry in the tarball (../../etc/passwd, absolute paths, /dev/*) will be honored and written outside the intended __KFP_EMBEDDED_ASSET_DIR. Because that directory is then prepended to sys.path, a crafted archive could achieve arbitrary code execution at component import time.

Affects 5 locations across 3 files:

packages/autorag/bff/internal/pipelines/documents_rag_optimization_pipeline/pipeline.yaml:1036

packages/automl/bff/internal/pipelines/autogluon_tabular_training_pipeline/pipeline.yaml:185, 667

packages/automl/bff/internal/pipelines/autogluon_timeseries_training_pipeline/pipeline.yaml:295, 807

These files are generated from red-hat-data-services/pipelines-components; the fix must be applied upstream so the codegen emits filter='data' in the extractall() call. Track in the linked RHOAIENG ticket to ensure the next component refresh includes this hardening.
Suggested fix for upstream codegen
-        __kfp_tar.extractall(path=__KFP_EMBEDDED_ASSET_DIR)
+        __kfp_tar.extractall(path=__KFP_EMBEDDED_ASSET_DIR, filter='data')
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@packages/autorag/bff/internal/pipelines/documents_rag_optimization_pipeline/pipeline.yaml`
around lines 1029 - 1040, The extractall call on the embedded archive is unsafe
(uses __kfp_tar.extractall(path=__KFP_EMBEDDED_ASSET_DIR)) and must be replaced
with a safe extraction (either pass an explicit filter= callable per PEP 706 or
emit a safe extraction helper) that: rejects absolute paths and any member with
path components like '..', rejects symlinks and device files, and only allows
extraction into __KFP_EMBEDDED_ASSET_DIR; update the codegen that writes
extraction logic for symbols __KFP_EMBEDDED_ARCHIVE_B64, __kfp_tar, and
__KFP_EMBEDDED_ASSET_DIR so generated pipeline.yaml uses the safe filter/helper
instead of bare extractall.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In
`@packages/autorag/bff/internal/pipelines/documents_rag_optimization_pipeline/pipeline.yaml`:
- Line 388: The pipeline.yaml entry for the image (the line containing
registry.redhat.io/rhoai/odh-autorag-rhel9@sha256:b51e1c7b2b4b857f4f5ea34654b10326196fdd1a0487012a9f7074ef092a63c5)
no longer matches upstream commit a71ed55 and the pinned digest cannot be found
in the Red Hat catalog; fix by (1) fetching the upstream file at commit a71ed55
and producing a git diff against our pipeline.yaml to show the exact drift, (2)
querying the Red Hat catalog/API for the expected image digest and replacing the
current sha256 value with the verified digest (or revert the entire image line
to the upstream value from a71ed55), and (3) include the diff output in your PR
description and add a short note in the commit message referencing the
verification step and catalog query used.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Repository YAML (base), Central YAML (inherited), Organization UI (inherited)

Review profile: CHILL

Plan: Pro Plus

Run ID: 539e9146-73a1-45b1-951f-f2fdce61807d

📥 Commits

Reviewing files that changed from the base of the PR and between b7a499a and 08c83ff.

📒 Files selected for processing (3)

packages/automl/bff/internal/pipelines/autogluon_tabular_training_pipeline/pipeline.yaml
packages/automl/bff/internal/pipelines/autogluon_timeseries_training_pipeline/pipeline.yaml
packages/autorag/bff/internal/pipelines/documents_rag_optimization_pipeline/pipeline.yaml

coderabbitai · 2026-04-22T19:40:33Z

          \ as f:\n        json.dump(descriptor, f, indent=2)\n\n    logger.info(\"\
          Documents descriptor written to %s\", descriptor_path)\n\n"
-        image: registry.redhat.io/rhoai/odh-autorag-rhel9@sha256:152c98afbf558367807d9dc0bd0a203a71936476039c7130eeecc079d0605cf2
+        image: registry.redhat.io/rhoai/odh-autorag-rhel9@sha256:b51e1c7b2b4b857f4f5ea34654b10326196fdd1a0487012a9f7074ef092a63c5


⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# 1. Confirm all 7 executor images in this file resolve to the same digest
#    (cheap local consistency check).
fd -t f 'pipeline.yaml' packages/autorag packages/automl --exec \
  sh -c 'echo "=== {} ==="; grep -nE "odh-autorag-rhel9@sha256:|odh-autogluon" "{}" | sort -u'

# 2. Diff checked-in file against upstream a71ed55 to validate byte-for-byte claim.
REL="autorag/bff/internal/pipelines/documents_rag_optimization_pipeline/pipeline.yaml"
curl -fsSL "https://raw.githubusercontent.com/red-hat-data-services/pipelines-components/a71ed55/${REL}" -o /tmp/upstream.yaml \
  && diff -u /tmp/upstream.yaml "packages/${REL}" && echo "MATCH" || echo "DRIFT"

# 3. Verify the new digest is actually published for the tag the upstream pins.
curl -fsSL "https://catalog.redhat.com/api/containers/v1/repositories/registry/registry.redhat.io/repository/rhoai/odh-autorag-rhel9/images?filter=repositories.manifest_schema2_digest==sha256:b51e1c7b2b4b857f4f5ea34654b10326196fdd1a0487012a9f7074ef092a63c5" \
  | jq '.data[0] | {tag: .repositories[0].tags[0].name, published: .creation_date, vulnerabilities: .certified}'

Repository: opendatahub-io/odh-dashboard

Length of output: 1446

Byte-for-byte match with upstream a71ed55 contradicted; image digest unverified in Red Hat catalog.

The PR claims this file mirrors red-hat-data-services/pipelines-components @ a71ed55 byte-for-byte, but diff against upstream shows DRIFT—the checked-in file does not match. Additionally, the pinned digest sha256:b51e1c7b2b4b857f4f5ea34654b10326196fdd1a0487012a9f7074ef092a63c5 cannot be verified in the Red Hat container catalog (404 on catalog API query).

Production pods will be pinned to an image that (1) diverges from the claimed upstream version and (2) cannot be verified as legitimately published by Red Hat. This creates CWE-349 (Acceptance of Extraneous Untrusted Data With Trusted Data) and CWE-295 (Improper Certificate Validation) risk. Confirm the exact drift against a71ed55, verify the image digest is actually published, and provide the diff output showing what changed.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@packages/autorag/bff/internal/pipelines/documents_rag_optimization_pipeline/pipeline.yaml` at line 388, The pipeline.yaml entry for the image (the line containing registry.redhat.io/rhoai/odh-autorag-rhel9@sha256:b51e1c7b2b4b857f4f5ea34654b10326196fdd1a0487012a9f7074ef092a63c5) no longer matches upstream commit a71ed55 and the pinned digest cannot be found in the Red Hat catalog; fix by (1) fetching the upstream file at commit a71ed55 and producing a git diff against our pipeline.yaml to show the exact drift, (2) querying the Red Hat catalog/API for the expected image digest and replacing the current sha256 value with the verified digest (or revert the entire image line to the upstream value from a71ed55), and (3) include the diff output in your PR description and add a short note in the commit message referencing the verification step and catalog query used.

chrjones-rh · 2026-04-22T19:56:52Z

RAG run:

ML tabular run:

ML timeseries run:

jefho-rh · 2026-04-22T20:31:54Z

Thanks @chrjones-rh, every flow works well on my end on a connected cluster

✅ AutoML Binary Passing

✅ AutoML Multiclass Passing

✅ AutoML TimeSeries Passing

✅ AutoML Regression Passing

✅ AutoRAG Passing

/lgtm

GAUNSD · 2026-04-22T20:36:33Z

/approve

openshift-ci · 2026-04-22T20:36:44Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: GAUNSD

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details

Needs approval from an approver in each of these files:

~~OWNERS~~ [GAUNSD]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

…7307) (#7363)
* chore(automl,autorag): refresh embedded pipeline YAMLs from upstream
Update compiled pipeline YAMLs from red-hat-data-services/pipelines-components rhoai-3.4 branch (matching pipelines-components#5).
* chore(automl,autorag): refresh embedded pipeline YAMLs from upstream
---------
Co-authored-by: Christopher Jones <chrjones@redhat.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

…pendatahub-io#7307) (opendatahub-io#7363) (#1801)
* chore(automl,autorag): refresh embedded pipeline YAMLs from upstream
Update compiled pipeline YAMLs from red-hat-data-services/pipelines-components rhoai-3.4 branch (matching pipelines-components#5).
* chore(automl,autorag): refresh embedded pipeline YAMLs from upstream
---------
Co-authored-by: Christopher Jones <chrjones@redhat.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

chore(automl,autorag): refresh embedded pipeline YAMLs from upstream

b7a499a

Update compiled pipeline YAMLs from red-hat-data-services/pipelines-components rhoai-3.4 branch (matching pipelines-components#5). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

openshift-ci Bot requested review from MatthewAThompson and NickGagan April 17, 2026 20:07

openshift-ci Bot added area/automl area/autorag area/components labels Apr 17, 2026

openshift-ci Bot assigned nickmazzi Apr 17, 2026

openshift-ci Bot added lgtm approved labels Apr 17, 2026

openshift-ci Bot removed the approved label Apr 17, 2026

chrjones-rh marked this pull request as draft April 17, 2026 20:39

openshift-ci Bot added the do-not-merge/work-in-progress This PR is in WIP state label Apr 17, 2026

chrjones-rh mentioned this pull request Apr 17, 2026

fix(automl,autorag): extend dynamic port-forwarding to S3 and LlamaStack paths #7310

Merged

7 tasks

Merge branch 'main' into RHOAIENG-58435-UI

1ff04e9

openshift-ci Bot removed the lgtm label Apr 21, 2026

chore(automl,autorag): refresh embedded pipeline YAMLs from upstream

08c83ff

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

chrjones-rh requested review from jefho-rh and nickmazzi and removed request for MatthewAThompson and NickGagan April 22, 2026 19:28

chrjones-rh marked this pull request as ready for review April 22, 2026 19:29

openshift-ci Bot removed the do-not-merge/work-in-progress This PR is in WIP state label Apr 22, 2026

openshift-ci Bot requested review from GAUNSD and MatthewAThompson April 22, 2026 19:29

coderabbitai Bot reviewed Apr 22, 2026

openshift-ci Bot assigned jefho-rh Apr 22, 2026

openshift-ci Bot added the lgtm label Apr 22, 2026

openshift-ci Bot added the approved label Apr 22, 2026

openshift-merge-bot Bot merged commit 9677baa into opendatahub-io:main Apr 22, 2026
58 checks passed

nickmazzi mentioned this pull request Apr 22, 2026

chore(automl,autorag): refresh embedded pipeline YAMLs from upstream #7363

Merged
