chore(automl,autorag): refresh embedded pipeline YAMLs from upstream #7244
Update compiled pipeline definitions from red-hat-data-services/pipelines-components rhoai-3.4 branch. Container images moved from quay.io to registry.redhat.io with updated digests.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
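The registry move described above can be checked mechanically. The following is a minimal sketch, not part of the PR: it parses an image reference into registry, repository, tag, and digest, then tests whether the reference is digest-pinned on an expected registry. The names `ImageRef`, `parse_image`, and `is_pinned_to` are hypothetical, and the regular expression is a simplification that assumes the first path component is the registry host. The two sample strings are the old and new image lines from this PR's diff.

```python
import re
from typing import NamedTuple, Optional


class ImageRef(NamedTuple):
    registry: str
    repository: str
    tag: Optional[str]
    digest: Optional[str]


# Simplified grammar (assumption): <registry>/<repository>[:<tag>][@sha256:<64 hex>]
_IMAGE_RE = re.compile(
    r"^(?P<registry>[^/]+)/(?P<repository>[^:@]+)"
    r"(?::(?P<tag>[^@]+))?"
    r"(?:@(?P<digest>sha256:[0-9a-f]{64}))?$"
)


def parse_image(ref: str) -> ImageRef:
    """Split an image reference into its parts, or raise ValueError."""
    match = _IMAGE_RE.match(ref)
    if match is None:
        raise ValueError(f"Unrecognized image reference: {ref}")
    return ImageRef(
        match["registry"], match["repository"], match["tag"], match["digest"]
    )


def is_pinned_to(ref: str, registry: str) -> bool:
    """True when the image lives on `registry` and carries a sha256 digest."""
    image = parse_image(ref)
    return image.registry == registry and image.digest is not None


# Old and new image lines taken from the diff in this PR.
OLD_IMAGE = (
    "quay.io/rhoai/odh-autorag-rhel9:rhoai-3.4"
    "@sha256:06a2892228510011c9081acf73fe140e46bd13e5aad6ecb9721f902ed5418ec5"
)
NEW_IMAGE = (
    "registry.redhat.io/rhoai/odh-autorag-rhel9"
    "@sha256:7883d99fa6eb94841622f2130edb121a32cc5e7a3ddc894cee5cc50b7042a4fd"
)

if __name__ == "__main__":
    for ref in (OLD_IMAGE, NEW_IMAGE):
        print(parse_image(ref), is_pinned_to(ref, "registry.redhat.io"))
```

A check like this only confirms the shape of the reference. It does not confirm that the digest exists on the registry or that the image is signed.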
📝 Walkthrough
Container image references and embedded Kubeflow component archive payloads are updated across three pipeline configuration files. The changes replace image digests from
Estimated code review effort: 🎯 2 (Simple) | ⏱️ ~10 minutes
Security & Actionable Issues
Image Registry & Digest Verification Required:
Embedded Archive Payload Integrity:
No public entity signature modifications detected, but manifest-level changes warrant validation of container runtime policies and signed image enforcement in your deployment cluster.
🚥 Pre-merge checks | ✅ 2 passed
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: GAUNSD
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
packages/autorag/bff/internal/pipelines/documents_rag_optimization_pipeline/pipeline.yaml (1)
1029-1037: ⚠️ Potential issue | 🟠 Major
Add member path validation to archive extraction (CWE-22: Path Traversal)
The embedded tar archive extraction at line 1036 uses extractall() without validating member paths. If the archive contains entries with relative parent paths (e.g., ../../../etc/passwd), they will be extracted outside the temp directory. Supply chain compromise of the embedded payload enables arbitrary file write.
Hardening patch
 import tarfile as __kfp_tarfile
 import tempfile as __kfp_tempfile
+from pathlib import Path as __kfp_Path
 # Extract embedded archive at import time to ensure sys.path and globals are set
 __kfp_tmpdir = __kfp_tempfile.TemporaryDirectory()
 __KFP_EMBEDDED_ASSET_DIR = __kfp_tmpdir.name
 try:
     __kfp_bytes = __kfp_b64.b64decode(__KFP_EMBEDDED_ARCHIVE_B64.encode('ascii'))
     with __kfp_tarfile.open(fileobj=__kfp_io.BytesIO(__kfp_bytes), mode='r:gz') as __kfp_tar:
+        __root = __kfp_Path(__KFP_EMBEDDED_ASSET_DIR).resolve()
+        for __m in __kfp_tar.getmembers():
+            __target = (__root / __m.name).resolve()
+            if not str(__target).startswith(str(__root) + __kfp_os.sep):
+                raise RuntimeError(f"Unsafe archive member path: {__m.name}")
         __kfp_tar.extractall(path=__KFP_EMBEDDED_ASSET_DIR)
 except Exception as __kfp_e:
     raise RuntimeError(f'Failed to extract embedded archive: {__kfp_e}')
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/autorag/bff/internal/pipelines/documents_rag_optimization_pipeline/pipeline.yaml` around lines 1029 - 1037, The tar extract uses __kfp_tar.extractall(...) to unpack __KFP_EMBEDDED_ARCHIVE_B64 into __KFP_EMBEDDED_ASSET_DIR allowing path traversal; replace extractall with a safe extraction loop: iterate over __kfp_tar.getmembers(), compute the target path by joining __KFP_EMBEDDED_ASSET_DIR and member.name, normalize it, verify the normalized path starts with the normalized __KFP_EMBEDDED_ASSET_DIR (e.g., via os.path.commonpath or commonprefix), skip or raise on any member that fails the check, and then extract/write only approved members (creating directories as needed) so no entry can escape the temp dir.
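The safe extraction loop described in the prompt above can be sketched as a standalone function. This is an illustration under stated assumptions rather than the patch applied upstream: `safe_extract` is a hypothetical name, the containment check uses `os.path.commonpath` as the prompt suggests, and the rejection of link members is an extra precaution that the review comment does not ask for.

```python
import os
import tarfile


def safe_extract(tar: tarfile.TarFile, dest: str) -> None:
    """Extract `tar` into `dest`, refusing members that would land outside it."""
    root = os.path.realpath(dest)
    for member in tar.getmembers():
        # Resolve where this member would be written, following any ".." parts.
        target = os.path.realpath(os.path.join(root, member.name))
        # commonpath equals root only when target is root itself or lies below it.
        if os.path.commonpath([root, target]) != root:
            raise RuntimeError(f"Unsafe archive member path: {member.name}")
        # Assumption: the payload needs no links, so reject them outright,
        # since a link can point outside dest even when its own name is safe.
        if member.issym() or member.islnk():
            raise RuntimeError(f"Links are not allowed: {member.name}")
    # Every member passed validation, so a bulk extract is now acceptable.
    tar.extractall(path=root)
```

Validation runs over all members before anything is written, so a bad entry late in the archive cannot leave a partial extraction behind. On Python versions that support extraction filters, passing `filter="data"` to `extractall` gives comparable protection from the standard library itself.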
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In
`@packages/autorag/bff/internal/pipelines/documents_rag_optimization_pipeline/pipeline.yaml`:
- Line 388: The pipeline references private images like
registry.redhat.io/rhoai/odh-autorag-rhel9@sha256:... but lacks image pull
credentials; update the pipeline spec to include either imagePullSecrets
(pointing to a pre-created Secret with registry.redhat.io credentials), or set a
serviceAccountName bound to a ServiceAccount that has entitlements, or add a
podSpecPatch that injects imagePullSecrets into task/pod templates; locate the
image occurrences in pipeline.yaml (the image string
registry.redhat.io/rhoai/odh-autorag-rhel9@sha256:...) and add the
imagePullSecrets/serviceAccountName/podSpecPatch at the corresponding
spec.template.spec (or PipelineRun/TaskRun) blocks so Kubernetes can
authenticate to the private registry.
📒 Files selected for processing (3)
- packages/automl/bff/internal/pipelines/autogluon_tabular_training_pipeline/pipeline.yaml
- packages/automl/bff/internal/pipelines/autogluon_timeseries_training_pipeline/pipeline.yaml
- packages/autorag/bff/internal/pipelines/documents_rag_optimization_pipeline/pipeline.yaml
  \ as f:\n json.dump(descriptor, f, indent=2)\n\n logger.info(\"\
  Documents descriptor written to %s\", descriptor_path)\n\n"
- image: quay.io/rhoai/odh-autorag-rhel9:rhoai-3.4@sha256:06a2892228510011c9081acf73fe140e46bd13e5aad6ecb9721f902ed5418ec5
+ image: registry.redhat.io/rhoai/odh-autorag-rhel9@sha256:7883d99fa6eb94841622f2130edb121a32cc5e7a3ddc894cee5cc50b7042a4fd
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "1) Confirm all updated images are on registry.redhat.io"
rg -n 'image:\s*registry\.redhat\.io/rhoai/odh-autorag-rhel9@sha256:' \
packages/autorag/bff/internal/pipelines/documents_rag_optimization_pipeline/pipeline.yaml
echo
echo "2) Check this pipeline spec for explicit image pull-secret / SA linkage"
rg -n 'imagePullSecrets|serviceAccountName|podSpecPatch|imagePullSecret' \
packages/autorag/bff/internal/pipelines/documents_rag_optimization_pipeline/pipeline.yaml || true
echo
echo "3) Find ServiceAccount manifests that define imagePullSecrets in repo"
rg -n 'kind:\s*ServiceAccount|imagePullSecrets|registry\.redhat\.io' --iglob '*.{yaml,yml}'
Repository: opendatahub-io/odh-dashboard
Length of output: 9486
Add explicit imagePullSecrets or serviceAccountName to pipeline specification
Lines 388, 669, 997, 1574, 1814, 1904, and 2198 reference registry.redhat.io/rhoai/odh-autorag-rhel9@sha256:... but the pipeline.yaml contains no imagePullSecrets, serviceAccountName, or podSpecPatch directives. Private registry pulls will fail with ImagePullBackOff if the runtime service account lacks entitlements or configured pull credentials.
Configure explicit image pull credentials in the pipeline spec or bind a service account with registry.redhat.io access.
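The first two steps of the analysis script above can also be expressed in Python. This is a rough sketch, not the reviewer's tooling: `audit_pipeline` is a hypothetical helper that scans the pipeline text line by line for digest-pinned registry.redhat.io images and for the same credential-related keywords the script searches for.

```python
import re

# Digest-pinned images on registry.redhat.io, matching the pattern in the finding.
IMAGE_RE = re.compile(r"image:\s*registry\.redhat\.io/\S+@sha256:[0-9a-f]{64}")
# Keywords that would indicate explicit pull-credential wiring in the spec.
CREDENTIAL_RE = re.compile(r"imagePullSecrets?|serviceAccountName|podSpecPatch")


def audit_pipeline(text: str) -> dict:
    """Report 1-based line numbers of registry images and credential directives."""
    image_lines = []
    credential_lines = []
    for number, line in enumerate(text.splitlines(), start=1):
        if IMAGE_RE.search(line):
            image_lines.append(number)
        if CREDENTIAL_RE.search(line):
            credential_lines.append(number)
    return {
        "image_lines": image_lines,
        "credential_lines": credential_lines,
        # Flag only when images are present and no credential directive appears.
        "needs_review": bool(image_lines) and not credential_lines,
    }


if __name__ == "__main__":
    import sys

    # Usage (hypothetical): python audit.py path/to/pipeline.yaml
    with open(sys.argv[1], encoding="utf-8") as handle:
        print(audit_pipeline(handle.read()))
```

A `needs_review` result is only a prompt to look closer. The cluster may supply registry credentials by other means, which a text scan of the pipeline file cannot see.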
335afea into opendatahub-io:main
Codecov Report
✅ All modified and coverable lines are covered by tests.
Additional details and impacted files
@@           Coverage Diff           @@
##             main    #7244   +/-   ##
=======================================
  Coverage   64.80%   64.80%
=======================================
  Files        2441     2441
  Lines       75996    75996
  Branches    19158    19158
=======================================
+ Hits        49250    49252    +2
+ Misses      26746    26744    -2
see 7 files with indirect coverage changes


https://redhat.atlassian.net/browse/RHOAIENG-57588
Description
Refreshes embedded pipeline YAML definitions from the upstream red-hat-data-services/pipelines-components repository (branch: rhoai-3.4).
Container images updated from quay.io to registry.redhat.io with new digests (standard production registry transition).
Updated pipelines:
- autogluon_tabular_training_pipeline/pipeline.yaml: updated image references
- autogluon_timeseries_training_pipeline/pipeline.yaml: updated image references
- documents_rag_optimization_pipeline/pipeline.yaml: updated image references
Files changed:
- packages/automl/bff/internal/pipelines/autogluon_tabular_training_pipeline/pipeline.yaml
- packages/automl/bff/internal/pipelines/autogluon_timeseries_training_pipeline/pipeline.yaml
- packages/autorag/bff/internal/pipelines/documents_rag_optimization_pipeline/pipeline.yaml
How Has This Been Tested?