refactor: simplify container build to use odh base image and single requirements.txt#390
refactor: simplify container build to use odh base image and single requirements.txt#390eoinfennessy wants to merge 3 commits into
Conversation
📝 WalkthroughWalkthroughThis PR restructures the build and distribution system by migrating from Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Security observationsCWE-434 (Unrestricted Upload of File with Dangerous Type) – CWE-426 (Untrusted Search Path) – CWE-78 (OS Command Injection) – CWE-427 (Uncontrolled Search Path Element) – 🚥 Pre-merge checks | ✅ 4✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
|
4dcafa5 to
d15937c
Compare
There was a problem hiding this comment.
Actionable comments posted: 4
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In @.github/workflows/redhat-distro-container.yml:
- Around line 105-109: The workflow currently uses "uv run --with ruamel.yaml
distribution/build.py", which bypasses the lockfile; change it to first run "uv
sync --locked" to install pinned deps from uv.lock and then execute the build
inside that locked environment (invoke distribution/build.py via uv run without
the unpinned "--with" flag); ensure OGX_VERSION env remains set and both
commands run in the same job step so the deterministic locked environment is
used for running distribution/build.py.
In `@distribution/build.py`:
- Around line 115-133: Wrap the two subprocess.run invocations that create the
venv and run the uv pip install (the calls that use venv_path and cmd along with
packages and optional constraints) in try/except for
subprocess.CalledProcessError; on exception, include the failing command,
e.returncode, e.stdout and e.stderr in the raised/logged message before
re-raising (or raise a new RuntimeError with that context) so CI/pre-commit sees
actionable diagnostics rather than an opaque CalledProcessError.
In `@distribution/Containerfile`:
- Line 3: Update the UV_INDEX_STRATEGY environment variable in the
Containerfile: locate the ENV UV_INDEX_STRATEGY declaration and change its value
from "unsafe-best-match" to "first-index" so the package resolver stops at the
first configured index and avoids fallback to untrusted sources (prevents
dependency confusion).
- Line 1: The FROM line in Containerfile currently uses a mutable tag
("quay.io/opendatahub/odh-midstream-python-base-3-12:latest"); replace this with
an immutable digest by inspecting the registry (e.g., podman inspect or docker
inspect to get the image Digest) and change the FROM to use "`@sha256`:<digest>"
instead of ":latest" so builds are reproducible; update any CI/automation that
builds this image to intentionally refresh the digest on approved updates.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Central YAML (base), Organization UI (inherited)
Review profile: CHILL
Plan: Enterprise
Run ID: 3fdac1d1-9434-4a54-ba6d-063c82f88b1f
📒 Files selected for processing (12)
.github/mergify.yml.github/workflows/redhat-distro-container.yml.pre-commit-config.yamldistribution/Containerfiledistribution/Containerfile.indistribution/build.envdistribution/build.pydistribution/constraints.txtdistribution/install-deps.shdistribution/requirements.txtdistribution/versions.envscripts/gen_distro_docs.py
💤 Files with no reviewable changes (3)
- distribution/versions.env
- distribution/Containerfile.in
- distribution/install-deps.sh
| - name: Regenerate distribution artifacts from an arbitrary ogx commit (workflow_dispatch/schedule) | ||
| if: contains(fromJSON('["workflow_dispatch", "schedule"]'), github.event_name) | ||
| env: | ||
| OGX_VERSION: ${{ env.OGX_COMMIT_SHA }} | ||
| run: | | ||
| tmp_build_dir=$(mktemp -d) | ||
| git clone --filter=blob:none --no-checkout https://github.com/opendatahub-io/ogx.git "$tmp_build_dir" | ||
| cd "$tmp_build_dir" | ||
| git checkout "$OGX_VERSION" | ||
| python3 -m venv .venv | ||
| source .venv/bin/activate | ||
| # Extract base version from pyproject.toml (e.g., 0.8.0+rhaiv.0 -> 0.8.0) to pin ogx-api | ||
| OGX_BASE_VERSION=$(python3 -c "import tomllib; t=tomllib.load(open('pyproject.toml','rb')); print(t.get('project',{}).get('version', t.get('tool',{}).get('setuptools_scm',{}).get('fallback_version','')).split('+')[0])") | ||
| pip install --no-cache -e . | ||
| pip install --no-cache "ogx-api==${OGX_BASE_VERSION}" ruamel.yaml | ||
| # now remove the install line from the Containerfile | ||
| cd - | ||
| python3 distribution/build.py | ||
| sed -i '/^RUN.*pip install.*ogx==/d' distribution/Containerfile | ||
| run: uv run --with ruamel.yaml distribution/build.py |
There was a problem hiding this comment.
🧩 Analysis chain
🌐 Web query:
What is the latest stable ruamel.yaml release, and is pinning uv run --with ruamel.yaml==<exact> recommended for reproducible GitHub Actions builds?
💡 Result:
The latest stable release of ruamel.yaml is 0.19.1, released on 2026-01-02 [1]. Regarding your question on reproducibility in GitHub Actions: pinning a dependency with uv run --with ruamel.yaml== is not the recommended approach for ensuring reproducible builds in a project-based workflow [2][3]. While --with can be used to inject specific versions for ad-hoc commands, it does not leverage the project's lockfile (uv.lock), which is the standard mechanism for ensuring deterministic, reproducible environments [2][4]. For reproducible GitHub Actions builds, you should: 1. Maintain a project-level configuration (pyproject.toml) and a generated lockfile (uv.lock) [2][4]. 2. Use uv sync --locked in your GitHub Actions workflow to ensure the environment exactly matches the lockfile [3][5]. 3. Use uv run to execute commands within that synchronized, locked environment [3]. Using --with bypasses the project's locked dependencies and may lead to non-deterministic behavior or dependency conflicts, as it triggers a new resolution process for the requested packages [6][2]. If you must use --with for specific tasks, be aware that it is intended for ephemeral, isolated invocations rather than as a substitute for a project-wide lockfile [6][4].
Citations:
- 1: https://pypi.org/project/ruamel.yaml/
- 2: https://docs.astral.sh/uv/pip/compile/
- 3: https://docs.astral.sh/uv/guides/integration/github/
- 4: https://medium.com/@walczak.coding/python-scripts-that-just-work-how-uv-makes-dependencies-disappear-32017df19b3c
- 5: Add --frozen flag to "sync-and-running" documentation astral-sh/uv#10793
- 6: https://docs.astral.sh/uv/concepts/projects/run/
Use uv sync --locked with a lockfile instead of unpinned --with dependencies (CWE-494).
The uv run --with ruamel.yaml command bypasses the project's locked dependencies and triggers non-deterministic resolution at runtime, creating supply-chain exposure. Per uv documentation, reproducible builds require:
- Maintain
pyproject.tomland generateduv.lock - Use
uv sync --lockedto pin the environment deterministically - Run commands within that locked environment
Update the workflow to use uv sync --locked before executing distribution/build.py.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In @.github/workflows/redhat-distro-container.yml around lines 105 - 109, The
workflow currently uses "uv run --with ruamel.yaml distribution/build.py", which
bypasses the lockfile; change it to first run "uv sync --locked" to install
pinned deps from uv.lock and then execute the build inside that locked
environment (invoke distribution/build.py via uv run without the unpinned
"--with" flag); ensure OGX_VERSION env remains set and both commands run in the
same job step so the deterministic locked environment is used for running
distribution/build.py.
| # Generated by: distribution/build.py | ||
| FROM registry.access.redhat.com/ubi9/python-312@sha256:95ec8d3ee9f875da011639213fd254256c29bc58861ac0b11f290a291fa04435 | ||
| WORKDIR /opt/app-root | ||
| FROM quay.io/opendatahub/odh-midstream-python-base-3-12:latest |
There was a problem hiding this comment.
❓ Verification inconclusive
What is the immutable sha256 digest for quay.io/opendatahub/odh-midstream-python-base-3-12 that should replace the mutable :latest tag?
There is no single, static immutable SHA256 digest for the quay.io/opendatahub/odh-midstream-python-base-3-12 image because it is a frequently updated base image maintained by the Open Data Hub project [1][2]. As of May 19, 2026, the project continues to release updates to its base containers [3][4]. To obtain the current, immutable digest for your specific environment, you must inspect the image directly from the registry. You can retrieve the digest using the following command: podman inspect quay.io/opendatahub/odh-midstream-python-base-3-12 --format '{{.Digest}}' Alternatively, if you are using Docker, you can use: docker inspect quay.io/opendatahub/odh-midstream-python-base-3-12 --format '{{index.RepoDigests 0}}' Using a pinned digest is a recommended security practice to ensure reproducibility and prevent silent updates [5]. Because this image is updated regularly (e.g., via automated dependency updates [3]), you should treat the digest as a temporary pin that requires intentional updates as part of your CI/CD workflow [5].
Citations:
- 1: https://github.com/opendatahub-io/base-containers
- 2: https://github.com/smoparth/base-containers
- 3: chore(deps): update base-images base-containers#230
- 4: opendatahub-io/base-containers@4e80a56
- 5: https://dev.to/nkuziev-sec/why-fixed-container-image-versions-matter-lessons-from-the-trivy-supply-chain-attack-29ke
Script executed:
fd -t f ContainerfileRepository: opendatahub-io/ogx-distribution
Repository: opendatahub-io/ogx-distribution
Exit code: 0
stdout:
distribution/Containerfile
vllm/Containerfile
Script executed:
cat -n distribution/ContainerfileRepository: opendatahub-io/ogx-distribution
Repository: opendatahub-io/ogx-distribution
Exit code: 0
stdout:
1 FROM quay.io/opendatahub/odh-midstream-python-base-3-12:latest
2
3 ENV UV_INDEX_STRATEGY=unsafe-best-match
4
5 COPY distribution/constraints.txt ${APP_ROOT}/constraints.txt
6 COPY distribution/requirements.txt ${APP_ROOT}/requirements.txt
7 RUN uv pip install --constraint ${APP_ROOT}/constraints.txt -r ${APP_ROOT}/requirements.txt
8
9 COPY --chmod=755 distribution/install-common.sh ${APP_ROOT}/install-common.sh
10 RUN ${APP_ROOT}/install-common.sh
11 ENV TIKTOKEN_CACHE_DIR="${HOME}/.cache/tiktoken"
12
13 COPY distribution/config.yaml ${APP_ROOT}/config.yaml
14 COPY --chmod=755 distribution/entrypoint.sh ${APP_ROOT}/entrypoint.sh
15
16 `#TODO`: remove this once we have a stable version of ogx
17 ENV OGX_DISABLE_VERSION_CHECK=true
18
19 ENTRYPOINT [ "/opt/app-root/entrypoint.sh" ]
Pin the base image by digest instead of :latest (CWE-494).
Using a mutable tag makes builds non-reproducible and weakens supply-chain integrity. Pin to an immutable digest by inspecting the image registry: podman inspect quay.io/opendatahub/odh-midstream-python-base-3-12 --format '{{.Digest}}' or equivalent Docker command, then replace :latest with @sha256:<digest>. Treat the digest as a temporary pin requiring intentional updates as part of CI/CD workflows.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@distribution/Containerfile` at line 1, The FROM line in Containerfile
currently uses a mutable tag
("quay.io/opendatahub/odh-midstream-python-base-3-12:latest"); replace this with
an immutable digest by inspecting the registry (e.g., podman inspect or docker
inspect to get the image Digest) and change the FROM to use "`@sha256`:<digest>"
instead of ":latest" so builds are reproducible; update any CI/automation that
builds this image to intentionally refresh the digest on approved updates.
|
|
||
| RUN pip install uv | ||
| ENV UV_INDEX_STRATEGY=unsafe-best-match |
There was a problem hiding this comment.
🧩 Analysis chain
🌐 Web query:
In uv, what are the security implications of UV_INDEX_STRATEGY=unsafe-best-match versus first-index for preventing dependency confusion?
💡 Result:
In uv, the UV_INDEX_STRATEGY setting controls how the resolver interacts with multiple package indexes, which has significant implications for dependency confusion security [1][2]. The default strategy, first-index, is designed to prevent dependency confusion attacks [1][2]. Under this strategy, uv stops searching at the first index where a package is found and limits all candidate versions to those available on that specific index [1][2]. This ensures that if an internal package exists on a private index, uv will not "fall back" to a public index (like PyPI) even if a malicious actor has published a package with the same name there [3][2]. In contrast, unsafe-best-match is an opt-in strategy that searches for a package across all configured indexes and selects the "best" (typically the highest) version found, regardless of which index it originates from [1][2]. While this behavior is more similar to traditional tools like pip and can be necessary for complex dependency resolution scenarios (such as when remediated packages depend on non-remediated ones), it explicitly exposes the project to dependency confusion attacks [3][4][2]. By allowing the resolver to consider versions from public indexes, an attacker can potentially force the installation of a malicious package by publishing a higher version number on a public repository [3][2]. Because of these risks, first-index is the recommended default for security [1][5]. Users who require unsafe-best-match for specific resolution needs should be aware that they are bypassing the built-in protection against package shadowing [3][2].
Citations:
- 1: https://docs.astral.sh/uv/reference/settings/
- 2: https://docs.astral.sh/uv/concepts/indexes/
- 3: https://git.joshthomas.dev/mirrors/uv/src/commit/c58675fdac6bdc16a3c0ab4090600282d49db369/docs/configuration/indexes.md
- 4: https://edu.chainguard.dev/chainguard/libraries/python/build-configuration/
- 5: https://news.ycombinator.com/item?id=41383974
Replace unsafe-best-match with first-index to prevent dependency confusion attacks.
unsafe-best-match searches across all configured package indexes and selects the highest version regardless of origin, exposing the build to CWE-829 dependency confusion. An attacker can publish a malicious package with a higher version on a public index to force installation. Use first-index (the secure default), which stops at the first matching index and prevents fallback to untrusted sources.
Suggested fix
-ENV UV_INDEX_STRATEGY=unsafe-best-match
+ENV UV_INDEX_STRATEGY=first-index📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| ENV UV_INDEX_STRATEGY=unsafe-best-match | |
| ENV UV_INDEX_STRATEGY=first-index |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@distribution/Containerfile` at line 3, Update the UV_INDEX_STRATEGY
environment variable in the Containerfile: locate the ENV UV_INDEX_STRATEGY
declaration and change its value from "unsafe-best-match" to "first-index" so
the package resolver stops at the first configured index and avoids fallback to
untrusted sources (prevents dependency confusion).
d15937c to
1a002ad
Compare
…equirements.txt Switch base image from ubi9/python-312 to odh-midstream-python-base-3-12. Consolidate ogx and all dependencies into a single requirements.txt, eliminating Containerfile.in templating, install-deps.sh, and multiple uv pip install calls. ogx install mode (git vs index) is now controlled by OGX_INSTALL_FROM_SOURCE in build.env. OpenTelemetry instrumentation packages are discovered via a temporary venv matching the container's dependency set. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
1a002ad to
f62172b
Compare
There was a problem hiding this comment.
🧹 Nitpick comments (2)
distribution/build.py (1)
246-259: ⚡ Quick winMissing error context on
opentelemetry-bootstrapfailure.
get_dependencies(line 239-243) catchesCalledProcessErrorand prints stderr before exiting. This function lacks equivalent handling, causing opaque tracebacks if bootstrap fails.Suggested fix
def get_opentelemetry_packages(bootstrap_bin: Path) -> list[str]: """Run opentelemetry-bootstrap to discover instrumentation packages.""" - result = subprocess.run( - [str(bootstrap_bin), "-a", "requirements"], - capture_output=True, - text=True, - check=True, - ) + try: + result = subprocess.run( + [str(bootstrap_bin), "-a", "requirements"], + capture_output=True, + text=True, + check=True, + ) + except subprocess.CalledProcessError as e: + print(f"Error executing opentelemetry-bootstrap: {e}") + print(f"stdout: {e.stdout}") + print(f"stderr: {e.stderr}") + sys.exit(1) packages = []🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@distribution/build.py` around lines 246 - 259, get_opentelemetry_packages currently calls subprocess.run on opentelemetry-bootstrap without handling CalledProcessError, so failures produce opaque tracebacks; update get_opentelemetry_packages to wrap the subprocess.run call in a try/except catching subprocess.CalledProcessError, log or print result.stderr (same pattern used by get_dependencies) and exit with a nonzero code on failure, while keeping the existing parsing logic for result.stdout and the special-case filter for "opentelemetry-instrumentation-botocore".distribution/constraints.txt (1)
4-4: 💤 Low valueAdd comment explaining
setuptools<81.0.0constraint.The
pymilvuspin has clear rationale documented. Thesetuptoolsupper bound lacks context, making future maintenance harder. What breaks with setuptools 81+?🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@distribution/constraints.txt` at line 4, The `setuptools<81.0.0` upper bound lacks an explanation; update distribution/constraints.txt to add an inline comment next to the setuptools<81.0.0 line that states what breaks with setuptools 81+ and links to the root cause (e.g., failing package, traceback, GitHub issue or commit) so future maintainers can verify/reenable newer versions; reference the exact token "setuptools<81.0.0" in the comment and, if available, include an issue/PR number or a short repro note to make the rationale discoverable.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Nitpick comments:
In `@distribution/build.py`:
- Around line 246-259: get_opentelemetry_packages currently calls subprocess.run
on opentelemetry-bootstrap without handling CalledProcessError, so failures
produce opaque tracebacks; update get_opentelemetry_packages to wrap the
subprocess.run call in a try/except catching subprocess.CalledProcessError, log
or print result.stderr (same pattern used by get_dependencies) and exit with a
nonzero code on failure, while keeping the existing parsing logic for
result.stdout and the special-case filter for
"opentelemetry-instrumentation-botocore".
In `@distribution/constraints.txt`:
- Line 4: The `setuptools<81.0.0` upper bound lacks an explanation; update
distribution/constraints.txt to add an inline comment next to the
setuptools<81.0.0 line that states what breaks with setuptools 81+ and links to
the root cause (e.g., failing package, traceback, GitHub issue or commit) so
future maintainers can verify/reenable newer versions; reference the exact token
"setuptools<81.0.0" in the comment and, if available, include an issue/PR number
or a short repro note to make the rationale discoverable.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Central YAML (base), Organization UI (inherited)
Review profile: CHILL
Plan: Enterprise
Run ID: 47db75c2-2f87-46ad-adc5-55950b03a031
📒 Files selected for processing (13)
.github/mergify.yml.github/workflows/redhat-distro-container.yml.pre-commit-config.yamldistribution/Containerfiledistribution/Containerfile.indistribution/build.envdistribution/build.pydistribution/constraints.txtdistribution/install-deps.shdistribution/requirements.txtdistribution/versions.envscripts/gen_distro_docs.pytests/run_integration_tests.sh
💤 Files with no reviewable changes (3)
- distribution/Containerfile.in
- distribution/install-deps.sh
- distribution/versions.env
✅ Files skipped from review due to trivial changes (2)
- .github/mergify.yml
- distribution/requirements.txt
🚧 Files skipped from review as they are similar to previous changes (2)
- distribution/build.env
- .github/workflows/redhat-distro-container.yml
Split the distribution/ directory into build/ (source config and tooling) and dist/ (generated artifacts and container runtime files) to make the source vs. generated distinction structural. Move the Containerfile to the repo root to match its build context. - build/: build.py, build.yaml, build.env, gen_distro_docs.py - dist/: config.yaml, requirements.txt, constraints.txt, entrypoint.sh, install-common.sh, README.md (all generated files marked via .gitattributes) - Containerfile at repo root (conventional placement) - Updated all CI workflows, Tekton pipelines, Mergify, and pre-commit - Fixed stale README documentation (removed Containerfile.in references) - Removed stale pre-commit exclude: 'build/' directive Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
There was a problem hiding this comment.
Actionable comments posted: 7
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In @.github/workflows/redhat-distro-container.yml:
- Around line 15-16: Add the build/** pattern to the workflow's trigger and
publish-change detection lists so changes under build/ (e.g., build/build.py or
build/build.env) will run this workflow; update the arrays that currently
include 'Containerfile' and 'dist/**' (and the similar block around lines 89-95)
to also include 'build/**' to ensure CI and publish gating cover build pipeline
changes.
In `@build/build.py`:
- Around line 193-196: The writes using open(output_path, "w") and
ryaml.dump(config, f) assume the parent directory exists; ensure the parent
directory is created before opening the file by calling a directory-creation
helper (e.g., use os.makedirs or Path(...).parent.mkdir with parents=True,
exist_ok=True) for the parent of output_path, and apply the same fix for the
other occurrence around the calls at lines 275-276 so both artifact writes won't
fail on a clean checkout.
- Around line 229-236: The current re.sub inside the packages list comprehension
is too permissive and rewrites dotted package names into extras (corrupting
valid requirements); restrict the rewrite to only match a single dot immediately
after the base package (no other dots to the left) and anchor to the start so
you only convert true "package.extra" forms, and perform a single substitution.
Concretely, update the re.sub call used in the packages comprehension to use a
regex anchored at the start that captures a left-side token without dots (e.g.
^([^.\s]+)\.([a-zA-Z_][a-zA-Z0-9_]*)(==|>=|<=|>|<|~=|!=)) and replace with
\1[\2]\3, and pass count=1 to re.sub so only the intended occurrence is
converted.
- Around line 216-220: The loop that currently skips index flags (checking
parts_list[i] against "--extra-index-url" and "--index-url" and doing i += 2)
silently drops private index configuration; instead preserve these flags and
their values in the output. Update the logic around parts_list and the
index-flag branch so that when parts_list[i] is "--index-url" or
"--extra-index-url" you append both the flag and its following value to the
output list (or otherwise record them for the generated requirements/command)
and then advance i by 2, rather than discarding them.
In `@build/gen_distro_docs.py`:
- Around line 15-17: The code in gen_distro_docs.py currently hard-codes
repo_owner (e.g., 'opendatahub-io') causing broken links; change it to parse the
configured OGX repo instead: import or accept the OGX_GIT_REPO value from
build.py (or read the same config source) and extract the owner component to set
repo_owner used when building release/commit links for dist/README.md; update
the function or variable that returns the tuple (version, repo_owner) so it
derives repo_owner from OGX_GIT_REPO rather than using the hard-coded literal.
In `@README.md`:
- Line 38: Replace the vague link text "here" in the README sentence that points
to dist/README.md with a descriptive anchor such as "distribution README" (or
similar meaningful text) so the link reads like "you can read more about that in
the distribution README"; update the anchor text where the sentence currently
contains "here" to improve clarity and accessibility.
In `@tests/README.md`:
- Around line 151-153: Update the OGX version workflow step that currently
states pre-commit regenerates "Containerfile": change the description in
tests/README.md (the list under the workflow that references CURRENT_OGX_VERSION
and build/build.py) to instead state that running pre-commit regenerates the
distribution artifacts in dist/ (specifically dist/config.yaml,
dist/requirements.txt, dist/README.md) so contributors are not misled into
thinking Containerfile is auto-generated.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Central YAML (base), Organization UI (inherited)
Review profile: CHILL
Plan: Enterprise
Run ID: 916447a8-eed3-4093-b743-724745f2a6ad
⛔ Files ignored due to path filters (6)
dist/README.mdis excluded by!**/dist/**dist/config.yamlis excluded by!**/dist/**dist/constraints.txtis excluded by!**/dist/**dist/entrypoint.shis excluded by!**/dist/**dist/install-common.shis excluded by!**/dist/**dist/requirements.txtis excluded by!**/dist/**
📒 Files selected for processing (16)
.gitattributes.github/mergify.yml.github/workflows/redhat-distro-container.yml.github/workflows/test-pr-in-showroom.yml.github/workflows/update-ogx-version.yml.pre-commit-config.yaml.tekton/odh-ogx-core-pull-request.yaml.tekton/odh-ogx-core-push.yamlContainerfileREADME.mdbuild/build.envbuild/build.pybuild/build.yamlbuild/gen_distro_docs.pytests/README.mdtests/run_integration_tests.sh
✅ Files skipped from review due to trivial changes (5)
- .tekton/odh-ogx-core-pull-request.yaml
- .gitattributes
- build/build.env
- Containerfile
- .tekton/odh-ogx-core-push.yaml
🚧 Files skipped from review as they are similar to previous changes (1)
- tests/run_integration_tests.sh
| - 'Containerfile' | ||
| - 'dist/**' |
There was a problem hiding this comment.
Add build/** to workflow trigger and publish-change detection.
Changes limited to build/ (for example build/build.py or build/build.env) currently won’t be treated as relevant here, so artifact-generation logic can change without running this workflow or publish gating. That creates a CI coverage gap for the build pipeline.
Suggested patch
pull_request:
...
paths:
- '.github/actions/**'
- '.github/workflows/redhat-distro-container.yml'
- 'Containerfile'
+ - 'build/**'
- 'dist/**'
- 'tests/**'
...
- if git diff --name-only HEAD^ HEAD | grep -qE '^(dist/|Containerfile$)'; then
+ if git diff --name-only HEAD^ HEAD | grep -qE '^(build/|dist/|Containerfile$)'; then
echo "changed=true" >> "$GITHUB_OUTPUT"
- echo "dist/ or Containerfile was modified in this push, will publish"
+ echo "build/, dist/, or Containerfile was modified in this push, will publish"
else
echo "changed=false" >> "$GITHUB_OUTPUT"
- echo "dist/ and Containerfile were not modified in this push, skipping publish"
+ echo "build/, dist/, and Containerfile were not modified in this push, skipping publish"
fiAlso applies to: 89-95
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In @.github/workflows/redhat-distro-container.yml around lines 15 - 16, Add the
build/** pattern to the workflow's trigger and publish-change detection lists so
changes under build/ (e.g., build/build.py or build/build.env) will run this
workflow; update the arrays that currently include 'Containerfile' and 'dist/**'
(and the similar block around lines 89-95) to also include 'build/**' to ensure
CI and publish gating cover build pipeline changes.
| with open(output_path, "w") as f: | ||
| f.write(header) | ||
| ryaml.dump(config, f) | ||
|
|
There was a problem hiding this comment.
Create dist/ before writing generated artifacts.
Both writes assume the parent directory already exists. In a clean checkout, this fails before artifact generation.
Suggested fix
def generate_stripped_config():
@@
output_path = Path("dist/config.yaml")
+ output_path.parent.mkdir(parents=True, exist_ok=True)
@@
def generate_requirements_file(dependencies, ogx_reqs: OgxRequirements, otel_packages):
@@
output_path = Path("dist/requirements.txt")
+ output_path.parent.mkdir(parents=True, exist_ok=True)As per coding guidelines, prioritize “Bug-prone patterns and error handling gaps.”
Also applies to: 275-276
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@build/build.py` around lines 193 - 196, The writes using open(output_path,
"w") and ryaml.dump(config, f) assume the parent directory exists; ensure the
parent directory is created before opening the file by calling a
directory-creation helper (e.g., use os.makedirs or Path(...).parent.mkdir with
parents=True, exist_ok=True) for the parent of output_path, and apply the same
fix for the other occurrence around the calls at lines 275-276 so both artifact
writes won't fail on a clean checkout.
| if parts_list[i] in ( | ||
| "--extra-index-url", | ||
| "--index-url", | ||
| ) and i + 1 < len(parts_list): | ||
| i += 2 |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Verify whether trusted index settings are preserved anywhere in generated/install paths.
set -euo pipefail
echo "1) Locations where index URL flags are stripped or used:"
rg -n --no-heading --fixed-strings -- '--index-url' build/build.py Containerfile distribution/Containerfile .github/workflows .tekton || true
rg -n --no-heading --fixed-strings -- '--extra-index-url' build/build.py Containerfile distribution/Containerfile .github/workflows .tekton || true
echo
echo "2) Check for env-based index config (PIP_INDEX_URL/UV_INDEX_URL):"
rg -n --no-heading -e 'PIP_INDEX_URL|PIP_EXTRA_INDEX_URL|UV_INDEX_URL|UV_EXTRA_INDEX_URL' Containerfile distribution/Containerfile .github/workflows .tekton || trueRepository: opendatahub-io/ogx-distribution
Length of output: 532
🏁 Script executed:
wc -l build/build.pyRepository: opendatahub-io/ogx-distribution
Length of output: 94
🏁 Script executed:
sed -n '200,250p' build/build.pyRepository: opendatahub-io/ogx-distribution
Length of output: 1836
🏁 Script executed:
sed -n '1,50p' build/build.pyRepository: opendatahub-io/ogx-distribution
Length of output: 1407
🏁 Script executed:
grep -n "def get_dependencies" build/build.pyRepository: opendatahub-io/ogx-distribution
Length of output: 129
🏁 Script executed:
sed -n '170,280p' build/build.pyRepository: opendatahub-io/ogx-distribution
Length of output: 3637
🏁 Script executed:
grep -n "generate_requirements_file\|requirements.txt" build/build.pyRepository: opendatahub-io/ogx-distribution
Length of output: 411
🏁 Script executed:
grep -rn "PIP_INDEX_URL\|UV_INDEX_URL\|--index-url\|--extra-index-url" . --include="*.yaml" --include="*.yml" --include="Containerfile" --include="*.sh" 2>/dev/null | head -20Repository: opendatahub-io/ogx-distribution
Length of output: 57
🏁 Script executed:
grep -n "dist/requirements.txt" . -r --include="Containerfile" --include="*.dockerfile" --include="*.yaml" 2>/dev/null || trueRepository: opendatahub-io/ogx-distribution
Length of output: 149
🏁 Script executed:
sed -n '200,240p' build/build.pyRepository: opendatahub-io/ogx-distribution
Length of output: 1472
🏁 Script executed:
grep -B5 -A5 "list-deps" build/build.pyRepository: opendatahub-io/ogx-distribution
Length of output: 492
🏁 Script executed:
cat build/build.yaml | head -30Repository: opendatahub-io/ogx-distribution
Length of output: 1015
Do not discard index URL configuration; stripping enables dependency confusion (CWE-829).
Lines 216–220 silently drop --index-url and --extra-index-url flags from ogx stack list-deps output. These index sources are not preserved in the generated requirements.txt or any fallback configuration. During pip install, dependencies will resolve from PyPI defaults, allowing a same-name package on the public index to shadow a trusted or private dependency.
Capture and preserve index arguments in the output so downstream installation respects the original configuration.
Suggested fix
-def get_dependencies(ogx_bin: Path) -> list[str]:
+def get_dependencies(ogx_bin: Path) -> tuple[list[str], list[str]]:
"""Execute the ogx list-deps command and return a list of package specifiers."""
cmd = [str(ogx_bin), "stack", "list-deps", "build/build.yaml"]
try:
result = subprocess.run(cmd, capture_output=True, text=True, check=True)
packages = []
+ index_args: list[str] = []
for line in result.stdout.splitlines():
line = line.strip()
if not line:
continue
parts_list = shlex.split(line)
i = 0
while i < len(parts_list):
if parts_list[i] in (
"--extra-index-url",
"--index-url",
) and i + 1 < len(parts_list):
+ index_args.extend([parts_list[i], parts_list[i + 1]])
i += 2
elif parts_list[i].startswith("--"):
i += 1
else:
packages.append(parts_list[i])
i += 1
- return packages
+ return packages, index_args🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@build/build.py` around lines 216 - 220, The loop that currently skips index
flags (checking parts_list[i] against "--extra-index-url" and "--index-url" and
doing i += 2) silently drops private index configuration; instead preserve these
flags and their values in the output. Update the logic around parts_list and the
index-flag branch so that when parts_list[i] is "--index-url" or
"--extra-index-url" you append both the flag and its following value to the
output list (or otherwise record them for the generated requirements/command)
and then advance i by 2, rather than discarding them.
| packages = [ | ||
| re.sub( | ||
| r"\.([a-zA-Z_][a-zA-Z0-9_]*)(==|>=|<=|>|<|~=|!=)", | ||
| r"[\1]\2", | ||
| pkg, | ||
| ) | ||
| for pkg in packages | ||
| ] |
There was a problem hiding this comment.
Constrain the dot-to-extras rewrite; current regex corrupts valid requirement names.
Line 230-Line 233 rewrites any .<token><comparator> pattern, so dotted package names can be mutated into extras syntax (incorrect requirement string).
Suggested fix
- packages = [
- re.sub(
- r"\.([a-zA-Z_][a-zA-Z0-9_]*)(==|>=|<=|>|<|~=|!=)",
- r"[\1]\2",
- pkg,
- )
- for pkg in packages
- ]
+ def _rewrite_namespace_extra(pkg: str) -> str:
+ m = re.match(
+ r"^(llama_stack_provider_[A-Za-z0-9_]+)\.([A-Za-z_][A-Za-z0-9_]*)(==|>=|<=|>|<|~=|!=)(.+)$",
+ pkg,
+ )
+ if not m:
+ return pkg
+ base, extra, op, version = m.groups()
+ return f"{base}[{extra}]{op}{version}"
+
+ packages = [_rewrite_namespace_extra(pkg) for pkg in packages]As per coding guidelines, prioritize “Bug-prone patterns and error handling gaps.”
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| packages = [ | |
| re.sub( | |
| r"\.([a-zA-Z_][a-zA-Z0-9_]*)(==|>=|<=|>|<|~=|!=)", | |
| r"[\1]\2", | |
| pkg, | |
| ) | |
| for pkg in packages | |
| ] | |
| def _rewrite_namespace_extra(pkg: str) -> str: | |
| m = re.match( | |
| r"^(llama_stack_provider_[A-Za-z0-9_]+)\.([A-Za-z_][A-Za-z0-9_]*)(==|>=|<=|>|<|~=|!=)(.+)$", | |
| pkg, | |
| ) | |
| if not m: | |
| return pkg | |
| base, extra, op, version = m.groups() | |
| return f"{base}[{extra}]{op}{version}" | |
| packages = [_rewrite_namespace_extra(pkg) for pkg in packages] |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@build/build.py` around lines 229 - 236, The current re.sub inside the
packages list comprehension is too permissive and rewrites dotted package names
into extras (corrupting valid requirements); restrict the rewrite to only match
a single dot immediately after the base package (no other dots to the left) and
anchor to the start so you only convert true "package.extra" forms, and perform
a single substitution. Concretely, update the re.sub call used in the packages
comprehension to use a regex anchored at the start that captures a left-side
token without dots (e.g.
^([^.\s]+)\.([a-zA-Z_][a-zA-Z0-9_]*)(==|>=|<=|>|<|~=|!=)) and replace with
\1[\2]\3, and pass count=1 to re.sub so only the intended occurrence is
converted.
| tuple: (version, repo_owner) where repo_owner is extracted from | ||
| OGX_GIT_REPO in build.py or defaults to 'opendatahub-io' | ||
| """ |
There was a problem hiding this comment.
Stop hard-coding repo_owner; derive it from configured OGX repo.
Line 41 ignores the configured repo and can generate broken release/commit links in dist/README.md when the upstream owner changes.
Suggested fix
def extract_ogx_version():
@@
- return (version, "opendatahub-io")
+ build_py = REPO_ROOT / "build" / "build.py"
+ repo_owner = "opendatahub-io"
+ if build_py.exists():
+ text = build_py.read_text()
+ m = re.search(r'OGX_GIT_REPO\s*=\s*"https://github\.com/([^/]+)/ogx\.git"', text)
+ if m:
+ repo_owner = m.group(1)
+ return (version, repo_owner)As per coding guidelines, prioritize “Architectural issues and anti-patterns.”
Also applies to: 41-41
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@build/gen_distro_docs.py` around lines 15 - 17, The code in
gen_distro_docs.py currently hard-codes repo_owner (e.g., 'opendatahub-io')
causing broken links; change it to parse the configured OGX repo instead: import
or accept the OGX_GIT_REPO value from build.py (or read the same config source)
and extract the owner component to set repo_owner used when building
release/commit links for dist/README.md; update the function or variable that
returns the tuple (version, repo_owner) so it derives repo_owner from
OGX_GIT_REPO rather than using the hard-coded literal.
|
|
||
| > [!TIP] | ||
| > Ensure you include any env vars you might need for providers in the container env - you can read more about that [here](distribution/README.md). | ||
| > Ensure you include any env vars you might need for providers in the container env - you can read more about that [here](dist/README.md). |
There was a problem hiding this comment.
Replace vague link text with descriptive anchor text.
Line 38 uses “here,” which is ambiguous in rendered docs and hurts scanability/accessibility. Use descriptive link text (for example, “distribution README”).
🧰 Tools
🪛 markdownlint-cli2 (0.22.1)
[warning] 38-38: Link text should be descriptive
(MD059, descriptive-link-text)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@README.md` at line 38, Replace the vague link text "here" in the README
sentence that points to dist/README.md with a descriptive anchor such as
"distribution README" (or similar meaningful text) so the link reads like "you
can read more about that in the distribution README"; update the anchor text
where the sentence currently contains "here" to improve clarity and
accessibility.
| 2. **Updates** `CURRENT_OGX_VERSION` in `build/build.py` | ||
| 3. **Runs pre-commit** to regenerate distribution artifacts (Containerfile, README) | ||
| 4. **Opens a pull request** against `main` with the version bump |
There was a problem hiding this comment.
Fix stale artifact regeneration description in OGX version workflow section.
Line 152 says pre-commit regenerates Containerfile, but current flow documents generated artifacts in dist/ (dist/config.yaml, dist/requirements.txt, dist/README.md) while Containerfile is hand-edited. Update this line to prevent contributor confusion during version bumps.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@tests/README.md` around lines 151 - 153, Update the OGX version workflow step
that currently states pre-commit regenerates "Containerfile": change the
description in tests/README.md (the list under the workflow that references
CURRENT_OGX_VERSION and build/build.py) to instead state that running pre-commit
regenerates the distribution artifacts in dist/ (specifically dist/config.yaml,
dist/requirements.txt, dist/README.md) so contributors are not misled into
thinking Containerfile is auto-generated.
Switch base image from
ubi9/python-312toodh-midstream-python-base-3-12. Consolidateogxand all dependencies into a singlerequirements.txt, eliminatingContainerfile.intemplating,install-deps.sh, and multipleuv pip installcalls. ogx install mode (git vs index) is now controlled byOGX_INSTALL_FROM_SOURCEinbuild.env. OpenTelemetry instrumentation packages are discovered via a temporary venv matching the container's dependency set.Stripped out lots of unused dependencies - image is now ~330 MB smaller.
Commit 72fb720 is just a restructure. It splits
distribution/intobuild/anddist/, and moves the Containerfile to the root of the repo.Summary by CodeRabbit
Release Notes
build/directorydist/directory (replacing previous manual processes)