feat: allow for custom run.yaml at container runtime #84
mergify[bot] merged 1 commit into opendatahub-io:main
Conversation
Walkthrough
Replaces the container ENTRYPOINT with a new executable script and adds that script to the image. The script chooses the `llama stack run` target from RUN_CONFIG_PATH, DISTRO_NAME, or a bundled /opt/app-root/run.yaml. The README gains instructions for running with a custom run YAML.
Sequence Diagram(s)
sequenceDiagram
participant Container as Container Start
participant Entrypoint as /opt/app-root/entrypoint.sh
participant Llama as llama stack run
Container->>Entrypoint: exec /opt/app-root/entrypoint.sh "$@"
alt RUN_CONFIG_PATH set & exists
Entrypoint->>Llama: llama stack run "$RUN_CONFIG_PATH" "$@"
else if DISTRO_NAME set
Entrypoint->>Llama: llama stack run "$DISTRO_NAME" "$@"
else
Entrypoint->>Llama: llama stack run /opt/app-root/run.yaml "$@"
end
Llama-->>Container: process starts
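The branching in the diagram can be sketched as a small POSIX shell function. This is an illustrative sketch, not the entrypoint.sh merged in this PR: `select_run_target` is a hypothetical name, and it prints the chosen target instead of exec'ing `llama stack run`, so the selection logic can be exercised without Llama Stack installed.

```shell
#!/bin/sh
# Hypothetical helper (not from the PR): print the target that would be
# handed to `llama stack run`, following the order shown in the diagram.
select_run_target() {
    if [ -n "${RUN_CONFIG_PATH:-}" ] && [ -f "$RUN_CONFIG_PATH" ]; then
        # 1. An explicit config file that exists wins.
        printf '%s\n' "$RUN_CONFIG_PATH"
    elif [ -n "${DISTRO_NAME:-}" ]; then
        # 2. Otherwise fall back to a named distribution.
        printf '%s\n' "$DISTRO_NAME"
    else
        # 3. Otherwise use the run.yaml bundled with the image.
        printf '%s\n' "/opt/app-root/run.yaml"
    fi
}
```

A real entrypoint would end with something like `exec llama stack run "$(select_run_target)" "$@"`, so that the server replaces the shell as the container's main process.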
Estimated code review effort: 🎯 2 (Simple) | ⏱️ ~10 minutes
Pre-merge checks and finishing touches
✅ Passed checks (5 passed)
49bc262 to 07f27cb
Actionable comments posted: 1
🧹 Nitpick comments (1)
distribution/build.py (1)
55-73: Consider making the version threshold configurable.
The hardcoded threshold version `"0.2.23"` (line 64) matches the current `CURRENT_LLAMA_STACK_VERSION` (line 17). When the version is bumped in the future, maintainers must remember to update this threshold if the entrypoint behavior needs to change, creating a maintenance burden.
Consider extracting the threshold as a module-level constant or deriving it from `CURRENT_LLAMA_STACK_VERSION`:
    # Near the top of the file, after CURRENT_LLAMA_STACK_VERSION
    ENTRYPOINT_CHANGE_VERSION = "0.2.23"  # Version where entrypoint changed from python -m to llama stack run

    def get_entrypoint(llama_stack_version):
        """Determine the appropriate ENTRYPOINT based on llama-stack version."""
        if is_install_from_source(llama_stack_version):
            return 'ENTRYPOINT ["llama", "stack", "run"]'
        try:
            current_version = version.parse(llama_stack_version)
            threshold_version = version.parse(ENTRYPOINT_CHANGE_VERSION)
            # ... rest of function
Additionally, consider narrowing the exception handling on line 70 from `except Exception` to `except version.InvalidVersion` to catch only version parsing errors, allowing other unexpected errors to surface properly.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
distribution/Containerfile (1 hunks)
distribution/Containerfile.in (1 hunks)
distribution/build.py (5 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-09-15T14:25:54.837Z
Learnt from: nathan-weinberg
PR: opendatahub-io/llama-stack-distribution#33
File: distribution/Containerfile:17-21
Timestamp: 2025-09-15T14:25:54.837Z
Learning: In the opendatahub-io/llama-stack-distribution repository, the distribution/Containerfile is auto-generated by distribution/build.py based on configuration in build.yaml. When providers are added to build.yaml, the build script automatically regenerates the Containerfile with the required dependencies. Changes to the Containerfile should not be flagged as manual edits if they correspond to legitimate changes in the build configuration.
Applied to files:
distribution/Containerfile
distribution/Containerfile.in
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: build-test-push (linux/amd64)
- GitHub Check: Summary
🔇 Additional comments (3)
distribution/build.py (2)
15-15: LGTM!
The `packaging.version` import is appropriate for semantic version comparison and ensures correct handling of version strings.
195-216: LGTM!
The integration of the `entrypoint` parameter into `generate_containerfile()` and the main flow is correct and complete. The function signature update, template substitution, and call site changes properly implement the dynamic entrypoint feature.
Also applies to: 246-250
distribution/Containerfile (1)
62-63: Critical: Environment variable won't expand in CMD exec form.
Docker does not expand environment variables in JSON array (exec) form. The literal string `"${APP_ROOT}/run.yaml"` will be passed to the entrypoint instead of the expanded path `/opt/app-root/run.yaml`, causing the default startup to fail.
Solution: Use the literal path in the JSON array:
     ENTRYPOINT ["llama", "stack", "run"]
    -CMD ["${APP_ROOT}/run.yaml"]
    +CMD ["/opt/app-root/run.yaml"]
Note: Users can still override the config path at runtime with `docker run <image> /custom/path/to/run.yaml`, which satisfies the PR objective to allow custom run.yaml files.
⛔ Skipped due to learnings
Learnt from: nathan-weinberg PR: opendatahub-io/llama-stack-distribution#33 File: distribution/Containerfile:17-21 Timestamp: 2025-09-15T14:25:54.837Z Learning: In the opendatahub-io/llama-stack-distribution repository, the distribution/Containerfile is auto-generated by distribution/build.py based on configuration in build.yaml. When providers are added to build.yaml, the build script automatically regenerates the Containerfile with the required dependencies. Changes to the Containerfile should not be flagged as manual edits if they correspond to legitimate changes in the build configuration.
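The exec-form behavior described in that comment can be illustrated with a plain shell analogy. This is only a sketch of the principle (no shell means no variable expansion), not a reproduction of how the container runtime launches processes. The function names are hypothetical, and the `APP_ROOT` value is assumed from the path quoted in the comment.

```shell
#!/bin/sh
# Assumed value, taken from the expanded path named in the review comment.
APP_ROOT=/opt/app-root
export APP_ROOT

# Exec-form analogue: the single-quoted argument reaches the program
# verbatim, because no shell ever evaluates it.
exec_form_arg() {
    printf '%s\n' '${APP_ROOT}/run.yaml'
}

# Shell-form analogue: a shell evaluates the string first, so the
# variable is expanded before the program sees the argument.
shell_form_arg() {
    sh -c 'printf "%s\n" "${APP_ROOT}/run.yaml"'
}
```

Calling `exec_form_arg` prints the unexpanded placeholder, while `shell_form_arg` prints the resolved path. That difference is why the suggestion above hardcodes the literal path in the JSON array.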
07f27cb to 67fc897
2d74b58 to 0382ced
Actionable comments posted: 1
📜 Review details
📒 Files selected for processing (1)
README.md (1 hunks)
🧰 Additional context used
🪛 LanguageTool
README.md
[style] ~69-~69: Using many exclamation marks might seem excessive (in this case: 3 exclamation marks for a text that’s 2062 characters long)
Context: ...k: \ <path_in_container> ``` > [!IMPORTANT] > The distribution image ship...
(EN_EXCESSIVE_EXCLAMATION)
[grammar] ~70-~70: Ensure spelling is correct
Context: ...es already pre-installed. There is no guarentee that your custom run YAML will nessesar...
(QB_NEW_EN_ORTHOGRAPHY_ERROR_IDS_1)
[grammar] ~70-~70: Use a hyphen to join words.
Context: ...There is no guarentee that your custom run YAML will nessesarily work with the ...
(QB_NEW_EN_HYPHEN)
[grammar] ~70-~70: Ensure spelling is correct
Context: ...uarentee that your custom run YAML will nessesarily work with the included dependencies.
(QB_NEW_EN_ORTHOGRAPHY_ERROR_IDS_1)
[grammar] ~70-~70: There might be a mistake here.
Context: ...ily work with the included dependencies.
(QB_NEW_EN)
a90eb0c to 0d59530
Actionable comments posted: 0
🧹 Nitpick comments (1)
README.md (1)
59-59: Consider more concise wording for clarity.
The phrase "To do so, run the image in the following way. The "path" mentioned should be the path to your custom run YAML file." is slightly verbose. Consider condensing to:
"To use a custom run YAML, run the image with the path to your file as the final argument, mounted as a volume:"
This removes redundancy and gets straight to the instruction.
📜 Review details
📒 Files selected for processing (1)
README.md (1 hunks)
🧰 Additional context used
🪛 LanguageTool
README.md
[style] ~69-~69: Using many exclamation marks might seem excessive (in this case: 3 exclamation marks for a text that’s 2062 characters long)
Context: ...k: \ <path_in_container> ``` > [!IMPORTANT] > The distribution image ship...
(EN_EXCESSIVE_EXCLAMATION)
[grammar] ~70-~70: Use a hyphen to join words.
Context: ...There is no guarantee that your custom run YAML will necessarily work with the ...
(QB_NEW_EN_HYPHEN)
[grammar] ~70-~70: There might be a mistake here.
Context: ...ily work with the included dependencies.
(QB_NEW_EN)
🔇 Additional comments (1)
README.md (1)
57-70: Documentation addition is clear and well-aligned with PR objectives.
The new section provides users with clear instructions on how to supply a custom `run.yaml` file, complete with a practical example and an important caveat about dependency compatibility. The spelling corrections from the previous review ("guarantee" and "necessarily") are in place.
However, verify that the example command syntax (`<path_in_container>` as a positional argument after the image name) reflects the actual container behavior with the new entrypoint mechanism. This assumes the container ENTRYPOINT is a script or binary that accepts the path as a CMD argument, which aligns with the PR objective to decouple the entrypoint from the run.yaml path.
Confirm that:
- The new entrypoint script or binary correctly interprets the `<path_in_container>` argument
- The default behavior (when no custom path is provided) still uses `/opt/app-root/run.yaml`
- The container can run without arguments and still work with the default run.yaml
You may find this easier to verify once the container images are built or if you can review the entrypoint implementation (likely in `distribution/build.py` or a shell script entrypoint).
cdoern left a comment
If the dependencies we package remain locked: that pretty much only allows for the same set of providers but with different config options or maybe different server level config options, right?
Not sure if this is possible or worth it, but if the deps are locked, it might make sense to validate that people are not changing the providers or introducing a provider with different dependencies. Otherwise this feature will commonly break
Correct
See the README changes
Are you pointing me to the README where it says …? I am suggesting a simple diff against the provided run.yaml to check whether the providers changed and whether the new providers require net-new dependencies. To me, this seems like introducing a feature which will more often than not cause failures, so some sort of guardrail is advisable IMO. Let me know if this makes sense and is possible! Thanks
Why do we want to avoid them? This is just to let people try their own run YAMLs; we aren't invested in making sure they work. That's why I pointed you here.
It's really more of a utility for development. Any user trying to actually use this as a production server (either standalone or via the operator) should be using the official run YAML.
Fair, all I am saying is that more often than not this will break without manual manipulation of the dependencies in the container.
True, but note we are already doing these custom run YAMLs from the operator. See this comment, llamastack/llama-stack-k8s-operator#171 (comment), which led to the issue that this PR is resolving. This simply allows us to move the functionality from the operator to the distro image, which is preferred. We can iterate with some additional checking like you've mentioned, but this PR is more focused on moving that functionality and removing some hardcoded things that really shouldn't be hardcoded.
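The "simple diff against the provided run.yaml" guardrail raised in this thread could look roughly like the sketch below. It is not part of the PR. Both function names are hypothetical, and it assumes providers are declared on lines of the form `provider_type: <value>`, unquoted and without trailing comments; that layout is an assumption here and is not confirmed anywhere in the thread.

```shell
#!/bin/sh
# Hypothetical helper: list the provider_type values found in a run YAML,
# one per line, sorted and de-duplicated.
# Assumes plain `provider_type: value` lines (no quotes, no trailing comments).
list_provider_types() {
    sed -n 's/^[[:space:]-]*provider_type:[[:space:]]*//p' "$1" | sort -u
}

# Hypothetical helper: print provider types that appear in the custom
# run YAML ($2) but not in the bundled one ($1).
new_provider_types() {
    npt_bundled=$(mktemp)
    npt_custom=$(mktemp)
    list_provider_types "$1" > "$npt_bundled"
    list_provider_types "$2" > "$npt_custom"
    # comm -13 keeps only the lines unique to the second (custom) file.
    comm -13 "$npt_bundled" "$npt_custom"
    rm -f "$npt_bundled" "$npt_custom"
}
```

An entrypoint could log a warning when `new_provider_types /opt/app-root/run.yaml "$RUN_CONFIG_PATH"` prints anything, since a provider absent from the bundled config is the case most likely to need dependencies the image does not ship. A text-level check like this cannot tell whether a provider's dependencies are actually missing; it only flags candidates.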
@nathan-weinberg do we know which customizations users commonly expect to perform by following this path? For example, I can see the example you linked below; I was exploring whether we have any insight into broader customizations that we can test proactively.
The customization is just that you can provide an alternative assuming you have a |
803859c to ffa6c23
3e112b5 to 8ee6257
@nathan-weinberg is this downstream yet?
no
this commit adds an entrypoint shell script that aligns with the upstream containers
it allows users to pass a custom run.yaml if they so choose while keeping the official run.yaml we ship with the distro image as a default
Signed-off-by: Nathan Weinberg <[email protected]>
8ee6257 to 1d07eff
skamenan7 left a comment
Looks good to me with small nits.
if [ -n "$RUN_CONFIG_PATH" ] && [ -f "$RUN_CONFIG_PATH" ]; then
    exec llama stack run "$RUN_CONFIG_PATH" "$@"
fi
Thanks Nathan for this PR. A few small nits.
If RUN_CONFIG_PATH is set but the file doesn't exist, the container silently falls through to the default config. Adding an explicit error or a clear message in the logs might help.
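One way to act on this nit is to fail fast when RUN_CONFIG_PATH points at a missing file. The sketch below is illustrative only and is not the script merged in this PR. `resolve_run_target` is a hypothetical name, and it prints the chosen target rather than exec'ing the server so the behavior is easy to check.

```shell
#!/bin/sh
# Hypothetical helper: like the quoted branch, but a set-but-missing
# RUN_CONFIG_PATH is reported on stderr instead of being silently ignored.
resolve_run_target() {
    if [ -n "${RUN_CONFIG_PATH:-}" ]; then
        if [ ! -f "$RUN_CONFIG_PATH" ]; then
            echo "ERROR: RUN_CONFIG_PATH is set to '$RUN_CONFIG_PATH' but no such file exists" >&2
            return 1
        fi
        printf '%s\n' "$RUN_CONFIG_PATH"
    elif [ -n "${DISTRO_NAME:-}" ]; then
        printf '%s\n' "$DISTRO_NAME"
    else
        printf '%s\n' "/opt/app-root/run.yaml"
    fi
}
```

With this shape, a typo in the mounted path stops the container with a clear message, rather than starting the server with the bundled run.yaml and leaving the user to wonder why their settings were ignored.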
@@ -0,0 +1,12 @@
#!/bin/sh
set -e
It would be a good idea to use the following instead, to catch pipeline errors too:
set -euo pipefail
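One caveat worth checking before adopting this: the script's shebang is `#!/bin/sh`, and not every `sh` accepts `set -o pipefail` (dash, for example, has historically rejected it). A hedged sketch of a portable variant follows. `enable_strict_mode` is a hypothetical helper name, and whether the image's `/bin/sh` supports pipefail is not stated in this thread.

```shell
#!/bin/sh
# Hypothetical helper: enable strict mode, adding pipefail only when the
# running shell supports it.
enable_strict_mode() {
    # Exit on errors and treat unset variables as errors.
    set -eu
    # Probe in a subshell so an unsupported option cannot abort the script.
    if (set -o pipefail) 2>/dev/null; then
        set -o pipefail
    fi
}
```

Note that `-u` changes how the quoted test behaves: `[ -n "$RUN_CONFIG_PATH" ]` aborts when the variable is unset, so the script would need `"${RUN_CONFIG_PATH:-}"` there.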
Downstream first please!
Is it urgent, or do we wait until @nathan-weinberg is up and running?
There is considerably more testing here, so IMO it makes more sense to get consensus here, where we have a better idea of whether the change will break the image. I will open a downstream MR as well, but I want to get consensus and merge here first, because maintaining and updating two branches concurrently is not a good use of time!
…s-1236 chore: Update EA2 wheel tags to 1236
What does this PR do?
this commit changes the server start command to be dynamically generated based on Llama Stack version
it allows users to pass a custom run.yaml if they so choose while keeping the official run.yaml we ship with the distro image as a default
Closes #83