Merged
Changes from 5 commits
2 changes: 1 addition & 1 deletion .pre-commit-config.yaml
Original file line number Diff line number Diff line change
@@ -56,7 +56,7 @@ repos:
always_run: true
files: ^distribution/.*$
additional_dependencies:
- git+https://github.com/opendatahub-io/llama-stack.git@v0.3.0rc3+rhai0
- uv>=0.9.0

- id: doc-gen
name: Distribution Documentation
27 changes: 14 additions & 13 deletions distribution/Containerfile
@@ -2,16 +2,17 @@
# Generated by: distribution/build.py
FROM registry.access.redhat.com/ubi9/python-312@sha256:95ec8d3ee9f875da011639213fd254256c29bc58861ac0b11f290a291fa04435
WORKDIR /opt/app-root
RUN pip install uv
RUN pip install sqlalchemy # somehow sqlalchemy[asyncio] is not sufficient
RUN pip install --upgrade \
RUN uv pip install --upgrade \
'kfp-kubernetes==2.14.6' \
'pyarrow>=21.0.0' \
'botocore==1.35.88' \
'boto3==1.35.88' \
'aiobotocore==2.16.1' \
'ibm-cos-sdk-core==2.14.2' \
'ibm-cos-sdk==2.14.2'
RUN pip install \
RUN uv pip install \
'datasets>=4.0.0' \
'mcp>=1.8.1' \
'pymilvus[milvus-lite]>=2.4.10' \
@@ -48,17 +49,17 @@ RUN pip install \
tqdm \
transformers \
uvicorn
RUN pip install \
llama_stack_provider_lmeval==0.3.1
RUN pip install \
llama_stack_provider_ragas==0.4.2
RUN pip install \
llama_stack_provider_ragas[remote]==0.4.2
RUN pip install \
llama_stack_provider_trustyai_fms==0.2.3
RUN pip install 'torchao>=0.12.0' --extra-index-url https://download.pytorch.org/whl/cpu torch torchvision
RUN pip install --no-deps sentence-transformers
RUN pip install --no-cache --no-deps git+https://github.com/opendatahub-io/llama-stack.git@v0.3.0rc3+rhai0
RUN uv pip install \
llama_stack_provider_lmeval==0.4.1
RUN uv pip install \
llama_stack_provider_ragas==0.5.1
RUN uv pip install \
llama_stack_provider_ragas[remote]==0.5.1
RUN uv pip install \
llama_stack_provider_trustyai_fms==0.3.0
RUN uv pip install --extra-index-url https://download.pytorch.org/whl/cpu 'torchao>=0.12.0' torch torchvision
RUN uv pip install --no-deps sentence-transformers
RUN uv pip install --no-cache --no-deps git+https://github.com/opendatahub-io/llama-stack.git@main
RUN mkdir -p ${HOME}/.llama ${HOME}/.cache
COPY distribution/run.yaml ${APP_ROOT}/run.yaml
COPY --chmod=755 distribution/entrypoint.sh ${APP_ROOT}/entrypoint.sh
1 change: 1 addition & 0 deletions distribution/Containerfile.in
@@ -1,6 +1,7 @@
FROM registry.access.redhat.com/ubi9/python-312@sha256:95ec8d3ee9f875da011639213fd254256c29bc58861ac0b11f290a291fa04435
WORKDIR /opt/app-root

RUN pip install uv
RUN pip install sqlalchemy # somehow sqlalchemy[asyncio] is not sufficient
{dependencies}
{llama_stack_install_source}
12 changes: 6 additions & 6 deletions distribution/README.md
@@ -4,18 +4,19 @@

This image contains the official Open Data Hub Llama Stack distribution, with all the packages and configuration needed to run a Llama Stack server in a containerized environment.

The image is currently shipping with the Open Data Hub version of Llama Stack version [0.3.0rc3+rhai0](https://github.com/opendatahub-io/llama-stack/releases/tag/v0.3.0rc3+rhai0)
The image is currently shipping with the Open Data Hub version of Llama Stack version [57b22a0](https://github.com/opendatahub-io/llama-stack/commit/57b22a07f798c09713ee80836306ff1584c376fa)

You can see an overview of the APIs and Providers the image ships with in the table below.

| API | Provider | External? | Enabled by default? | How to enable |
|-----|----------|-----------|---------------------|---------------|
| agents | inline::meta-reference | No | ✅ | N/A |
| batches | inline::reference | No | ✅ | N/A |
| datasetio | inline::localfs | No | ✅ | N/A |
| datasetio | remote::huggingface | No | ✅ | N/A |
| eval | inline::trustyai_ragas | Yes (version 0.4.2) | ❌ | Set the `EMBEDDING_MODEL` environment variable |
| eval | remote::trustyai_lmeval | Yes (version 0.3.1) | ✅ | N/A |
| eval | remote::trustyai_ragas | Yes (version 0.4.2) | ❌ | Set the `KUBEFLOW_LLAMA_STACK_URL` environment variable |
| eval | inline::trustyai_ragas | Yes (version 0.5.1) | ❌ | Set the `EMBEDDING_MODEL` environment variable |
| eval | remote::trustyai_lmeval | Yes (version 0.4.1) | ✅ | N/A |
| eval | remote::trustyai_ragas | Yes (version 0.5.1) | ❌ | Set the `KUBEFLOW_LLAMA_STACK_URL` environment variable |
| files | inline::localfs | No | ✅ | N/A |
| inference | inline::sentence-transformers | No | ✅ | N/A |
| inference | remote::azure | No | ❌ | Set the `AZURE_API_KEY` environment variable |
@@ -24,11 +25,10 @@ You can see an overview of the APIs and Providers the image ships with in the ta
| inference | remote::vertexai | No | ❌ | Set the `VERTEX_AI_PROJECT` environment variable |
| inference | remote::vllm | No | ❌ | Set the `VLLM_URL` environment variable |
| inference | remote::watsonx | No | ❌ | Set the `WATSONX_API_KEY` environment variable |
| safety | remote::trustyai_fms | Yes (version 0.2.3) | ✅ | N/A |
| safety | remote::trustyai_fms | Yes (version 0.3.0) | ✅ | N/A |
| scoring | inline::basic | No | ✅ | N/A |
| scoring | inline::braintrust | No | ✅ | N/A |
| scoring | inline::llm-as-judge | No | ✅ | N/A |
| telemetry | inline::meta-reference | No | ✅ | N/A |
| tool_runtime | inline::rag-runtime | No | ✅ | N/A |
| tool_runtime | remote::brave-search | No | ✅ | N/A |
| tool_runtime | remote::model-context-protocol | No | ✅ | N/A |
154 changes: 108 additions & 46 deletions distribution/build.py
@@ -11,9 +11,10 @@
import subprocess
import sys
import os
import shlex
from pathlib import Path

CURRENT_LLAMA_STACK_VERSION = "0.3.0rc3+rhai0"
CURRENT_LLAMA_STACK_VERSION = "main"
LLAMA_STACK_VERSION = os.getenv("LLAMA_STACK_VERSION", CURRENT_LLAMA_STACK_VERSION)
Comment on lines +17 to 18
Contributor

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

Verify version change from pre-release to "main" branch.

CURRENT_LLAMA_STACK_VERSION changed from "0.3.0rc3+rhai0" to "main". This switches from a specific pre-release version to tracking the main branch, which could introduce instability if the upstream main branch has breaking changes.

Consider:

  1. Is your team prepared to handle potential breaking changes from upstream main?
  2. Should you use a specific commit SHA instead for reproducible builds?
  3. Update documentation to reflect this tracking strategy

🌐 Web query:

What are the best practices for pinning dependencies to git branches vs tags in Python projects

💡 Result:

Short answer: pin immutable refs (tags or commit hashes) for production / reproducible builds; use branches only for tracking live development or when you deliberately want moving targets — but validate them with CI and lockfiles. Key practices:

  • Prefer tags (semantic-version tags) or exact commit SHAs for reproducible builds — tags are stable release markers, SHAs are completely immutable. Use these in git: URLs (or Poetry/Pipenv rev/tag) when installing from VCS. [1][2][3]
  • Only use branches when you need a moving target (e.g., testing against a library’s main/next branch). Treat branch pins as risky for production because they change under you. If you must use a branch, pin it in CI and test frequently. [2][4]
  • Always maintain and commit a lock file (poetry.lock, Pipfile.lock, requirements.txt with hashes) so CI and colleagues reproduce identical dependency trees even if manifests point at ranges or branches. Lockfiles + VCS-tagged deps give best reproducibility. [5][1]
  • Prefer semantic version tags for published packages and keep a clear tagging policy (vMAJOR.MINOR.PATCH, annotated tags, changelog). This supports automation and traceability. [4][3]
  • Automate pin maintenance and security updates (Dependabot, Renovate, CI jobs). Configure rules to update pinned git-tag or SHA references (or open PRs when a new tag appears) and always run your test suite before merging. [6][7]
  • When referencing VCS deps in manifests, be explicit: use @tag or @ (or Poetry’s tag/rev keys) rather than leaving the default branch unspecified. Example: requests @ git+https://github.com/owner/requests.git@v1.2.3 or @38eb5d3b. [1][2]
  • Record provenance and release notes: annotate tags and keep changelogs so consumers know what each pinned tag/sha represents. Use CI to publish artifacts from tags so releases are reproducible. [4][3]
  • Policy checklist for projects:
    • Production packages: pin to tag or commit SHA + commit lockfile. [1][2]
    • Development/test only: branch pins acceptable but run frequent CI + automated update checks. [2][6]
    • Use automated tools (Dependabot/Renovate) to surface updates and security fixes; review & test before updating pins. [6][7]

Selected sources:

  • Poetry git dependency docs (how to pin branch/tag/rev). [1]
  • Pipenv best-practices / lockfile guidance. [5]
  • Why use git tags / immutability and SemVer guidance. [4][3]
  • General recommendation to pin for reproducibility (community discussion). [8]
  • GitHub Dependabot / maintaining dependencies guidance (automation & update policy). [6]

If you want, I can: (a) show concrete manifest examples for pip/Poetry/Pipenv, or (b) draft a short CI job that enforces and tests git-pinned deps.
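
For reference, the three pinning styles discussed above can be written as pip-style requirement strings. The repo URL is the one used in this PR; the tag and SHA below are illustrative examples, not a recommendation of specific refs:

```python
# Three ways to pin a VCS dependency, in decreasing order of immutability.
# Repo URL is the one used in this PR; the tag and SHA are examples.
REPO = "git+https://github.com/opendatahub-io/llama-stack.git"

pin_by_sha = f"llama-stack @ {REPO}@57b22a07f798c09713ee80836306ff1584c376fa"  # immutable
pin_by_tag = f"llama-stack @ {REPO}@v0.3.0rc3+rhai0"  # stable release marker
track_branch = f"llama-stack @ {REPO}@main"  # moving target: CI/testing only

for req in (pin_by_sha, pin_by_tag, track_branch):
    print(req)
```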


I'll investigate the codebase context to understand the implications of this version change and whether there are existing mitigations (lockfiles, CI validation, etc.).


🏁 Script executed:

#!/bin/bash
# Investigate build.py and version usage patterns

# 1. Check build.py context around lines 17-18
echo "=== build.py context (lines 10-25) ==="
head -30 distribution/build.py | tail -20

# 2. Check for lockfiles
echo -e "\n=== Looking for lockfiles ==="
find . -maxdepth 2 -type f \( -name "*.lock" -o -name "Pipfile*" -o -name "poetry.lock" \) 2>/dev/null | head -10

# 3. Check for CI configuration
echo -e "\n=== CI configuration files ==="
find . -maxdepth 2 -type f \( -name ".github" -o -name "*.yml" -o -name "*.yaml" \) 2>/dev/null | grep -E "(workflows|\.github)" | head -10

# 4. Search for LLAMA_STACK_VERSION usage
echo -e "\n=== LLAMA_STACK_VERSION usage ==="
grep -r "LLAMA_STACK_VERSION" --include="*.py" 2>/dev/null | head -15

# 5. Check README or docs for version strategy
echo -e "\n=== README mention of versions ==="
grep -i "version\|main\|branch" README.md 2>/dev/null | head -10

Length of output: 1761



🏁 Script executed:

#!/bin/bash
# Check how LLAMA_STACK_VERSION is actually used in installation logic

# 1. Find and show install_llama_stack_from_source function
echo "=== install_llama_stack_from_source implementation ==="
grep -n -A 20 "def install_llama_stack_from_source" distribution/build.py

# 2. Check is_install_from_source
echo -e "\n=== is_install_from_source implementation ==="
grep -n -A 10 "def is_install_from_source" distribution/build.py

# 3. Look for any git clone or VCS-based installation
echo -e "\n=== Git/VCS installation patterns ==="
grep -n "git\|clone\|vcs\|@git" distribution/build.py | head -20

# 4. Check if there's a full requirements file or setup.py
echo -e "\n=== Requirements files ==="
find distribution -name "requirements*.txt" -o -name "setup.py" -o -name "setup.cfg"

# 5. Show full BASE_REQUIREMENTS and PINNED_DEPENDENCIES
echo -e "\n=== Full BASE_REQUIREMENTS and PINNED_DEPENDENCIES ==="
grep -n -A 50 "BASE_REQUIREMENTS" distribution/build.py | head -60

Length of output: 4378




Code correctly routes "main" to git-based installation; review concerns remain valid but not critical.

The code detects that "main" is not a version string (has no dots) and routes it to git installation: git+https://github.com/opendatahub-io/llama-stack.git@main. This works correctly, so the invalid pip syntax in BASE_REQUIREMENTS is bypassed.

However, the review comment's concerns are valid:

  • No lockfile or commit-level pinning: Each build pulls the latest from main, creating reproducibility gaps compared to pinning specific commit SHAs (per best practices: immutable refs (tags or commit hashes) are preferred for reproducible builds; branches should only be used when deliberately tracking a moving target).
  • Breaking changes risk: Upstream main can introduce breaking changes between builds.
  • Documentation needed: The build strategy (main branch tracking) is not documented; the README mentions SHA-based image tags but doesn't clarify version pinning strategy.

The three original suggestions remain sound:

  1. Breaking change preparation: Ensure the team monitors upstream main regularly.
  2. Consider commit SHA pinning: More reproducible than branch tracking; aligns with distribution best practices.
  3. Update documentation: Clarify version strategy in README or build docs.
🤖 Prompt for AI Agents
In distribution/build.py around lines 17-18, CURRENT_LLAMA_STACK_VERSION is set
to "main" which causes builds to pull the moving branch (git+...@main) and
creates non-reproducible images and risk of breakages; change the build flow to
default to a pinned immutable ref (commit SHA or tag) instead of "main", add an
environment/CI mechanism to inject a specific commit SHA for
LLAMA_STACK_VERSION, update README/build docs to state whether builds track
branch vs pinned SHAs, and add a CI/alerting step or checklist to regularly
monitor upstream main if branch tracking is intentionally used.
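
A minimal sketch of the env-injection idea from the prompt above — default to an immutable commit SHA and let CI override it (the default SHA and the variable handling here are assumptions for illustration, not the PR's code):

```python
import os

# Hypothetical default: an immutable commit SHA rather than the "main" branch.
DEFAULT_LLAMA_STACK_REF = "57b22a07f798c09713ee80836306ff1584c376fa"


def resolve_llama_stack_ref() -> str:
    """CI can export LLAMA_STACK_VERSION to override the pinned default."""
    return os.getenv("LLAMA_STACK_VERSION", DEFAULT_LLAMA_STACK_REF)


def source_install_url(ref: str) -> str:
    """Build the VCS URL consumed by `uv pip install`."""
    return f"git+https://github.com/opendatahub-io/llama-stack.git@{ref}"


print(source_install_url(resolve_llama_stack_ref()))
```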

BASE_REQUIREMENTS = [
f"llama-stack=={LLAMA_STACK_VERSION}",
@@ -30,7 +31,7 @@
"'ibm-cos-sdk==2.14.2'",
]

source_install_command = """RUN pip install --no-cache --no-deps git+https://github.com/opendatahub-io/llama-stack.git@v{llama_stack_version}"""
source_install_command = """RUN uv pip install --no-cache --no-deps git+https://github.com/opendatahub-io/llama-stack.git@{llama_stack_version}"""


def get_llama_stack_install(llama_stack_version):
@@ -47,10 +48,10 @@ def is_install_from_source(llama_stack_version):
return "." not in llama_stack_version or "+rhai" in llama_stack_version


def check_llama_installed():
def check_package_installed(package_name):
"""Check if llama binary is installed and accessible."""
if not shutil.which("llama"):
print("Error: llama binary not found. Please install it first.")
if not shutil.which(package_name):
print(f"Error: {package_name} not found. Please install it first.")
sys.exit(1)


@@ -87,9 +88,32 @@ def check_llama_stack_version():
print("Continuing without version validation...")


def install_llama_stack_from_source(llama_stack_version):
"""Install llama-stack from source using git."""
print("installing llama-stack from source...")
try:
result = subprocess.run(
f"uv pip install git+https://github.com/opendatahub-io/llama-stack.git@{llama_stack_version}",
shell=True,
check=True,
capture_output=True,
text=True,
)
# Print stdout if there's any output
if result.stdout:
print(result.stdout)
except subprocess.CalledProcessError as e:
print(f"Error installing llama-stack: {e}")
if e.stdout:
print(f"stdout: {e.stdout}")
if e.stderr:
print(f"stderr: {e.stderr}")
sys.exit(1)


def get_dependencies():
"""Execute the llama stack build command and capture dependencies."""
cmd = "llama stack build --config distribution/build.yaml --print-deps-only"
cmd = "llama stack list-deps distribution/build.yaml"
Contributor

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

Verify the new command syntax works correctly.

The command changed from llama stack build --config ... --print-deps-only to llama stack list-deps ..., removing the --config flag. Ensure this new command:

  • Exists in the version being used
  • Accepts the config path as a positional argument
  • Returns the expected output format

🏁 Script executed:

#!/bin/bash
# Description: Verify the new list-deps command exists and check its output format

# Check if the command is available and what format it returns
llama stack list-deps --help || echo "Error: list-deps command not found"

# Try running the command to see the output format
if [ -f "distribution/build.yaml" ]; then
  echo "=== Testing list-deps command output ==="
  llama stack list-deps distribution/build.yaml | head -20
fi

Length of output: 164


I'll investigate this further by examining the codebase and checking for documentation on the command change.

Good news: the llama stack list-deps command exists in the official Llama Stack documentation. Now I need to verify whether it accepts a file path as an argument. Let me search for more specific usage examples.

I found important documentation. Let me now examine the actual code to understand the context better.


🏁 Script executed:

#!/bin/bash
# Find and examine the get_dependencies function in distribution/build.py

rg -n "get_dependencies" --type=py -A 10 -B 5

Length of output: 2004


Fix the command syntax: llama stack list-deps requires a distribution name, not a file path.

Official documentation shows llama stack list-deps takes a distribution name (e.g., <distro-name>), not a file path. Examples from PyPI show usage as llama stack list-deps meta-reference-gpu or llama stack list-deps starter.

The current code at line 93 uses llama stack list-deps distribution/build.yaml, which will fail. Either:

  • Pass a distribution name instead of a file path, or
  • Revert to the original llama stack build --config distribution/build.yaml --print-deps-only syntax if config file support is needed

Also update the docstring on line 92 from "Execute the llama stack build command" to reflect the new command being used.
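
A sketch of the first suggested fix — building the argv for the documented `llama stack list-deps <distro-name>` form (the distro name below is illustrative; substitute whatever name this repo's build config registers):

```python
import shlex


def build_list_deps_cmd(distro_name: str) -> list[str]:
    # Constructs the documented `llama stack list-deps <distro-name>` invocation;
    # pass the resulting argv to subprocess.run without shell=True.
    return shlex.split(f"llama stack list-deps {shlex.quote(distro_name)}")


print(build_list_deps_cmd("starter"))
```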

try:
result = subprocess.run(
cmd, shell=True, capture_output=True, text=True, check=True
@@ -101,56 +125,91 @@ def get_dependencies():
no_cache = []

for line in result.stdout.splitlines():
if line.strip().startswith("uv pip"):
# Split the line into command and packages
parts = line.replace("uv ", "RUN ", 1).split(" ", 3)
line = line.strip()
if not line: # Skip empty lines
continue

# Handle both "uv pip" format and direct package list format
if line.startswith("uv pip"):
# Legacy format: "uv pip install ..."
line = line.replace("uv ", "RUN ", 1)
parts = line.split(" ", 3)
if len(parts) >= 4: # We have packages to sort
cmd_parts = parts[:3] # "RUN pip install"
packages = sorted(
set(parts[3].split())
) # Sort the package names and remove duplicates

# Add quotes to packages with > or < to prevent bash redirection
packages = [
f"'{package}'"
if (">" in package or "<" in package)
else package
for package in packages
]

# Modify pymilvus package to include milvus-lite extra
packages = [
package.replace("pymilvus", "pymilvus[milvus-lite]")
if "pymilvus" in package and "[milvus-lite]" not in package
else package
for package in packages
]
packages = sorted(set(packages))

# Determine command type and format accordingly
if ("--index-url" in line) or ("--extra-index-url" in line):
full_cmd = " ".join(cmd_parts + [" ".join(packages)])
torch_deps.append(full_cmd)
elif "--no-deps" in line:
full_cmd = " ".join(cmd_parts + [" ".join(packages)])
no_deps.append(full_cmd)
elif "--no-cache" in line:
full_cmd = " ".join(cmd_parts + [" ".join(packages)])
no_cache.append(full_cmd)
else:
formatted_packages = " \\\n ".join(packages)
full_cmd = f"{' '.join(cmd_parts)} \\\n {formatted_packages}"
standard_deps.append(full_cmd)
packages_str = parts[3]
else:
standard_deps.append(" ".join(parts))
continue
else:
# New format: just packages, possibly with flags
cmd_parts = ["RUN", "uv", "pip", "install"]
packages_str = line

# Parse packages and flags from the line
# Use shlex.split to properly handle quoted package names
parts_list = shlex.split(packages_str)
packages = []
flags = []
extra_index_url = None

i = 0
while i < len(parts_list):
if parts_list[i] == "--extra-index-url" and i + 1 < len(parts_list):
extra_index_url = parts_list[i + 1]
flags.extend([parts_list[i], parts_list[i + 1]])
i += 2
elif parts_list[i] == "--index-url" and i + 1 < len(parts_list):
flags.extend([parts_list[i], parts_list[i + 1]])
i += 2
elif parts_list[i] in ["--no-deps", "--no-cache"]:
flags.append(parts_list[i])
i += 1
else:
packages.append(parts_list[i])
i += 1

# Sort and deduplicate packages
packages = sorted(set(packages))

# Add quotes to packages with > or < to prevent bash redirection
packages = [
f"'{package}'" if (">" in package or "<" in package) else package
for package in packages
]

# Modify pymilvus package to include milvus-lite extra
packages = [
package.replace("pymilvus", "pymilvus[milvus-lite]")
if "pymilvus" in package and "[milvus-lite]" not in package
else package
for package in packages
]
packages = sorted(set(packages))
Comment on lines +180 to +187
Contributor

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

Verify and document the pymilvus package modification.

The code automatically replaces pymilvus with pymilvus[milvus-lite], which is a significant functional change that:

  • Adds the milvus-lite extra to all pymilvus installations
  • Could affect package size and dependencies
  • May not be necessary or desired in all deployment scenarios

Additionally, note that line 164 sorts the packages again after they were already sorted on line 149, which is redundant.

Consider:

  • Documenting why this extra is required
  • Making this behavior configurable if different deployment targets need different extras
  • Removing the redundant sort on line 164

🌐 Web query:

Is pymilvus[milvus-lite] the recommended way to install pymilvus, and what are the differences from the base package?

💡 Result:

Short answer: Yes — the Milvus docs recommend installing pymilvus (with the milvus‑lite extra or the bundled milvus‑lite) for a local/single‑file "Milvus Lite" experience. Use either pip install -U pymilvus (recent pymilvus releases include milvus‑lite) or pip install -U "pymilvus[milvus-lite]" to explicitly request the extra. [1][2][3]

What changes vs. the base package

  • pymilvus (base): the Python SDK that talks to a remote Milvus server (Standalone/Distributed/Cloud). If you install plain pymilvus versions before milvus‑lite was bundled, milvus‑lite may not be present. [2]
  • pymilvus[milvus-lite] (extra) or pymilvus versions >= 2.4.2: also installs the milvus‑lite implementation (an embedded, SQLite‑backed local Milvus server), its CLI/tools (milvus-lite), and additional dependencies needed to run Milvus Lite locally (suitable for prototyping, demos, notebooks). Milvus docs note Milvus Lite is not recommended for production. [1][2][3]

Recommendation: For local development/prototyping, install either pip install -U pymilvus (if using a recent release that bundles milvus‑lite) or pip install -U "pymilvus[milvus-lite]" to be explicit. For production or large‑scale deployments use a standalone/distributed Milvus server instead. [1][2][3]

Sources:
[1] Milvus docs — Run Milvus Lite locally (installation notes).
[2] Milvus docs (v2.4.x) — milvus‑lite included in pymilvus >= 2.4.2.
[3] milvus‑lite PyPI (usage and CLI info).


I have the web search results clarifying that pymilvus[milvus-lite] is indeed a recommended approach. Now I need to examine the build.py file to verify the redundant sort claim and understand the deployment context.


---

🏁 Script executed:

#!/bin/bash
# View lines 140-170 of distribution/build.py to see sorting context
fd -t f "build.py" --path "*/distribution/*" -x cat {} | head -170 | tail -40

Length of output: 233



---

🏁 Script executed:

#!/bin/bash
# Find and view distribution/build.py around lines 140-170
find . -name "build.py" -path "*/distribution/*" -exec sed -n '140,170p' {} \;

Length of output: 1237



---

🏁 Script executed:

#!/bin/bash
# View broader context of distribution/build.py (first 80 lines + lines around 140-165)
sed -n '1,80p' distribution/build.py && echo "---" && sed -n '130,170p' distribution/build.py

Length of output: 4430



---

🏁 Script executed:

#!/bin/bash
# Search for environment variables, deployment flags, or scenario logic in build.py
grep -n "environment\|deploy\|production\|dev\|demo\|scenario\|ENV\|getenv" distribution/build.py | head -20

Length of output: 86


Remove redundant sort on line 164 and document the pymilvus modification.

The pymilvus[milvus-lite] extra is indeed recommended for local development and prototyping, but Milvus Lite is not recommended for production. Given this unconditional modification in a build script with no deployment-specific logic, consider:

  • Remove the redundant packages = sorted(set(packages)) on line 164 (packages were already sorted on line 149; the quote-adding and replacement operations don't create new duplicates)
  • Add a comment explaining why pymilvus[milvus-lite] is required (e.g., "for local development/demos")
  • If this script supports production deployments, make the pymilvus modification conditional or remove it entirely in favor of a standalone Milvus server
🤖 Prompt for AI Agents
In distribution/build.py around lines 157 to 164, remove the redundant final
sorting call (delete the packages = sorted(set(packages)) on line 164) because
packages were already sorted earlier and the replacement won't introduce
duplicates; add a short inline comment above the pymilvus replacement explaining
the reason (e.g., "# Add pymilvus[milvus-lite] for local development/demos; not
recommended for production"); and if this build script may be used for
production, make the pymilvus modification conditional (e.g., guard it with an
explicit flag or environment variable like USE_MILVUS_LITE) so production builds
keep the plain pymilvus package.
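
A sketch of the conditional rewrite suggested above; the `USE_MILVUS_LITE` flag is hypothetical, and the rewrite logic mirrors the list comprehension in the diff:

```python
import os


def maybe_add_milvus_lite(packages, use_milvus_lite=None):
    # Add pymilvus[milvus-lite] for local development/demos; Milvus Lite is
    # not recommended for production, so gate the rewrite behind a
    # (hypothetical) USE_MILVUS_LITE environment flag.
    if use_milvus_lite is None:
        use_milvus_lite = os.getenv("USE_MILVUS_LITE", "1") == "1"
    if not use_milvus_lite:
        return packages
    return [
        p.replace("pymilvus", "pymilvus[milvus-lite]")
        if "pymilvus" in p and "[milvus-lite]" not in p
        else p
        for p in packages
    ]


print(maybe_add_milvus_lite(["pymilvus>=2.4.10"], use_milvus_lite=True))
```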


# Build the command based on flags
if extra_index_url or "--index-url" in flags:
# Torch dependencies with extra index URL
full_cmd = " ".join(cmd_parts + flags + packages)
torch_deps.append(full_cmd)
elif "--no-deps" in flags:
full_cmd = " ".join(cmd_parts + flags + packages)
no_deps.append(full_cmd)
elif "--no-cache" in flags:
full_cmd = " ".join(cmd_parts + flags + packages)
no_cache.append(full_cmd)
else:
# Standard dependencies with multi-line formatting
formatted_packages = " \\\n ".join(packages)
full_cmd = f"{' '.join(cmd_parts)} \\\n {formatted_packages}"
standard_deps.append(full_cmd)

# Combine all dependencies in specific order
all_deps = []

# Add pinned dependencies FIRST to ensure version compatibility
if PINNED_DEPENDENCIES:
pinned_packages = " \\\n ".join(PINNED_DEPENDENCIES)
pinned_cmd = f"RUN pip install --upgrade \\\n {pinned_packages}"
pinned_cmd = f"RUN uv pip install --upgrade \\\n {pinned_packages}"
all_deps.append(pinned_cmd)

all_deps.extend(sorted(standard_deps)) # Regular pip installs
@@ -203,8 +262,11 @@ def generate_containerfile(dependencies, llama_stack_install):


def main():
check_package_installed("uv")
install_llama_stack_from_source(LLAMA_STACK_VERSION)

print("Checking llama installation...")
check_llama_installed()
check_package_installed("llama")

# Do not perform version check if installing from source
if not is_install_from_source(LLAMA_STACK_VERSION):
10 changes: 4 additions & 6 deletions distribution/build.yaml
@@ -16,25 +16,23 @@ distribution_spec:
- provider_type: inline::faiss
safety:
- provider_type: remote::trustyai_fms
module: llama_stack_provider_trustyai_fms==0.2.3
module: llama_stack_provider_trustyai_fms==0.3.0
agents:
- provider_type: inline::meta-reference
eval:
- provider_type: remote::trustyai_lmeval
module: llama_stack_provider_lmeval==0.3.1
module: llama_stack_provider_lmeval==0.4.1
- provider_type: inline::trustyai_ragas
module: llama_stack_provider_ragas==0.4.2
module: llama_stack_provider_ragas==0.5.1
- provider_type: remote::trustyai_ragas
module: llama_stack_provider_ragas[remote]==0.4.2
module: llama_stack_provider_ragas[remote]==0.5.1
datasetio:
- provider_type: remote::huggingface
- provider_type: inline::localfs
scoring:
- provider_type: inline::basic
- provider_type: inline::llm-as-judge
- provider_type: inline::braintrust
telemetry:
- provider_type: inline::meta-reference
tool_runtime:
- provider_type: remote::brave-search
- provider_type: remote::tavily-search