This directory contains the necessary files to build a Red Hat compatible container image for the llama-stack.
- Python >=3.11
- `llama` CLI tool installed: `pip install llama-stack`
- Podman or Docker installed
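The Python version requirement can be checked with a small POSIX-shell helper. A minimal sketch, assuming a `major.minor` version string (the `meets_requirement` name is ours, not part of any project tooling):

```shell
# meets_requirement: sketch of a ">= 3.11" version check on a
# "major.minor" string (illustrative helper only)
meets_requirement() {
  major=${1%%.*}
  minor=${1#*.}
  [ "$major" -gt 3 ] || { [ "$major" -eq 3 ] && [ "$minor" -ge 11 ]; }
}

meets_requirement "3.11" && echo "3.11 is supported"
meets_requirement "3.9"  || echo "3.9 is too old"
```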
The build script supports three modes:
**Full (default)**: includes all features, including the TrustyAI providers that require Kubernetes/OpenShift:

```shell
./distribution/build.py
```

**Standalone**: builds a version without Kubernetes dependencies, using Llama Guard for safety:

```shell
./distribution/build.py --standalone
```

**Unified**: builds a single container that supports both modes via environment variables:

```shell
./distribution/build.py --unified
```

The Containerfile is auto-generated from a template. To generate it:
- Make sure you have the `llama` CLI tool installed
- Run the build script from the root of this git repo with your desired mode:

```shell
./distribution/build.py [--standalone] [--unified]
```

This will:

- Check for the `llama` CLI installation
- Generate dependencies using `llama stack build`
- Create a new `Containerfile` with the required dependencies
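The first step above, checking for the `llama` CLI, amounts to a PATH lookup. A minimal sketch of such a check (the `require_cli` helper is ours; the real build script may report this differently):

```shell
# require_cli: print a hint and fail if the given command is not on PATH
# (illustrative helper, not taken from the actual build script)
require_cli() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "found: $1"
  else
    echo "missing: $1 (install with: pip install llama-stack)" >&2
    return 1
  fi
}

require_cli sh    # a command that exists, so this prints "found: sh"
```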
The Containerfile is auto-generated from a template. To edit it, modify the template in `distribution/Containerfile.in` and run the build script again. NEVER edit the generated Containerfile manually.
Once the Containerfile is generated, you can build the image using either Podman or Docker:
```shell
# Using Podman
podman build --platform linux/amd64 -f distribution/Containerfile -t llama-stack-rh .

# Using Docker
docker build -f distribution/Containerfile -t llama-stack-rh .
```

To run the container in standalone mode, without Kubernetes dependencies, set the `STANDALONE` environment variable:
```shell
# Using Docker
docker run -e STANDALONE=true \
  -e VLLM_URL=http://host.docker.internal:8000/v1 \
  -e INFERENCE_MODEL=your-model-name \
  -p 8321:8321 \
  llama-stack-rh

# Using Podman
podman run -e STANDALONE=true \
  -e VLLM_URL=http://host.docker.internal:8000/v1 \
  -e INFERENCE_MODEL=your-model-name \
  -p 8321:8321 \
  llama-stack-rh
```

To run with all features, including the TrustyAI providers (requires Kubernetes/OpenShift):
```shell
# Using Docker
docker run -p 8321:8321 llama-stack-rh

# Using Podman
podman run -p 8321:8321 llama-stack-rh
```

- The generated Containerfile should not be modified manually, as it will be overwritten the next time you run the build script
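In a unified image, the `STANDALONE` variable typically just selects which run configuration the entrypoint loads at startup. A hypothetical sketch of that branching (the entrypoint logic and config file paths are assumptions, not the actual image contents):

```shell
#!/bin/sh
# Hypothetical unified entrypoint: pick a run config based on STANDALONE.
# The config file paths below are illustrative only.
if [ "${STANDALONE:-false}" = "true" ]; then
  config=/app/run-standalone.yaml
else
  config=/app/run-full.yaml
fi
echo "starting llama-stack with config: $config"
```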
To push the image to the registry:

```shell
podman push <build-ID> quay.io/opendatahub/llama-stack:rh-distribution
```