This guide is the complete end-to-end setup for a production-style deployment:
- Llama Stack server runs in-cluster as a `LlamaStackDistribution`
- Garak scans run in-cluster via KFP (DSP v2)
| Component | Runs On |
|---|---|
| Llama Stack server | OpenShift AI / ODH cluster |
| Garak scan execution | KFP (Data Science Pipelines) |
- OpenShift cluster available
- Red Hat OpenShift AI (RHOAI) installed
- Llama Stack Operator enabled (in the `DataScienceCluster`, `llamastackoperator` set to `Managed`)
- `oc` CLI access with permissions to create/update namespace resources
- model endpoint details ready (`VLLM_URL`, `INFERENCE_MODEL`, optional API token)
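A quick pre-flight sketch for the CLI prerequisite above (purely local; it only checks that the `oc` client is on your `PATH`):

```shell
# Pre-flight: confirm the oc CLI is available before continuing.
if command -v oc >/dev/null 2>&1; then
  echo "oc CLI found: $(oc version --client 2>/dev/null | head -n1)"
else
  echo "oc CLI missing - install the OpenShift client first"
fi
```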
All manifests used here come from `lsd_remote/`.
```shell
oc login --token=<token> --server=<server>
export NS=tai-garak-lls
oc create namespace "$NS"
oc project "$NS"
```

```shell
oc apply -f lsd_remote/kfp-setup/kfp.yaml
```

Wait until DSP pods are running, then capture the endpoint:

```shell
oc get pods | grep -E "dspa|ds-pipeline"
export KFP_ENDPOINT="https://$(oc get routes ds-pipeline-dspa -o jsonpath='{.spec.host}')"
echo "$KFP_ENDPOINT"
```

The Llama Stack operator creates a NetworkPolicy that restricts ingress to the Llama Stack pod. KFP pipeline pods are not in its allow-list by default, causing connection timeouts. Apply the provided NetworkPolicy to allow same-namespace pods to reach the Llama Stack service:
```shell
oc apply -f lsd_remote/kfp-setup/kfp-networkpolicy.yaml
```

If you skip this step, KFP pipeline pods might time out when trying to reach the Llama Stack service.
Update all hardcoded placeholders (especially the namespace `tai-garak-lls`) in:
- `lsd_remote/postgres-complete-deployment.yaml`
- `lsd_remote/llama_stack_distro-setup/lsd-config.yaml`
- `lsd_remote/llama_stack_distro-setup/lsd-secrets.yaml`
- `lsd_remote/llama_stack_distro-setup/lsd-garak.yaml`
- `lsd_remote/llama_stack_distro-setup/lsd-role.yaml`
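One way to do the bulk replacement, assuming the manifests ship with the namespace hardcoded as `tai-garak-lls` (only needed if you deploy to a different namespace; review the changes before applying):

```shell
# Bulk-replace the shipped namespace with your own across all manifests.
# Assumption: the placeholder string is "tai-garak-lls" as noted above.
NS="my-namespace"
grep -rl 'tai-garak-lls' lsd_remote/ 2>/dev/null | xargs -r sed -i "s/tai-garak-lls/${NS}/g"
```

This catches the namespace in metadata, service DNS names, and RoleBinding subjects in one pass; the other values below still need manual edits.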
Set these carefully:
- `POSTGRES_HOST` to `<postgres-service-name>.<namespace>.svc.cluster.local`
- `VLLM_URL`, `INFERENCE_MODEL`, `VLLM_TLS_VERIFY`
- `KUBEFLOW_PIPELINES_ENDPOINT` to `$KFP_ENDPOINT`
- `KUBEFLOW_NAMESPACE` to your namespace
- `KUBEFLOW_GARAK_BASE_IMAGE` is set to `quay.io/opendatahub/odh-trustyai-garak-lls-provider-dsp:dev` (you can also use `quay.io/rhoai/odh-trustyai-garak-lls-provider-dsp-rhel9:rhoai-3.4` if you have access)
- `KUBEFLOW_LLAMA_STACK_URL` to `http://<lsd-name>-service.<namespace>.svc.cluster.local:8321`
- optional: add `KUBEFLOW_EXPERIMENT_NAME` (for example `trustyai-garak-prod`) if you want runs grouped under a specific KFP experiment name; defaults to `trustyai-garak` if not provided
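To reduce typos, the derived values can be assembled in a shell first and then pasted into the manifests; a sketch using this repo's default resource names (adjust if yours differ):

```shell
# Derive the cluster-local endpoints from your namespace and resource names.
NS="tai-garak-lls"
POSTGRES_SVC="postgres"                       # service name from postgres-complete-deployment.yaml
LSD_NAME="llamastack-garak-distribution"      # default LlamaStackDistribution name in this repo

POSTGRES_HOST="${POSTGRES_SVC}.${NS}.svc.cluster.local"
KUBEFLOW_LLAMA_STACK_URL="http://${LSD_NAME}-service.${NS}.svc.cluster.local:8321"

echo "POSTGRES_HOST=${POSTGRES_HOST}"
echo "KUBEFLOW_LLAMA_STACK_URL=${KUBEFLOW_LLAMA_STACK_URL}"
```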
- set namespace
- set `VLLM_API_TOKEN` when required by your model endpoint
- set namespace
- confirm distribution image
- ensure config/secret refs match names from `lsd-config.yaml` and `lsd-secrets.yaml`
- set namespace in all three resources (Role, and both RoleBindings)
- verify the role name (`ds-pipeline-dspa`) matches your DSP install in the pipeline-management RoleBinding
- verify the service account name (`<lsd-name>-sa`; the default in this repo is `llamastack-garak-distribution-sa`)
- the `lsd-garak-dspa-api-access` Role grants the service account permission to access the DSPA API proxy (required for KFP client connectivity through the external route)
```shell
oc apply -f lsd_remote/postgres-complete-deployment.yaml
```

Verify:

```shell
oc get pods | grep postgres
oc get svc postgres
```

Production note: the provided manifest currently mounts an `emptyDir` volume for PostgreSQL data in the Deployment. Replace it with a PVC-backed mount if you need durable storage across pod restarts.
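For the production note above, a minimal sketch of a PVC you could swap in for the `emptyDir` volume; the claim name and size here are assumptions, not values from the repo manifest:

```shell
# Write a PVC manifest to back the PostgreSQL data directory.
# Assumption: claim name "postgres-data" and 5Gi are examples only.
cat > postgres-pvc.yaml <<'EOF'
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: postgres-data
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi
EOF
# then: oc apply -f postgres-pvc.yaml
```

In the Deployment's volume list, replace the `emptyDir: {}` entry with `persistentVolumeClaim: {claimName: postgres-data}` and re-apply.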
Apply in this order:
```shell
oc apply -f lsd_remote/llama_stack_distro-setup/lsd-config.yaml
oc apply -f lsd_remote/llama_stack_distro-setup/lsd-secrets.yaml
oc apply -f lsd_remote/llama_stack_distro-setup/lsd-garak.yaml
oc apply -f lsd_remote/llama_stack_distro-setup/lsd-role.yaml
```

Why this order:
- the config and secrets must exist before the distribution starts
- the role binding should be ready before pipeline operations
```shell
oc get pods
oc get llamastackdistribution
oc get svc | grep llamastack
oc get routes -A | grep -i ds-pipeline
```

Inspect logs if needed:

```shell
oc logs deploy/postgres
oc describe llamastackdistribution llamastack-garak-distribution
oc get pods | grep llamastack
```

Use one of:
- in-cluster service URL (from a Data Science Workbench in the same cluster): `http://<lsd-service>.<namespace>.svc.cluster.local:8321`
- local machine with port-forward:
  ```shell
  oc port-forward svc/llamastack-garak-distribution-service 8321:8321
  ```
- OpenShift route URL (external access without port-forward):
  ```shell
  # list existing routes
  oc get routes | grep -i llamastack
  # if needed, expose a route for the service
  oc expose svc/llamastack-garak-distribution-service --name=llamastack-garak-route
  ```

Then set:
- `BASE_URL="http://localhost:8321"` (if port-forwarding), or
- `BASE_URL="<in-cluster-service-url>"`, or
- `BASE_URL="https://<route>"` (use `http://` if your route is configured without TLS)
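The three options can be captured as shell variables; a sketch using this repo's default service and route names (the commented lines are the in-cluster and route variants):

```shell
# Pick the BASE_URL matching how you reach the Llama Stack server.
BASE_URL="http://localhost:8321"  # option 1: port-forward
# BASE_URL="http://llamastack-garak-distribution-service.<namespace>.svc.cluster.local:8321"  # option 2: in-cluster
# BASE_URL="https://$(oc get route llamastack-garak-route -o jsonpath='{.spec.host}')"        # option 3: route
echo "Using BASE_URL=${BASE_URL}"
```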
Open `demos/guide.ipynb` and run it end-to-end.
- verify `lsd-role.yaml` is applied and points to the correct service account
- if your environment needs token auth, uncomment and populate the optional Kubeflow token secret in `lsd-secrets.yaml` and the corresponding env var in `lsd-garak.yaml`
- verify the `postgres` service exists in the same namespace
- verify `POSTGRES_HOST` in `lsd-config.yaml`
- verify the postgres secret keys (`POSTGRES_DB`, `POSTGRES_USER`, `POSTGRES_PASSWORD`)
- check for NetworkPolicies blocking traffic: `oc get networkpolicy`
- if pipeline pods time out reaching Llama Stack but port-forward works, apply the NetworkPolicy: `oc apply -f lsd_remote/kfp-setup/kfp-networkpolicy.yaml` (see step 2 above)
- verify the podSelector label in `kfp-networkpolicy.yaml` matches the Llama Stack pod: `oc get pods --show-labels | grep llamastack`
- verify `KUBEFLOW_LLAMA_STACK_URL` resolves from inside the cluster
- verify the service name/port in `lsd-garak.yaml` matches the URL configured in `lsd-config.yaml`