Commit bded357

mpk-droid and claude committed
Add GPU support, hosted model flow, MLflow cluster injection, and cleanup scripts
- Ollama deployment now requests nvidia.com/gpu for cluster inference
- Route timeout set to 300s for multi-tool agent queries
- deploy-local.sh and deploy-cluster.sh auto-detect Ollama vs hosted model from BASE_URL
- setup-cluster.sh prompts for local Ollama or hosted model (OpenAI, etc.)
- deploy-cluster.sh injects MLflow env vars into agent pod via deployment.yaml
- deploy-cluster.sh refreshes MLflow token from oc, checks prereqs (docker, envsubst)
- deploy-cluster.sh and setup-cluster.sh prompt for project with option to create new
- Add cleanup-local.sh for stopping LlamaStack and cleaning up
- agent.py: MLflow setup wrapped in try/except for graceful failure
- agent.py: load_dotenv() at module level so MLflow reads MLFLOW_TRACKING_TOKEN
- README rewritten to show .env as single config driving all scripts

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
1 parent 814f2fb commit bded357

9 files changed

Lines changed: 347 additions & 146 deletions

agents/demo/README.md

Lines changed: 89 additions & 59 deletions
@@ -18,9 +18,41 @@ The resulting agent recommends the best day and time window for outdoor activiti
1818

1919
## Prerequisites
2020

21-
- **Ollama** installed on your machine ([ollama.com](https://ollama.com/) or `brew install ollama`)
22-
- **NPS API Key** — free from [developer.nps.gov](https://developer.nps.gov) (sign up to get a 40-character key)
2321
- **uv** package manager ([docs.astral.sh/uv](https://docs.astral.sh/uv/))
22+
- **NPS API Key** — free from [developer.nps.gov](https://developer.nps.gov) (sign up to get a 40-character key)
23+
- **Ollama** — only if using a local model ([ollama.com](https://ollama.com/) or `brew install ollama`)
24+
25+
---
26+
27+
## How `.env` drives everything
28+
29+
The `.env` file is the single source of configuration. All scripts read from it to decide what to do:
30+
31+
```
32+
# LLM Configuration
33+
API_KEY=not-needed
34+
BASE_URL=http://localhost:8321
35+
MODEL_ID=ollama/qwen2.5:7b
36+
CONTAINER_IMAGE=not-needed
37+
38+
# National Park Service API Key
39+
NPS_API_KEY=your-nps-api-key
40+
41+
# MLflow Tracing (optional)
42+
# MLFLOW_TRACKING_URI=https://your-mlflow-gateway/mlflow
43+
# MLFLOW_TRACKING_TOKEN=your-token
44+
# MLFLOW_WORKSPACE=your-workspace
45+
# MLFLOW_ENABLE_WORKSPACES=true
46+
```
47+
48+
**The key field is `BASE_URL`:**
49+
50+
| `BASE_URL` value | What happens |
51+
|---|---|
52+
| `http://localhost:8321` | Scripts deploy Ollama + LlamaStack locally, or Ollama on the cluster |
53+
| `https://api.openai.com/v1` (or any remote URL) | Scripts skip Ollama/LlamaStack entirely — agent connects directly to the hosted model |
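
The table's decision can be sketched in a few lines of shell. This is a minimal illustration, assuming the scripts use a substring match on `localhost` (which is what `deploy-cluster.sh` does with `grep -q "localhost"`). The function name `classify_base_url` is hypothetical and does not appear in the scripts.

```bash
#!/bin/bash
# Hypothetical helper: classify BASE_URL as "local" or "hosted".
# Mirrors a substring match on "localhost"; not the scripts' actual code.
classify_base_url() {
    case "$1" in
        *localhost*) echo "local" ;;
        *)           echo "hosted" ;;
    esac
}

# Fall back to the default from the sample .env if BASE_URL is unset.
BASE_URL="${BASE_URL:-http://localhost:8321}"

if [ "$(classify_base_url "$BASE_URL")" = "local" ]; then
    echo "Would start Ollama + LlamaStack for ${BASE_URL}"
else
    echo "Would connect directly to the hosted model at ${BASE_URL}"
fi
```

A substring check like this treats `http://127.0.0.1:8321` as hosted, so a local setup would need to keep the literal `localhost` in `BASE_URL`.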
54+
55+
> **Model note:** `qwen2.5:7b` is recommended for reliable function calling with Ollama. Smaller models like `llama3.2:3b` struggle with multi-tool orchestration.
2456
2557
---
2658

@@ -35,8 +67,9 @@ From the demo directory, copy these files into `agents/base/langgraph_react_agen
3567
| `.env` | `.env` | All secrets and config — share securely with your team |
3668
| `deploy-local.sh` | `deploy-local.sh` | One-command local setup and run |
3769
| `deploy-cluster.sh` | `deploy-cluster.sh` | One-command cluster deployment |
38-
| `setup-cluster.sh` | `setup-cluster.sh` | Deploys Ollama on cluster |
70+
| `setup-cluster.sh` | `setup-cluster.sh` | Deploys Ollama on cluster (or configures hosted model) |
3971
| `cleanup-cluster.sh` | `cleanup-cluster.sh` | Removes all cluster resources |
72+
| `cleanup-local.sh` | `cleanup-local.sh` | Stops LlamaStack, cleans up local |
4073
| `k8s/ollama-deployment.yaml` | `k8s/ollama-deployment.yaml` | Ollama pod for the cluster |
4174
| `k8s/ollama-service.yaml` | `k8s/ollama-service.yaml` | Ollama service |
4275

@@ -48,49 +81,53 @@ mlflow>=2.19.0
4881

4982
And in `main.py`, change `recursion_limit` from `10` to `25`.
5083

51-
> **Model note:** `qwen2.5:7b` is recommended for reliable function calling. Smaller models like `llama3.2:3b` struggle with multi-tool orchestration, and `llama3.1:8b` does not produce structured tool calls through LlamaStack.
52-
5384
> **Import fix:** After copying `tools.py` and `agent.py`, replace all occurrences of `langgraph_outdoor_activity_agent` with `langgraph_react_agent_base` in the import lines.
5485
5586
---
5687

5788
## Run locally
5889

59-
### 1. Start Ollama
60-
61-
Ollama is a system-level application (not a Python package). It must be installed separately and runs outside the virtual environment.
90+
### With local Ollama (default `.env`)
6291

92+
Start Ollama in one terminal:
6393
```bash
6494
ollama serve
6595
```
6696

67-
Keep this running in its own terminal. Ollama needs to be running before the deploy script can pull models and start LlamaStack.
68-
69-
### 2. Run the deploy script
70-
71-
In a new terminal:
72-
97+
Run the agent in another terminal:
7398
```bash
7499
cd agents/base/langgraph_react_agent
75100
chmod +x deploy-local.sh
76101
./deploy-local.sh
77102
```
78103

79-
This script will:
80-
- Create a Python virtual environment and install dependencies
81-
- Pull Ollama model (`qwen2.5:7b`)
82-
- Start LlamaStack in the background
83-
- Launch the interactive agent
104+
The script detects `localhost` in `BASE_URL` and automatically:
105+
- Pulls the model from `MODEL_ID`
106+
- Starts LlamaStack
107+
- Launches the interactive agent
84108

85-
Make sure `qwen2.5:7b` is registered in `run_llama_server.yaml` under `registered_resources.models`:
109+
### With a hosted model (e.g. OpenAI)
86110

87-
```yaml
88-
- model_id: qwen2.5:7b
89-
provider_id: ollama
90-
model_type: llm
91-
metadata: { }
111+
Update `.env`:
112+
```
113+
API_KEY=sk-your-openai-key
114+
BASE_URL=https://api.openai.com/v1
115+
MODEL_ID=gpt-4o-mini
116+
```
117+
118+
Then run:
119+
```bash
120+
./deploy-local.sh
92121
```
93122

123+
The script detects a remote `BASE_URL` and skips Ollama and LlamaStack — it just installs dependencies and runs the agent directly.
124+
125+
### To change the Ollama model
126+
127+
Update both of these places:
128+
- `MODEL_ID` in `.env` (e.g. `ollama/qwen2.5:7b`)
129+
- The model entry in `run_llama_server.yaml` under `registered_resources.models`
130+
94131
### Try it out
95132

96133
```
@@ -99,13 +136,20 @@ Is it safe to go running outdoors in San Francisco tomorrow morning?
99136
I want to go biking in Yosemite next weekend, any recommendations?
100137
```
101138

139+
### Clean up
140+
141+
```bash
142+
chmod +x cleanup-local.sh
143+
./cleanup-local.sh
144+
```
145+
102146
---
103147

104148
## Deploy to OpenShift cluster
105149

106150
### 1. Update `.env` for cluster
107151

108-
Set the `CONTAINER_IMAGE` to your registry. The `BASE_URL` and `MODEL_ID` will be auto-detected by the deploy script once Ollama is running on the cluster.
152+
Set `CONTAINER_IMAGE` to the registry path where the deploy script will build and push the agent image:
109153

110154
```
111155
CONTAINER_IMAGE=quay.io/your-username/langgraph-outdoor-activity-agent:latest
@@ -125,9 +169,9 @@ chmod +x setup-cluster.sh
125169
./setup-cluster.sh
126170
```
127171

128-
This will:
129-
- Deploy Ollama on the cluster and pull the `qwen2.5:7b` model
130-
- Verify NPS API key and MLflow connectivity
172+
The script prompts you to choose:
173+
- **Local Ollama** — deploys Ollama on the cluster, pulls the model, verifies NPS/MLflow
174+
- **Hosted model** — asks for `BASE_URL`, `MODEL_ID`, `API_KEY`, saves them to `.env`, skips Ollama
131175

132176
### 4. Deploy the agent
133177

@@ -136,11 +180,12 @@ chmod +x deploy-cluster.sh
136180
./deploy-cluster.sh
137181
```
138182

139-
This will:
140-
- Auto-detect the in-cluster Ollama URL (`http://ollama.<namespace>.svc.cluster.local:11434/v1`)
141-
- Build and push the Docker image
142-
- Create K8s secrets
143-
- Deploy the agent and print the route URL
183+
The script reads `.env` and:
184+
- If `BASE_URL` is `localhost` → replaces it with the in-cluster Ollama URL, checks Ollama is running
185+
- If `BASE_URL` is a remote URL → uses it directly, skips Ollama check
186+
- Builds and pushes the Docker image
187+
- Creates K8s secrets
188+
- Deploys the agent and prints the route URL
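
The `localhost` branch above amounts to two string rewrites, sketched here in isolation. The URL pattern and the prefix stripping follow the `deploy-cluster.sh` diff. The function names and the `demo-project` namespace are placeholders for illustration only, and the real script takes the namespace from `oc project -q`.

```bash
#!/bin/bash
# Illustrative helpers; these names are not part of deploy-cluster.sh.

# Build the in-cluster Ollama service URL for a given namespace.
in_cluster_ollama_url() {
    echo "http://ollama.$1.svc.cluster.local:11434/v1"
}

# Drop a leading "ollama/" from a model ID; other IDs pass through unchanged.
strip_ollama_prefix() {
    echo "${1#ollama/}"
}

NAMESPACE="demo-project"            # placeholder namespace
BASE_URL="http://localhost:8321"    # default from the sample .env
MODEL_ID="ollama/qwen2.5:7b"

case "$BASE_URL" in
    *localhost*)
        BASE_URL="$(in_cluster_ollama_url "$NAMESPACE")"
        MODEL_ID="$(strip_ollama_prefix "$MODEL_ID")"
        ;;
esac

echo "BASE_URL=${BASE_URL}"
echo "MODEL_ID=${MODEL_ID}"
```

With a remote `BASE_URL` the `case` branch does not match, so both values are left exactly as written in `.env`.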
144189

145190
### 5. Test
146191

@@ -157,15 +202,11 @@ chmod +x cleanup-cluster.sh
157202
./cleanup-cluster.sh
158203
```
159204

160-
This removes the agent, Ollama, and all associated secrets from the cluster.
161-
162205
---
163206

164207
## MLflow Tracing (Optional)
165208

166-
MLflow tracing is already wired into `agent.py` — it activates automatically when `MLFLOW_TRACKING_URI` is set. No code changes needed.
167-
168-
### Enable tracing
209+
MLflow tracing is already wired into `agent.py`. It activates when `MLFLOW_TRACKING_URI` is set in `.env`. No code changes needed.
169210

170211
Uncomment the MLflow lines in `.env`:
171212

@@ -176,22 +217,9 @@ MLFLOW_WORKSPACE=your-workspace-name
176217
MLFLOW_ENABLE_WORKSPACES=true
177218
```
178219

179-
For RHOAI/OpenShift AI deployments, the tracking token is your OpenShift token (`oc whoami -t`) and the workspace matches your MLflow workspace name.
180-
181-
For cluster deployment, add to `k8s/deployment.yaml`:
182-
183-
```yaml
184-
- name: MLFLOW_TRACKING_URI
185-
value: "https://your-mlflow-gateway-url/mlflow"
186-
- name: MLFLOW_WORKSPACE
187-
value: "your-workspace-name"
188-
- name: MLFLOW_ENABLE_WORKSPACES
189-
value: "true"
190-
```
191-
192-
When set, every agent query automatically traces all tool calls, LLM requests, and responses to your MLflow instance.
193-
194-
When not set, tracing is disabled and the agent runs normally with no overhead.
220+
- The tracking token is auto-refreshed from `oc whoami -t` by the deploy scripts
221+
- On the cluster, MLflow env vars are injected into the agent pod automatically by `deploy-cluster.sh`
222+
- When not set, tracing is disabled with no overhead
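
As a rough sketch of the "set or not set" behavior, a script could list only the MLflow variables that actually have values, and nothing at all when `MLFLOW_TRACKING_URI` is empty. This is an assumption-laden illustration, not the logic in `deploy-cluster.sh` (which exports all four names) or in `agent.py`. The function name `mlflow_vars_to_inject` is hypothetical.

```bash
#!/bin/bash
# Hypothetical helper: print the names of MLflow variables worth passing on.
mlflow_vars_to_inject() {
    local name value
    # Tracing is off unless MLFLOW_TRACKING_URI is set, so emit nothing.
    if [ -z "${MLFLOW_TRACKING_URI:-}" ]; then
        return 0
    fi
    for name in MLFLOW_TRACKING_URI MLFLOW_TRACKING_TOKEN \
                MLFLOW_WORKSPACE MLFLOW_ENABLE_WORKSPACES; do
        # Look up the variable named by $name (names come from the fixed list above).
        eval "value=\${$name:-}"
        if [ -n "$value" ]; then
            echo "$name"
        fi
    done
}

# Placeholder values copied from the sample .env comments.
MLFLOW_TRACKING_URI="https://your-mlflow-gateway/mlflow"
MLFLOW_WORKSPACE="your-workspace"
mlflow_vars_to_inject
```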
195223

196224
See [MLflow LangGraph Tracing docs](https://mlflow.org/docs/latest/genai/tracing/integrations/listing/langgraph/) for details.
197225

@@ -204,12 +232,14 @@ See [MLflow LangGraph Tracing docs](https://mlflow.org/docs/latest/genai/tracing
204232
| `src/.../tools.py` | Replaced 2 dummy tools with 6 real API tools (geocoding, weather, air quality, sunrise/sunset, NPS parks, NPS alerts) |
205233
| `src/.../agent.py` | Updated tool imports, domain-specific system prompt, and MLflow tracing |
206234
| `requirements.txt` | Added `httpx>=0.27.0` and `mlflow>=2.19.0` |
207-
| `.env` | All secrets and config (LLM, NPS, MLflow) in one file |
235+
| `.env` | Single config file that drives all scripts (LLM, NPS, MLflow) |
208236
| `main.py` | Increased recursion limit from 10 to 25 |
209237
| `run_llama_server.yaml` | Added `qwen2.5:7b` to registered models |
210-
| `deploy-local.sh` | One-command local setup and run |
211-
| `deploy-cluster.sh` | One-command cluster deployment |
212-
| `setup-cluster.sh` | Pre-flight check for cluster dependencies |
238+
| `deploy-local.sh` | One-command local run (auto-detects Ollama vs hosted) |
239+
| `deploy-cluster.sh` | One-command cluster deploy (auto-detects Ollama vs hosted) |
240+
| `setup-cluster.sh` | Cluster setup (prompts for Ollama or hosted model) |
241+
| `cleanup-cluster.sh` | Removes all cluster resources |
242+
| `cleanup-local.sh` | Stops LlamaStack, cleans up local |
213243

214244
Everything else — `Dockerfile`, `k8s/`, `examples/` — stays the same.
215245

Lines changed: 36 additions & 0 deletions
@@ -0,0 +1,36 @@
1+
#!/bin/bash
2+
#
3+
# Clean up local deployment of the Outdoor Activity Agent
4+
#
5+
# Stops LlamaStack, removes milvus_data/, and optionally removes the virtual environment.
6+
#
7+
# Usage:
8+
# ./cleanup-local.sh
9+
#
10+
11+
echo "=== Outdoor Activity Agent - Local Cleanup ==="
12+
echo ""
13+
14+
# Stop LlamaStack
15+
echo "--- Stopping LlamaStack ---"
16+
pkill -f "llama stack run" 2>/dev/null && echo "LlamaStack stopped" || echo "LlamaStack was not running"
17+
18+
# Remove milvus data
19+
if [ -d milvus_data ]; then
20+
rm -rf milvus_data && echo "Removed milvus_data/"
21+
fi
22+
23+
# Remove virtual environment
24+
if [ -d .venv ]; then
25+
read -p "Remove .venv? (y/N) " answer
26+
if [ "$answer" = "y" ] || [ "$answer" = "Y" ]; then
27+
rm -rf .venv && echo "Removed .venv/"
28+
else
29+
echo "Kept .venv/"
30+
fi
31+
fi
32+
33+
echo ""
34+
echo "=== Local Cleanup Complete ==="
35+
echo ""
36+
echo "Note: this script does not stop Ollama. If it is running, stop it with: pkill ollama"

agents/demo/langgraph_outdoor_activity_agent/deploy-cluster.sh

Lines changed: 59 additions & 21 deletions
@@ -23,6 +23,21 @@ if ! command -v oc &> /dev/null; then
2323
exit 1
2424
fi
2525

26+
if ! command -v docker &> /dev/null; then
27+
echo "ERROR: docker not installed"
28+
exit 1
29+
fi
30+
31+
if ! docker info > /dev/null 2>&1; then
32+
echo "ERROR: Docker is not running. Start Docker Desktop first: open -a Docker"
33+
exit 1
34+
fi
35+
36+
if ! command -v envsubst &> /dev/null; then
37+
echo "ERROR: envsubst not found. Install with: brew install gettext"
38+
exit 1
39+
fi
40+
2641
if ! oc whoami > /dev/null 2>&1; then
2742
echo "ERROR: Not logged into OpenShift. Run: oc login ..."
2843
exit 1
@@ -33,27 +48,59 @@ if [ ! -f .env ]; then
3348
exit 1
3449
fi
3550

51+
NAMESPACE=$(oc project -q)
52+
echo ""
53+
echo "Current project: ${NAMESPACE}"
54+
echo " [y] Deploy to this project"
55+
echo " [n] Create a new project"
56+
echo " [q] Quit"
57+
read -p "Choice: " answer
58+
if [ "$answer" = "q" ] || [ "$answer" = "Q" ]; then
59+
echo "Aborted."
60+
exit 0
61+
elif [ "$answer" = "n" ] || [ "$answer" = "N" ]; then
62+
read -p "New project name: " new_project
63+
oc new-project "$new_project" 2>/dev/null || oc project "$new_project"
64+
NAMESPACE="$new_project"
65+
elif [ "$answer" != "y" ] && [ "$answer" != "Y" ]; then
66+
echo "Aborted."
67+
exit 0
68+
fi
69+
3670
source .env
3771

38-
# Auto-detect Ollama in-cluster URL if BASE_URL is still localhost
39-
NAMESPACE=$(oc project -q)
72+
# Detect if using local Ollama or hosted model
4073
if echo "$BASE_URL" | grep -q "localhost"; then
74+
echo "Detected local Ollama configuration. Checking cluster..."
75+
76+
OLLAMA_POD=$(oc get pods -l app=ollama -o jsonpath='{.items[0].metadata.name}' 2>/dev/null)
77+
if [ -z "$OLLAMA_POD" ]; then
78+
echo "ERROR: Ollama is not running in project '${NAMESPACE}'."
79+
echo " Run ./setup-cluster.sh first to deploy Ollama and pull the model."
80+
exit 1
81+
fi
82+
echo "OK: Ollama running (pod: ${OLLAMA_POD})"
83+
84+
# Replace localhost with in-cluster Ollama URL
4185
BASE_URL="http://ollama.${NAMESPACE}.svc.cluster.local:11434/v1"
42-
echo "Auto-detected in-cluster Ollama URL: ${BASE_URL}"
43-
fi
4486

45-
# Strip ollama/ prefix from MODEL_ID (not needed when connecting directly)
46-
if echo "$MODEL_ID" | grep -q "^ollama/"; then
47-
MODEL_ID="${MODEL_ID#ollama/}"
48-
echo "Stripped ollama/ prefix from MODEL_ID: ${MODEL_ID}"
49-
fi
87+
# Strip ollama/ prefix from MODEL_ID (not needed when connecting directly)
88+
if echo "$MODEL_ID" | grep -q "^ollama/"; then
89+
MODEL_ID="${MODEL_ID#ollama/}"
90+
fi
5091

51-
# API_KEY not needed for in-cluster Ollama
52-
if [ -z "$API_KEY" ] || [ "$API_KEY" = "not-needed" ]; then
92+
# API_KEY not needed for in-cluster Ollama
5393
API_KEY="not-needed"
94+
else
95+
echo "Using hosted model at: ${BASE_URL}"
96+
fi
97+
98+
# Refresh MLflow token if logged into oc
99+
if [ -n "$MLFLOW_TRACKING_URI" ] && oc whoami > /dev/null 2>&1; then
100+
MLFLOW_TRACKING_TOKEN=$(oc whoami -t)
54101
fi
55102

56-
export CONTAINER_IMAGE BASE_URL MODEL_ID
103+
export CONTAINER_IMAGE BASE_URL MODEL_ID MLFLOW_TRACKING_URI MLFLOW_TRACKING_TOKEN MLFLOW_WORKSPACE MLFLOW_ENABLE_WORKSPACES
57104

58105
# Validate required env vars
59106
for var in BASE_URL MODEL_ID CONTAINER_IMAGE NPS_API_KEY; do
@@ -63,15 +110,6 @@ for var in BASE_URL MODEL_ID CONTAINER_IMAGE NPS_API_KEY; do
63110
fi
64111
done
65112

66-
# Check Ollama is running on cluster
67-
echo "Checking Ollama on cluster..."
68-
OLLAMA_POD=$(oc get pods -l app=ollama -o jsonpath='{.items[0].metadata.name}' 2>/dev/null)
69-
if [ -z "$OLLAMA_POD" ]; then
70-
echo "ERROR: Ollama is not deployed. Run ./setup-cluster.sh first."
71-
exit 1
72-
fi
73-
echo "OK: Ollama running (pod: ${OLLAMA_POD})"
74-
75113
echo ""
76114
echo "Cluster: $(oc whoami --show-server)"
77115
echo "User: $(oc whoami)"
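
The prerequisite checks added near the top of `deploy-cluster.sh` repeat the same `command -v` pattern for each tool. One possible consolidation, shown here only as a sketch, collects every missing command in a single pass so the user sees all problems at once. The helper name `missing_commands` is hypothetical and is not part of this commit.

```bash
#!/bin/bash
# Illustrative helper: print each command from the argument list
# that is not found on PATH, one per line.
missing_commands() {
    local cmd
    for cmd in "$@"; do
        if ! command -v "$cmd" > /dev/null 2>&1; then
            echo "$cmd"
        fi
    done
}

missing="$(missing_commands oc docker envsubst)"
if [ -n "$missing" ]; then
    echo "$missing" | while read -r cmd; do
        echo "ERROR: ${cmd} not installed"
    done
    # A real deploy script would exit 1 here; the sketch only reports.
fi
```

This covers only the "is it installed" checks. The `docker info` and `oc whoami` checks test whether a service is reachable, so they would still need their own steps.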
