
Commit d6497ec

Merge pull request #314 from EleanorWho/ehu/server-config
Add comprehensive Ollama setup and demo documentation to README files
2 parents 9c00a4c + 0fb0dc6 commit d6497ec

2 files changed: 73 additions & 6 deletions


demos/01_foundations/README.md

Lines changed: 16 additions & 0 deletions
@@ -1,5 +1,21 @@
 # Foundations
 
+## Prerequisites
+
+Before running these demos, ensure you have a Llama Stack server running:
+
+**Option 1: Local Server (Recommended for learning)**
+```bash
+# Follow the setup instructions in ../README.md to install Ollama and start the server
+llama stack run starter  # Runs on localhost:8321
+```
+
+**Option 2: Remote Server**
+If using a remote server (e.g., OpenShift AI), ensure you have:
+- Network access to the server
+- Authentication token configured in `.env` file (`LLAMA_STACK_CLIENT_API_KEY`)
+- Port forwarding set up if needed
+
 ## Overview
 This folder teaches the fundamental building blocks of Llama Stack, including client setup, chat completions, vector databases, and tool integration. These examples cover the core APIs and concepts needed to build AI applications.
 
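For the remote-server option added above, the two configuration pieces fit together roughly as in the following sketch. This is illustrative only: the `oc` (OpenShift CLI) invocation and the `llama-stack` service name are assumptions, not part of this commit; substitute the details of your own deployment.

```bash
# Store the authentication token where the demos expect it (.env in the repo root)
echo 'LLAMA_STACK_CLIENT_API_KEY=<your-token>' >> .env

# If the remote server is not directly reachable, forward local port 8321 to it.
# "service/llama-stack" is a placeholder -- use your cluster's actual service name.
oc port-forward service/llama-stack 8321:8321
```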
demos/README.md

Lines changed: 57 additions & 6 deletions
@@ -7,6 +7,21 @@ This directory contains demo examples for getting started with Llama Stack.
 First, install [`uv`](https://docs.astral.sh/uv/getting-started/installation/), a fast Python package manager.
 
 ```bash
+# 0️⃣ Install Ollama (if using local inference)
+# - Download and install from https://ollama.com/download
+# - Or use your package manager (recommended for security):
+#   - macOS: brew install ollama
+#   - Linux: follow the instructions at https://ollama.com/download/linux
+#   - Windows: download the installer from https://ollama.com/download/windows
+
+# - Pull a model (required for inference). Use smaller models for CPU-only systems:
+ollama pull llama3.2:1b  # 1B model - fast on CPU
+# OR
+ollama pull llama3.2:3b  # 3B model - default, slower on CPU
+
+# - Verify Ollama is running (should return JSON with a model list):
+curl http://localhost:11434/api/tags
+
 # 1️⃣ Create a virtual environment in the current directory (.venv)
 # - Use Python 3.12 explicitly
 # - --seed ensures pip and core packaging tools are installed in the venv
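The `curl` check added in step 0 returns a JSON document listing every locally available model. A quick way to confirm the pull succeeded is to filter that response down to model names; this sketch assumes `jq` is installed, which is not part of the setup above.

```bash
# Print just the model names Ollama is serving (jq is an assumed extra tool)
curl -s http://localhost:11434/api/tags | jq -r '.models[].name'
# The model you pulled, e.g. llama3.2:1b, should appear in the output
```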
@@ -22,22 +37,58 @@ source .venv/bin/activate
 # - This installs the CLI (`llama`) and required core dependencies
 uv pip install -U llama_stack
 
-# 3.5️⃣ Install or upgrade the llama-stack-client SDK
+# 4️⃣ Install or upgrade the llama-stack-client SDK
 # - This is the Python client library for interacting with a Llama Stack server
 # - Provides high-level APIs for inference, agents, safety, and more
 uv pip install -U llama-stack-client
 
-# 4️⃣ Install additional dependencies required by the "starter" demo profile
+# 5️⃣ Install additional dependencies required by the "starter" demo profile
 # - `llama stack list-deps starter` prints required packages (one per line)
 # - `xargs -L1 uv pip install` installs each dependency line-by-line
 # - Assumes the virtual environment is active
 llama stack list-deps starter | xargs -L1 uv pip install
 
-# 5️⃣ Run the "starter" demo using a local Ollama server
-# - OLLAMA_URL sets the endpoint for the Ollama model server
-# - This environment variable applies only to this command
-# - The starter demo connects to Ollama at localhost:11434
+# 6️⃣ Run the "starter" Llama Stack server
+# - This starts a LOCAL server on port 8321 (the default for the starter distribution)
+# - The server connects to Ollama at localhost:11434 for inference
+# - IMPORTANT: keep this terminal open - the server runs in the foreground
+# - The server must stay running for the demos to work
 OLLAMA_URL=http://localhost:11434/v1 uv run llama stack run starter
+
+# 7️⃣ Verify the server is running (in a NEW terminal - the server must keep running!)
+# - Open a second terminal window
+# - Navigate to the repository directory and activate the virtual environment
+cd <repo-root>  # Navigate to where you cloned the repo
+source .venv/bin/activate
+
+# 8️⃣ Test the connection
+# - Run the client setup demo to verify the server is running
+python -m demos.01_foundations.01_client_setup localhost 8321  # Note: port 8321 for the local starter server
+```
+
+### Troubleshooting
+
+**Port already in use (8321):**
+```bash
+# Find and kill the process using port 8321
+lsof -i :8321
+kill <PID>
+```
+
+**Server not starting:**
+```bash
+# Check if Ollama is running
+curl http://localhost:11434/api/tags
+
+# Check if a model is pulled
+ollama list
+```
+
+**Version compatibility errors:**
+```bash
+# Reinstall all packages with matching versions
+pip uninstall -y llama-stack llama-stack-api llama-stack-client
+uv pip install -U llama-stack llama-stack-client
 ```
 
 ## Available Demos
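As an alternative to the Python demo module in step 8, the SDK installed in step 4 also ships a `llama-stack-client` CLI that can confirm the server is reachable. A minimal sketch, assuming the local starter server on port 8321; the exact flags follow the upstream quickstart and may differ by version:

```bash
# Point the CLI at the local server (no API key needed for the local starter)
llama-stack-client configure --endpoint http://localhost:8321 --api-key none

# List the models the server exposes; the Ollama-backed models should appear
llama-stack-client models list
```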
