Sprint 3 Testing Guide - Streamlit UI

This guide covers testing the Streamlit UI for NeuralNav.

Prerequisites

  1. Ollama running with the llama3.1:8b model
  2. FastAPI backend running on http://localhost:8000
  3. Streamlit and dependencies installed in virtual environment
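
A quick way to verify all three before starting (a minimal sketch; the health and tags endpoints are the same ones used in Troubleshooting below):

# 1. Ollama is up and the model is available
curl -s http://localhost:11434/api/tags | grep -o "llama3.1:8b" || echo "model missing"

# 2. FastAPI backend responds
curl -s http://localhost:8000/health

# 3. UI dependencies are importable in the active environment
python -c "import streamlit, requests; print('UI deps OK')"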

Setup

# Activate virtual environment
cd backend
source venv/bin/activate

# Install UI dependencies (if not already installed)
pip install streamlit requests
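
To confirm the packages are importable from this environment:

python -c "import streamlit, requests; print(streamlit.__version__)"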

Running the UI

Option 1: Using the startup script (Recommended)

# From project root
scripts/run_ui.sh

Option 2: Manual start

# From project root
cd backend
source venv/bin/activate
streamlit run ../ui/app.py

Note: On first run, Streamlit may prompt for an email address. Just press Enter to skip.
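
To skip that prompt entirely, Streamlit can be started in headless mode (a standard Streamlit flag):

streamlit run ../ui/app.py --server.headless true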

The UI will be available at http://localhost:8501.

Testing Checklist

1. UI Loads Successfully

  • Streamlit app starts without errors
  • Header displays "NeuralNav" with app icon
  • Sidebar shows app title and navigation
  • Two-column layout visible (Conversation | Recommendation)
  • Chat input field is present

2. Backend Connectivity

  • No connection errors in UI
  • Health check passes (check backend logs)
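
Before loading the UI, you can confirm the backend is reachable from the same machine:

curl -s http://localhost:8000/health
# Expected: {"status":"healthy","service":"ai-pre-deployment-assistant"}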

3. Example Prompts

Test each example prompt button in the sidebar:

  • Example 1: "Customer service chatbot for 5000 users, low latency critical"

    • Should recommend: Mistral 7B on A100-80GB
    • Cost: ~$3,285/month
    • SLO Status: ✅ MEETS SLO
  • Example 2: "Code generation assistant for 500 developers, quality over speed"

    • Should recommend: Mistral 7B on A10G
    • Cost: ~$730/month
    • SLO Status: ✅ MEETS SLO
  • Example 3: "Document summarization pipeline, high throughput, cost efficient"

    • Should recommend: Granite 8B on L4
    • Cost: ~$365/month
    • SLO Status: ✅ MEETS SLO
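
To sanity-check an example independently of the UI, send the same prompt straight to the backend (the /api/v1/recommend endpoint used in Troubleshooting); the model, cost, and SLO status in the JSON response should match what the UI renders:

curl -s -X POST http://localhost:8000/api/v1/recommend \
  -H "Content-Type: application/json" \
  -d '{"message": "Customer service chatbot for 5000 users, low latency critical"}'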

4. Chat Interface

  • User messages appear in chat
  • Assistant responses appear with recommendation summary
  • Spinner shows during API call
  • Messages persist in chat history

5. Recommendation Display

Test all tabs when a recommendation is displayed:

Overview Tab

  • SLO status badge displays correctly (✅ MEETS SLO or ⚠️ DOES NOT MEET SLO)
  • Model name and ID shown
  • GPU configuration displayed (count, type, tensor parallel, replicas)
  • Key metrics cards show: TTFT, TPOT, E2E, Throughput
  • Reasoning section displays

Specifications Tab

  • Use case and requirements shown
  • Traffic profile values displayed
  • SLO targets displayed
  • "✏️ Enable Editing" button works
  • Fields become editable when editing mode enabled
  • "💾 Save Changes" and "❌ Cancel" buttons work in edit mode

Performance Tab

  • TTFT metrics shown (p50, p90, p99)
  • TPOT metrics shown (p50, p90, p99)
  • E2E latency metrics shown (p50, p90, p99)
  • Throughput metrics shown (QPS, tokens/sec)
  • Delta values vs targets displayed
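
If a percentile looks off in the UI, compare it against the raw API response. A rough filter (the grep just narrows the pretty-printed JSON to latency-related keys; exact field names are whatever the backend returns):

curl -s -X POST http://localhost:8000/api/v1/recommend \
  -H "Content-Type: application/json" \
  -d '{"message": "test chatbot for 100 users"}' \
  | python3 -m json.tool | grep -iE "ttft|tpot|e2e|p50|p90|p99|throughput"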

Cost Tab

  • Hourly cost displayed
  • Monthly cost displayed
  • GPU configuration details shown
  • Cost assumptions info box visible
  • "Generate Deployment YAML" button shows (Sprint 4 placeholder)
  • "Deploy to Kubernetes" button shows (Sprint 6 placeholder)

6. Navigation & UX

  • "🔄 New Conversation" button clears state and reloads
  • Switching between tabs works smoothly
  • Custom CSS styling applied (professional theme, metric cards)
  • Responsive layout works on different screen sizes

7. Error Handling

Test error scenarios:

  • Backend not running → Shows connection error message
  • Invalid prompt → Shows appropriate error
  • API timeout → Shows timeout error
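
One way to exercise the first scenario (assumes the backend runs as a uvicorn process in another terminal; adjust the pattern to however you started it):

# Stop the backend, then send any chat message in the UI;
# the chat should show a connection error rather than crashing
pkill -f uvicorn
# Restart the backend afterwards to continue testing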

Custom Test Scenarios

Try these additional prompts to test the system:

  1. High-volume scenario:

    • "I need to serve 10,000 concurrent users with a recommendation engine. Latency is very important - users expect results in under 300ms."
    • Should recommend higher-end GPU configuration
  2. Cost-sensitive scenario:

    • "Small team of 50 developers need a coding assistant. Budget is very limited, quality can be moderate."
    • Should recommend cost-effective configuration (L4 GPUs)
  3. Quality-focused scenario:

    • "Building a medical diagnosis assistant for 200 doctors. Accuracy is critical, budget is flexible, latency can be 1-2 seconds."
    • Should recommend larger model with better quality
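
To run all three scenarios against the backend in one pass (a minimal bash loop over the same prompts):

for msg in \
  "I need to serve 10,000 concurrent users with a recommendation engine. Latency is very important - users expect results in under 300ms." \
  "Small team of 50 developers need a coding assistant. Budget is very limited, quality can be moderate." \
  "Building a medical diagnosis assistant for 200 doctors. Accuracy is critical, budget is flexible, latency can be 1-2 seconds."
do
  curl -s -X POST http://localhost:8000/api/v1/recommend \
    -H "Content-Type: application/json" \
    -d "{\"message\": \"$msg\"}"
  echo
done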

Known Limitations (Sprint 3)

  • Editing specifications doesn't trigger re-recommendation (Sprint 4)
  • YAML generation not implemented (Sprint 4)
  • Kubernetes deployment not implemented (Sprint 6)
  • Conversation history not persisted across sessions
  • No authentication/user management

Success Criteria

Sprint 3 is successful if:

  1. ✅ UI loads without errors
  2. ✅ All 3 example scenarios work end-to-end
  3. ✅ Recommendation details display correctly in all tabs
  4. ✅ Edit mode toggles properly
  5. ✅ Chat interface is intuitive and functional
  6. ✅ No connection errors when backend is running

Troubleshooting

Streamlit won't start

# Check virtual environment
which python  # Should point to backend/venv

# Reinstall streamlit
pip install --upgrade streamlit
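
If Streamlit starts but the page never loads, check whether something else is bound to the default port (assumes lsof is available):

lsof -i :8501

# Or start on a different port
streamlit run ../ui/app.py --server.port 8502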

Backend connection fails

# Check FastAPI is running
curl http://localhost:8000/health

# Should return: {"status":"healthy","service":"ai-pre-deployment-assistant"}
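
If the health check fails, verify that a process is actually listening on port 8000:

lsof -i :8000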

Recommendation request fails

  1. Check backend logs for errors
  2. Verify Ollama is running: curl http://localhost:11434/api/tags
  3. Test backend directly:
    curl -X POST http://localhost:8000/api/v1/recommend \
      -H "Content-Type: application/json" \
      -d '{"message": "test chatbot for 100 users"}'

UI shows old cached data

# Clear Streamlit cache
streamlit cache clear
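
If clearing the cache doesn't help, restart the Streamlit process and hard-refresh the browser:

# Stop the running server, then relaunch via the startup script
pkill -f streamlit
scripts/run_ui.sh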