This guide covers testing the Streamlit UI for NeuralNav.

Prerequisites:
- Ollama running with llama3.1:8b model
- FastAPI backend running on http://localhost:8000
- Streamlit and dependencies installed in virtual environment
```bash
# Activate virtual environment
cd backend
source venv/bin/activate

# Install UI dependencies (if not already installed)
pip install streamlit requests
```

Either launch via the helper script:

```bash
# From project root
scripts/run_ui.sh
```

Or run Streamlit directly:

```bash
# From project root
cd backend
source venv/bin/activate
streamlit run ../ui/app.py
```

Note: On first run, Streamlit may prompt for an email address. Just press Enter to skip.
The UI will start on http://localhost:8501
- Streamlit app starts without errors
- Header displays "NeuralNav" with app icon
- Sidebar shows app title and navigation
- Two-column layout visible (Conversation | Recommendation)
- Chat input field is present
- No connection errors in UI
- Health check passes (check backend logs)
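The health check in the last item can also be scripted. Below is a minimal sketch against the `/health` endpoint shown in the troubleshooting section; the helper names (`is_healthy`, `check_backend`) are illustrative, not part of the codebase:

```python
import json
import urllib.request

BACKEND_URL = "http://localhost:8000"  # FastAPI backend from the prerequisites

def is_healthy(payload: dict) -> bool:
    """True if a /health response body reports a healthy service."""
    return payload.get("status") == "healthy"

def check_backend(url: str = BACKEND_URL) -> bool:
    """GET /health and parse the JSON body; any network failure counts as unhealthy."""
    try:
        with urllib.request.urlopen(f"{url}/health", timeout=5) as resp:
            return is_healthy(json.load(resp))
    except OSError:
        return False

# Example (requires the backend to be running):
# print(check_backend())
```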
Test each example prompt button in the sidebar:
- Example 1: "Customer service chatbot for 5000 users, low latency critical"
  - Should recommend: Mistral 7B on A100-80GB
  - Cost: ~$3,285/month
  - SLO Status: ✅ MEETS SLO
- Example 2: "Code generation assistant for 500 developers, quality over speed"
  - Should recommend: Mistral 7B on A10G
  - Cost: ~$730/month
  - SLO Status: ✅ MEETS SLO
- Example 3: "Document summarization pipeline, high throughput, cost efficient"
  - Should recommend: Granite 8B on L4
  - Cost: ~$365/month
  - SLO Status: ✅ MEETS SLO
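The same examples can also be exercised against the API directly. The request shape matches the `curl` example in the troubleshooting section, but the response field names used below (`model`, `gpu`, `monthly_cost`) are assumptions; adjust them to the actual response schema:

```python
import json
import urllib.request

RECOMMEND_URL = "http://localhost:8000/api/v1/recommend"

def recommend(message: str) -> dict:
    """POST a prompt to the recommend endpoint (same shape as the curl example)."""
    req = urllib.request.Request(
        RECOMMEND_URL,
        data=json.dumps({"message": message}).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=120) as resp:
        return json.load(resp)

def summarize(rec: dict) -> str:
    """One-line summary; 'model', 'gpu', and 'monthly_cost' are assumed field names."""
    return f"{rec.get('model', '?')} on {rec.get('gpu', '?')} (~${rec.get('monthly_cost', '?')}/month)"

# Example (requires the backend to be running):
# print(summarize(recommend("Customer service chatbot for 5000 users, low latency critical")))
```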
- User messages appear in chat
- Assistant responses appear with recommendation summary
- Spinner shows during API call
- Messages persist in chat history
Test all tabs when a recommendation is displayed:
- SLO status badge displays correctly (✅ MEETS SLO or ⚠️ DOES NOT MEET SLO)
- Model name and ID shown
- GPU configuration displayed (count, type, tensor parallel, replicas)
- Key metrics cards show: TTFT, TPOT, E2E, Throughput
- Reasoning section displays
- Use case and requirements shown
- Traffic profile values displayed
- SLO targets displayed
- "✏️ Enable Editing" button works
- Fields become editable when editing mode enabled
- "💾 Save Changes" and "❌ Cancel" buttons work in edit mode
- TTFT metrics shown (p50, p90, p99)
- TPOT metrics shown (p50, p90, p99)
- E2E latency metrics shown (p50, p90, p99)
- Throughput metrics shown (QPS, tokens/sec)
- Delta values vs targets displayed
- Hourly cost displayed
- Monthly cost displayed
- GPU configuration details shown
- Cost assumptions info box visible
- "Generate Deployment YAML" button shows (Sprint 4 placeholder)
- "Deploy to Kubernetes" button shows (Sprint 6 placeholder)
- "🔄 New Conversation" button clears state and reloads
- Switching between tabs works smoothly
- Custom CSS styling applied (professional theme, metric cards)
- Responsive layout works on different screen sizes
Test error scenarios:
- Backend not running → Shows connection error message
- Invalid prompt → Shows appropriate error
- API timeout → Shows timeout error
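With Python's stdlib HTTP client, the three failure modes above can be distinguished roughly as follows. This is a sketch of the categorization, not the UI's actual code, and the exact messages the UI displays may differ:

```python
import socket
import urllib.error

def classify_error(exc: Exception) -> str:
    """Map a transport-level failure to the error category the UI should surface."""
    if isinstance(exc, socket.timeout):          # API timeout
        return "timeout error"
    if isinstance(exc, urllib.error.HTTPError):  # backend rejected the request
        return f"API error (HTTP {exc.code})"
    if isinstance(exc, urllib.error.URLError):   # backend not running / unreachable
        return "connection error"
    return "unexpected error"
```

`HTTPError` is checked before `URLError` because it is a subclass of it.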
Try these additional prompts to test the system:
- High-volume scenario:
  - "I need to serve 10,000 concurrent users with a recommendation engine. Latency is very important - users expect results in under 300ms."
  - Should recommend higher-end GPU configuration
- Cost-sensitive scenario:
  - "Small team of 50 developers need a coding assistant. Budget is very limited, quality can be moderate."
  - Should recommend cost-effective configuration (L4 GPUs)
- Quality-focused scenario:
  - "Building a medical diagnosis assistant for 200 doctors. Accuracy is critical, budget is flexible, latency can be 1-2 seconds."
  - Should recommend larger model with better quality
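These prompts can also be run as a batch smoke test against `/api/v1/recommend` (request shape taken from the troubleshooting `curl` example). Since the response schema isn't specified here, the sketch just prints the raw JSON for manual inspection:

```python
import json
import urllib.request

RECOMMEND_URL = "http://localhost:8000/api/v1/recommend"

# The three additional scenarios, keyed by what the recommendation should favor.
SCENARIOS = {
    "high-volume": ("I need to serve 10,000 concurrent users with a recommendation "
                    "engine. Latency is very important - users expect results in "
                    "under 300ms."),
    "cost-sensitive": ("Small team of 50 developers need a coding assistant. Budget "
                       "is very limited, quality can be moderate."),
    "quality-focused": ("Building a medical diagnosis assistant for 200 doctors. "
                        "Accuracy is critical, budget is flexible, latency can be "
                        "1-2 seconds."),
}

def run_smoke_test() -> None:
    """POST each scenario and print the raw JSON response for manual inspection."""
    for name, prompt in SCENARIOS.items():
        req = urllib.request.Request(
            RECOMMEND_URL,
            data=json.dumps({"message": prompt}).encode(),
            headers={"Content-Type": "application/json"},
        )
        with urllib.request.urlopen(req, timeout=120) as resp:
            print(name, "->", json.dumps(json.load(resp), indent=2))

# run_smoke_test()  # requires the backend to be running
```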
Known limitations:
- Editing specifications doesn't trigger re-recommendation (Sprint 4)
- YAML generation not implemented (Sprint 4)
- Kubernetes deployment not implemented (Sprint 6)
- Conversation history not persisted across sessions
- No authentication/user management
Sprint 3 is successful if:
- ✅ UI loads without errors
- ✅ All 3 example scenarios work end-to-end
- ✅ Recommendation details display correctly in all tabs
- ✅ Edit mode toggles properly
- ✅ Chat interface is intuitive and functional
- ✅ No connection errors when backend is running
```bash
# Check virtual environment
which python  # Should point to backend/venv

# Reinstall streamlit
pip install --upgrade streamlit
```

```bash
# Check FastAPI is running
curl http://localhost:8000/health
# Should return: {"status":"healthy","service":"ai-pre-deployment-assistant"}
```

- Check backend logs for errors
- Verify Ollama is running:
  ```bash
  curl http://localhost:11434/api/tags
  ```
- Test backend directly:
  ```bash
  curl -X POST http://localhost:8000/api/v1/recommend \
    -H "Content-Type: application/json" \
    -d '{"message": "test chatbot for 100 users"}'
  ```

```bash
# Clear Streamlit cache
streamlit cache clear
```