
Commit 6ab367e

anfredette and claude committed
Restructure architecture from 10 to 8 components
Merge Simulation & Exploration Layer into Conversational Interface Layer since specification editing and what-if analysis are UI-driven features. Reposition vLLM Simulator as a development tool rather than a core architecture component.

Changes:
- Update component count from 10 to 8 across all docs
- Remove component numbering; reference by name for flexibility
- Add Phase 1 note to DeploymentIntent schema clarification
- Enhance Future Enhancements with comprehensive simulation features
- Update architecture diagrams to reflect UI-driven exploration

Files updated:
- docs/ARCHITECTURE.md
- docs/architecture-diagram.md
- CLAUDE.md
- README.md

Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: Andre Fredette <afredette@redhat.com>
1 parent 130547c commit 6ab367e
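The "reference by name" change in this commit can be illustrated with a name-keyed registry. This is a hypothetical sketch (the `CORE_COMPONENTS` and `DEV_TOOLS` names are illustrative, not from the repo): with names as keys, adding or removing a component no longer forces renumbering across docs or code.

```python
# Hypothetical sketch: components keyed by name instead of number,
# so inserting or removing one does not renumber the rest.
CORE_COMPONENTS = {
    "Conversational Interface Layer": "Streamlit UI with interactive exploration",
    "Context & Intent Engine": "Extract structured specs from conversation",
    "Recommendation Engine": "Traffic profiling, model scoring, capacity planning",
    "Deployment Automation Engine": "Generate YAML, deploy to K8s",
    "Knowledge Base": "Benchmarks, SLO templates, model catalog, outcomes",
    "LLM Backend": "Ollama (llama3.1:8b) for conversational AI",
    "Orchestration & Workflow Engine": "Coordinate multi-step flows",
    "Inference Observability": "Monitor TTFT, TPOT, GPU utilization",
}

# Development tools sit outside the core architecture.
DEV_TOOLS = {
    "vLLM Simulator": "GPU-free development and testing",
}

assert len(CORE_COMPONENTS) == 8  # matches the new component count
```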

File tree

4 files changed: +211 -200 lines changed


CLAUDE.md

Lines changed: 13 additions & 10 deletions
@@ -65,23 +65,26 @@ The system translates high-level user intent into technical specifications:
 - GPU capacity plan (e.g., "2x NVIDIA L4 GPUs, independent replicas")
 - Cost estimate ($800/month)
 
-### The 10 Core Components
+### The 8 Core Components
 
-1. **Conversational Interface Layer** - Streamlit UI
+1. **Conversational Interface Layer** - Streamlit UI with interactive exploration features
+   - Specification review and editing
+   - What-if analysis (Phase 2: advanced simulation)
 2. **Context & Intent Engine** - Extract structured specs from conversation
    - Use case → SLO template mapping
    - Auto-generate traffic profiles
 3. **Recommendation Engine** (3 sub-components):
    - Traffic Profile Generator
    - Model Recommendation Engine
    - Capacity Planning Engine
-4. **Simulation & Exploration Layer** - What-if analysis, spec editing
-5. **Deployment Automation Engine** - Generate YAML, deploy to K8s
-6. **Knowledge Base** - Benchmarks, SLO templates, model catalog, outcomes
-7. **LLM Backend** - Powers conversational AI (Ollama with llama3.1:8b)
-8. **Orchestration & Workflow Engine** - Coordinate multi-step flows
-9. **Inference Observability** - Monitor deployed models (TTFT, TPOT, GPU utilization)
-10. **vLLM Simulator** - GPU-free development and testing
+4. **Deployment Automation Engine** - Generate YAML, deploy to K8s
+5. **Knowledge Base** - Benchmarks, SLO templates, model catalog, outcomes
+6. **LLM Backend** - Powers conversational AI (Ollama with llama3.1:8b)
+7. **Orchestration & Workflow Engine** - Coordinate multi-step flows
+8. **Inference Observability** - Monitor deployed models (TTFT, TPOT, GPU utilization)
+
+**Development Tools:**
+- **vLLM Simulator** - GPU-free development and testing (not part of core architecture)
 
 ### Critical Data Collections (Knowledge Base)
 - **Model Benchmarks**: TTFT/TPOT/throughput for (model, GPU, tensor_parallel) tuples
@@ -96,7 +99,7 @@ The system translates high-level user intent into technical specifications:
 
 **docs/ARCHITECTURE.md and docs/architecture-diagram.md must stay synchronized**:
 - If you change component descriptions in ARCHITECTURE.md, update architecture-diagram.md diagrams
 - If you add/remove components, update both files
-- Component numbering must match (e.g., "Component 3" in both docs)
+- Components are referenced by name (not numbered) for clarity and flexibility
 
 ### Key Architectural Decisions to Preserve

README.md

Lines changed: 5 additions & 4 deletions
@@ -88,18 +88,19 @@ The POC includes 3 pre-configured scenarios (see [data/demo_scenarios.json](data
 
 ## Architecture Highlights
 
-Compass implements a **10-component architecture** with:
+Compass implements an **8-component architecture** with:
 
-- **Conversational Interface** (Streamlit) - Chat-based requirement gathering
+- **Conversational Interface** (Streamlit) - Chat-based requirement gathering with interactive exploration
 - **Context & Intent Engine** - LLM-powered extraction of deployment specs
 - **Recommendation Engine** - Traffic profiling, model scoring, capacity planning
-- **Simulation & Exploration** - What-if analysis and spec editing
 - **Deployment Automation** - YAML generation and Kubernetes deployment
 - **Knowledge Base** - Benchmarks, SLO templates, model catalog
 - **LLM Backend** - Ollama (llama3.1:8b) for conversational AI
 - **Orchestration** - Multi-step workflow coordination
 - **Inference Observability** - Real-time deployment monitoring
-- **vLLM Simulator** - GPU-free local development
+
+**Development Tools:**
+- **vLLM Simulator** - GPU-free local development and testing
 
 See [docs/ARCHITECTURE.md](docs/ARCHITECTURE.md) for detailed system design.
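The Capacity Planning Engine's job — turning a traffic profile into a GPU count and cost estimate like the "2x NVIDIA L4 GPUs, $800/month" example above — reduces to throughput arithmetic. A back-of-envelope sketch; `plan_capacity` and every number here are illustrative assumptions, not the repo's implementation:

```python
import math

def plan_capacity(requests_per_sec: float, tokens_per_request: float,
                  gpu_throughput_tps: float, gpu_cost_per_month: float) -> tuple[int, float]:
    """Estimate independent GPU replicas needed and resulting monthly cost.

    All inputs are illustrative: real planning would also check TTFT/TPOT
    SLOs and headroom, not just raw token throughput.
    """
    required_tps = requests_per_sec * tokens_per_request
    replicas = max(1, math.ceil(required_tps / gpu_throughput_tps))
    return replicas, replicas * gpu_cost_per_month

# e.g. 2 req/s x 300 tokens = 600 tok/s; at 400 tok/s per GPU
# and a hypothetical $400/month per GPU:
plan_capacity(2.0, 300.0, 400.0, 400.0)  # → (2, 800.0)
```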
