Commit a0d0ee2

docs(M1.3.1): Add state machine design document (all 6 sections approved) (#208)
- SECTION 1: States & Transitions (10 core states, valid transitions)
- SECTION 2: Guard Conditions & Data Flow (health thresholds, hybrid execution)
- SECTION 3: Async Execution & Message Bus (Zenoh, callbacks, priority queue)
- SECTION 4: Error Recovery & Resilience (retry, circuit breaker, fallbacks)
- SECTION 5: Pandora Integration & Persistence (Neo4j schema, queries, heat-maps)
- SECTION 6: Testing Strategy (40+ unit, 15+ integration, 5 acceptance tests)

Issue: #207
Branch: feature/M1.3.1-mimi-state-machine
1 parent 0cfe5c9 commit a0d0ee2

1 file changed

Lines changed: 72 additions & 0 deletions

@@ -0,0 +1,72 @@
# M1.3.1 Mimi State Machine Design Document

**Issue**: #207
**Branch**: `feature/M1.3.1-mimi-state-machine`
**Date**: 2026-04-17
**Status**: DESIGN COMPLETE ✓

---

## Executive Summary

M1.3.1 defines the **Mimi State Machine**, the orchestrator's core lifecycle-management component. It governs the Mimi system from receiving user instructions through routing, memory access, and task execution to graceful shutdown.

The design specifies a **10-state finite state machine** with explicit guard conditions, blocking and async execution modes, hybrid error recovery, and deep integration with Pandora (Neo4j) for memory tracking and observability.

---

## SECTIONS 1-6: DESIGN COMPLETE ✓

### SECTION 1: States & Transitions ✓
- 10 core states: IDLE, LISTENING, PROCESSING, EXECUTING, RESPONDING, DEGRADED, RECOVERING, FAILED_COMPONENT, CRITICAL_ERROR, SHUTDOWN
- Valid state transitions with guard conditions
- Auto-escalation rules for errors and failures

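As a rough illustration of Section 1, the ten states can be modeled as a Rust enum with a transition check. The state names come from the list above; the transition table, the `can_transition_to` and `is_terminal` helpers, and the choice of SHUTDOWN as the only terminal state are assumptions made for this sketch, since the approved transition rules are not reproduced in this summary.

```rust
/// The ten core states listed in Section 1.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub enum MimiState {
    Idle,
    Listening,
    Processing,
    Executing,
    Responding,
    Degraded,
    Recovering,
    FailedComponent,
    CriticalError,
    Shutdown,
}

impl MimiState {
    /// ASSUMPTION: Shutdown is treated as the only terminal state.
    pub fn is_terminal(self) -> bool {
        matches!(self, MimiState::Shutdown)
    }

    /// HYPOTHETICAL transition table, for illustration only.
    /// The real guard conditions live in the full design sections.
    pub fn can_transition_to(self, next: MimiState) -> bool {
        use MimiState::*;
        // No transitions out of a terminal state, and no self-loops.
        if self.is_terminal() || self == next {
            return false;
        }
        match (self, next) {
            // Assumed happy path for one user instruction.
            (Idle, Listening)
            | (Listening, Processing)
            | (Processing, Executing)
            | (Executing, Responding)
            | (Responding, Idle) => true,
            // Assumed recovery path.
            (Degraded, Recovering)
            | (FailedComponent, Recovering)
            | (Recovering, Idle) => true,
            // Assumed auto-escalation: any live state may escalate or shut down.
            (_, Degraded) | (_, FailedComponent) | (_, CriticalError) | (_, Shutdown) => true,
            _ => false,
        }
    }
}
```

Encoding the table as a single `match` keeps every permitted edge visible in one place, which makes it straightforward to compare against the approved transition list once that is available.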
### SECTION 2: Guard Conditions & Data Flow ✓
- Component health thresholds (latency >5s, memory >80%, heartbeat >30s)
- Hybrid execution model (blocking for simple, async with callbacks for complex)
- Task queue structure with priority and capacity limits

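The health thresholds in Section 2 could be expressed as a guard along these lines. The three limits are taken from the bullet above; the `ComponentHealth` struct, the `HealthViolation` enum, the `check_health` function, and the reading of "heartbeat >30s" as time since the last heartbeat are assumptions for illustration.

```rust
use std::time::Duration;

// Thresholds from Section 2.
pub const MAX_LATENCY: Duration = Duration::from_secs(5);
pub const MAX_MEMORY_PERCENT: f32 = 80.0;
// ASSUMPTION: "heartbeat >30s" means more than 30s since the last heartbeat.
pub const MAX_HEARTBEAT_GAP: Duration = Duration::from_secs(30);

/// Hypothetical snapshot of one component's health metrics.
#[derive(Debug, Clone, Copy)]
pub struct ComponentHealth {
    pub latency: Duration,
    pub memory_percent: f32,
    pub since_last_heartbeat: Duration,
}

/// Hypothetical violation kinds, one per threshold.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum HealthViolation {
    HighLatency,
    HighMemory,
    MissedHeartbeat,
}

/// Returns every threshold the component currently exceeds.
/// An empty vector means the guard passes.
pub fn check_health(h: &ComponentHealth) -> Vec<HealthViolation> {
    let mut violations = Vec::new();
    if h.latency > MAX_LATENCY {
        violations.push(HealthViolation::HighLatency);
    }
    if h.memory_percent > MAX_MEMORY_PERCENT {
        violations.push(HealthViolation::HighMemory);
    }
    if h.since_last_heartbeat > MAX_HEARTBEAT_GAP {
        violations.push(HealthViolation::MissedHeartbeat);
    }
    violations
}
```

Returning the full list of violations, rather than a single boolean, leaves room for the auto-escalation rules to distinguish a degraded component from a failed one.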
### SECTION 3: Async Execution & Message Bus Integration ✓
- Zenoh topic hierarchy (control, tasks, memory, errors)
- Task lifecycle (queue → dispatch → execute → callback → complete)
- Non-blocking subscription pattern with tokio::select!
- Priority queueing and backpressure handling

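The priority queueing and backpressure item above can be sketched with the standard library alone. This is not the Zenoh or tokio integration itself; the `Task`, `TaskQueue`, and `QueueFull` names, the "higher number runs first" convention, and FIFO ordering within a priority level are all assumptions for this example.

```rust
use std::cmp::Ordering;
use std::collections::BinaryHeap;

/// Hypothetical queued task. ASSUMPTION: a higher `priority` runs first.
#[derive(Debug, PartialEq, Eq)]
pub struct Task {
    pub priority: u8,
    pub seq: u64, // insertion order, used to keep FIFO order within a priority
    pub name: String,
}

impl Ord for Task {
    fn cmp(&self, other: &Self) -> Ordering {
        // Higher priority first; for equal priority, the older task (smaller seq) first.
        self.priority
            .cmp(&other.priority)
            .then_with(|| other.seq.cmp(&self.seq))
    }
}

impl PartialOrd for Task {
    fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
        Some(self.cmp(other))
    }
}

/// Error returned when the queue is at capacity.
#[derive(Debug, PartialEq, Eq)]
pub struct QueueFull;

/// Bounded priority queue: rejecting pushes at capacity is the backpressure signal.
pub struct TaskQueue {
    heap: BinaryHeap<Task>,
    capacity: usize,
    next_seq: u64,
}

impl TaskQueue {
    pub fn new(capacity: usize) -> Self {
        TaskQueue { heap: BinaryHeap::new(), capacity, next_seq: 0 }
    }

    /// Enqueues a task, or returns `QueueFull` so the caller can defer or shed load.
    pub fn try_push(&mut self, priority: u8, name: &str) -> Result<(), QueueFull> {
        if self.heap.len() >= self.capacity {
            return Err(QueueFull);
        }
        let task = Task { priority, seq: self.next_seq, name: name.to_string() };
        self.next_seq += 1;
        self.heap.push(task);
        Ok(())
    }

    /// Removes and returns the highest-priority task, if any.
    pub fn pop(&mut self) -> Option<Task> {
        self.heap.pop()
    }
}
```

In an async setting the same idea would more likely be built on a bounded channel, where a full channel suspends the producer instead of returning an error; the synchronous version here only shows the ordering and capacity logic.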
### SECTION 4: Error Recovery & Resilience Patterns ✓
- Exponential backoff retry strategy (100ms → 5s with jitter)
- Circuit breaker pattern (Closed → Open → HalfOpen)
- Cascade fallback strategies (Retry → Failover → Defer → Cache → Fail)
- Component health monitoring and auto-escalation

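A minimal sketch of the retry and circuit-breaker items above, under stated assumptions. The 100ms base, the 5s cap, and the three breaker states come from the list; the doubling factor, the jitter range of 50% to 100% of the capped delay, the caller-supplied `jitter_unit`, the failure threshold, and every type and method name are hypothetical.

```rust
use std::time::Duration;

const BASE_DELAY_MS: u64 = 100; // from Section 4
const MAX_DELAY_MS: u64 = 5_000; // from Section 4

/// Delay before retry number `attempt` (0-based).
/// ASSUMPTIONS: the delay doubles per attempt, and `jitter_unit` in [0, 1]
/// (for example from an RNG) scales it into 50%..100% of the capped value.
pub fn backoff_delay(attempt: u32, jitter_unit: f64) -> Duration {
    // Clamp the shift so the multiplication cannot overflow.
    let exponential = BASE_DELAY_MS.saturating_mul(1u64 << attempt.min(16));
    let capped = exponential.min(MAX_DELAY_MS);
    let jitter = jitter_unit.clamp(0.0, 1.0);
    Duration::from_millis((capped as f64 * (0.5 + 0.5 * jitter)) as u64)
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum CircuitState {
    Closed,
    Open,
    HalfOpen,
}

/// Hypothetical breaker: opens after `failure_threshold` consecutive failures.
pub struct CircuitBreaker {
    state: CircuitState,
    consecutive_failures: u32,
    failure_threshold: u32,
}

impl CircuitBreaker {
    pub fn new(failure_threshold: u32) -> Self {
        CircuitBreaker { state: CircuitState::Closed, consecutive_failures: 0, failure_threshold }
    }

    pub fn state(&self) -> CircuitState {
        self.state
    }

    /// Requests pass while Closed, and as trial requests while HalfOpen.
    pub fn allows_request(&self) -> bool {
        self.state != CircuitState::Open
    }

    /// Any success closes the circuit and resets the failure count.
    pub fn record_success(&mut self) {
        self.consecutive_failures = 0;
        self.state = CircuitState::Closed;
    }

    /// A failed trial in HalfOpen, or reaching the threshold, opens the circuit.
    pub fn record_failure(&mut self) {
        self.consecutive_failures += 1;
        if self.state == CircuitState::HalfOpen
            || self.consecutive_failures >= self.failure_threshold
        {
            self.state = CircuitState::Open;
        }
    }

    /// Called by the owner once the cool-down has elapsed (timing is left to the caller).
    pub fn cooldown_elapsed(&mut self) {
        if self.state == CircuitState::Open {
            self.state = CircuitState::HalfOpen;
        }
    }
}
```

Passing the jitter value in as a parameter keeps the function deterministic and easy to unit test; a production version would presumably draw it from a random source.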
### SECTION 5: Pandora Integration & State Persistence ✓
- Selective persistence (high-value transitions only)
- Neo4j graph schema (StateChange, Task, Component, Error nodes)
- Pandora query patterns and heat-map generation
- Root cause analysis via error tracking

### SECTION 6: Testing Strategy & Verification ✓
- 40+ unit tests (states, guards, tasks)
- 15+ integration tests (bus, Pandora, components)
- 5 acceptance tests (workflows, recovery, failures)
- 95%+ code coverage goal

---

## Approval Status

**ALL 6 SECTIONS APPROVED BY USER**

**Design is COMPLETE and READY for Implementation.**

Next steps:
1. Create implementation plan
2. Push branch and open PR
3. Begin M1.3.2 implementation

---

## Full Design Sections

See sections 1-6 detailed in the terminal output above.
