|
| 1 | +# M1.5.5: Adapter Performance Monitoring |
| 2 | + |
| 3 | +## Overview |
| 4 | + |
| 5 | +M1.5.5 extends the AI adapter system with comprehensive performance tracking and adaptive timeout adjustment. This enables the system to: |
| 6 | + |
| 7 | +- Track per-adapter latency percentiles (p50, p95, p99) |
| 8 | +- Detect performance degradation automatically |
| 9 | +- Adjust timeouts based on actual performance metrics |
| 10 | +- Recommend best-performing adapters |
| 11 | +- Report system-wide health status |
| 12 | + |
| 13 | +## Architecture |
| 14 | + |
| 15 | +### Components |
| 16 | + |
| 17 | +#### 1. **PerformanceMetrics** (`performance.rs`) |
| 18 | +Tracks real-time metrics for individual adapters: |
| 19 | +- Maintains a ring buffer of the last 100 latencies |
| 20 | +- Calculates success rate (successes / total requests) |
| 21 | +- Computes latency percentiles |
| 22 | +- Detects performance degradation (50%+ latency increase) |
| 23 | +- Tracks throughput (requests per second) |
| 24 | + |
| 25 | +```rust |
| 26 | +pub struct PerformanceMetrics { |
| 27 | + pub adapter_name: String, |
| 28 | + pub latencies: VecDeque<u32>, // ring buffer of the last 100 latencies (ms) |
| 29 | + pub successful_requests: u32, |
| 30 | + pub failed_requests: u32, |
| 31 | + pub last_recorded: chrono::DateTime<chrono::Utc>, |
| 32 | + pub throughput_rps: f32, |
| 33 | +} |
| 34 | +``` |
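The bounded latency window can be sketched as follows. This is a minimal illustration, assuming that a new sample evicts the oldest one once 100 samples are stored; `LatencyWindow`, `WINDOW_CAPACITY`, `record`, and `oldest` are hypothetical names and are not taken from `performance.rs`.

```rust
use std::collections::VecDeque;

/// Assumed capacity, matching the "last 100 latencies" described above.
const WINDOW_CAPACITY: usize = 100;

/// Hypothetical stand-in for the `latencies` field of `PerformanceMetrics`.
pub struct LatencyWindow {
    samples: VecDeque<u32>,
}

impl LatencyWindow {
    pub fn new() -> Self {
        Self { samples: VecDeque::with_capacity(WINDOW_CAPACITY) }
    }

    /// Record a latency in milliseconds, evicting the oldest sample when full.
    pub fn record(&mut self, latency_ms: u32) {
        if self.samples.len() == WINDOW_CAPACITY {
            self.samples.pop_front();
        }
        self.samples.push_back(latency_ms);
    }

    /// Number of samples currently held (never more than the capacity).
    pub fn len(&self) -> usize {
        self.samples.len()
    }

    /// Oldest sample still in the window, if any.
    pub fn oldest(&self) -> Option<u32> {
        self.samples.front().copied()
    }
}
```

Because the window is bounded, percentiles computed from it reflect only recent behaviour, which is presumably why a fixed size was chosen over an unbounded history.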
| 35 | + |
| 36 | +#### 2. **AdaptiveTimeout** (`adaptive_timeout.rs`) |
| 37 | +Adjusts timeout values based on performance metrics: |
| 38 | +- Baseline timeout (default 30000ms) |
| 39 | +- Calculates the new timeout as p95 latency scaled by a margin (default 120% of p95) |
| 40 | +- Respects a minimum adjustment interval (default 5 seconds) to prevent thrashing |
| 41 | +- Never goes below the baseline |
| 42 | + |
| 43 | +```rust |
| 44 | +pub struct AdaptiveTimeout { |
| 45 | + pub adapter_name: String, |
| 46 | + pub baseline_ms: u32, |
| 47 | + pub current_timeout_ms: u32, |
| 48 | + pub p95_margin_percent: f32, // e.g., 120% |
| 49 | + pub last_adjusted: chrono::DateTime<chrono::Utc>, |
| 50 | + pub adjustment_interval_secs: u64, |
| 51 | +} |
| 52 | +``` |
| 53 | + |
| 54 | +#### 3. **PerformanceTracker** (`performance_tracker.rs`) |
| 55 | +Aggregates metrics across all adapters and provides high-level insights: |
| 56 | +- Manages metrics + timeout for each adapter |
| 57 | +- Recommends best-performing adapter |
| 58 | +- Generates performance reports with system health status |
| 59 | +- Updates all timeouts based on current metrics |
| 60 | + |
| 61 | +```rust |
| 62 | +pub struct PerformanceTracker { |
| 63 | + adapters: RwLock<HashMap<String, (Arc<RwLock<PerformanceMetrics>>, Arc<RwLock<AdaptiveTimeout>>)>>, |
| 64 | +} |
| 65 | +``` |
| 66 | + |
| 67 | +#### 4. **PerformanceReport** |
| 68 | +Aggregated view of system performance: |
| 69 | + |
| 70 | +```rust |
| 71 | +pub struct PerformanceReport { |
| 72 | + pub timestamp: chrono::DateTime<chrono::Utc>, |
| 73 | + pub adapters: Vec<AdapterPerformance>, |
| 74 | + pub overall_health: SystemHealth, |
| 75 | +} |
| 76 | + |
| 77 | +pub enum SystemHealth { |
| 78 | + Healthy, |
| 79 | + Degraded, |
| 80 | + Critical, |
| 81 | +} |
| 82 | +``` |
| 83 | + |
| 84 | +### Registry Integration |
| 85 | + |
| 86 | +`AdapterRegistry` now optionally includes a `PerformanceTracker`: |
| 87 | + |
| 88 | +```rust |
| 89 | +pub struct AdapterRegistry { |
| 90 | + pub adapters: RwLock<HashMap<String, (SharedAdapter, Arc<RwLock<AdapterHealth>>)>>, |
| 91 | + pub priority: Vec<String>, |
| 92 | + pub performance_tracker: Option<Arc<PerformanceTracker>>, |
| 93 | +} |
| 94 | +``` |
| 95 | + |
| 96 | +New constructors: |
| 97 | +- `new_with_tracker(tracker: Arc<PerformanceTracker>)` - Create registry with tracker |
| 98 | +- `with_priority_and_tracker(priority, tracker)` - Custom priority + tracker |
| 99 | + |
| 100 | +New methods: |
| 101 | +- `get_timeout(adapter_name: &str) -> Result<u32>` - Get current timeout |
| 102 | +- `get_performance_report() -> Result<PerformanceReport>` - Get system report |
| 103 | + |
| 104 | +## Usage Flow |
| 105 | + |
| 106 | +### Basic Performance Tracking |
| 107 | + |
| 108 | +```rust |
| 109 | +let tracker = Arc::new(PerformanceTracker::new()); |
| 110 | + |
| 111 | +tracker.register("gemini".to_string(), 30000); // 30s baseline |
| 112 | +tracker.register("ollama".to_string(), 20000); // 20s baseline |
| 113 | + |
| 114 | +tracker.record_success("gemini", 150).ok(); // 150ms latency |
| 115 | +tracker.record_success("ollama", 120).ok(); |
| 116 | +tracker.record_failure("ollama").ok(); // failed request |
| 117 | + |
| 118 | +let report = tracker.get_performance_report(); |
| 119 | +println!("{:?}", report.overall_health); |
| 120 | +``` |
| 121 | + |
| 122 | +### Adapter Selection with Performance |
| 123 | + |
| 124 | +```rust |
| 125 | +let recommended = tracker.recommend_adapter().unwrap(); |
| 126 | +println!("Use adapter: {}", recommended); // Returns best performer |
| 127 | +``` |
| 128 | + |
| 129 | +### Timeout Adjustment |
| 130 | + |
| 131 | +```rust |
| 132 | +tracker.update_all_timeouts().ok(); // Adjust based on metrics |
| 133 | + |
| 134 | +let new_timeout = tracker.get_timeout("gemini").unwrap(); |
| 135 | +``` |
| 136 | + |
| 137 | +### Registry Integration |
| 138 | + |
| 139 | +```rust |
| 140 | +let tracker = Arc::new(PerformanceTracker::new()); |
| 141 | +let registry = AdapterRegistry::new_with_tracker(tracker.clone()); |
| 142 | + |
| 143 | +tracker.register("gemini".to_string(), 30000); |
| 144 | +let timeout = registry.get_timeout("gemini")?; |
| 145 | +let report = registry.get_performance_report()?; |
| 146 | +``` |
| 147 | + |
| 148 | +## Algorithm Details |
| 149 | + |
| 150 | +### Success Rate Calculation |
| 151 | +``` |
| 152 | +success_rate = (successful_requests / (successful_requests + failed_requests)) * 100 |
| 153 | +``` |
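A small sketch of the success-rate formula in Rust. The formula does not say what happens before any request has been recorded, so returning `0.0` for that case is an assumption of this example, and `success_rate` as a free function is a hypothetical shape rather than the actual method.

```rust
/// Success rate as a percentage in the range 0.0..=100.0.
/// Returning 0.0 when nothing has been recorded is an assumption of this sketch.
pub fn success_rate(successful_requests: u32, failed_requests: u32) -> f32 {
    // Widen before adding so the sum cannot overflow u32.
    let total = successful_requests as u64 + failed_requests as u64;
    if total == 0 {
        return 0.0;
    }
    // Convert to floating point before dividing; integer division would
    // truncate every partial rate to 0.
    (successful_requests as f32 / total as f32) * 100.0
}
```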
| 154 | + |
| 155 | +### Latency Percentile Calculation |
| 156 | +1. Sort all recorded latencies |
| 157 | +2. Find value at (p / 100) * length position |
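The two steps above can be sketched as a standalone function. This is an illustration under stated assumptions: the index is computed with integer arithmetic and clamped to the last element (so that `p = 100` does not index past the end), and an empty input yields `None`. The name `percentile` and its signature are hypothetical.

```rust
/// Percentile by sort-and-index, following the two steps above.
/// `p` is the percentile in the range 0..=100.
pub fn percentile(latencies: &[u32], p: u8) -> Option<u32> {
    if latencies.is_empty() {
        return None;
    }
    // Step 1: sort a copy so the caller's ordering is left untouched.
    let mut sorted = latencies.to_vec();
    sorted.sort_unstable();
    // Step 2: index at (p / 100) * length, clamped to the last element
    // (the clamp is an assumption of this sketch).
    let idx = (p as usize * sorted.len() / 100).min(sorted.len() - 1);
    Some(sorted[idx])
}
```

With at most 100 samples per adapter, sorting a copy on each query is cheap, which fits the "calculated on demand" approach.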
| 158 | + |
| 159 | +### Adaptive Timeout Calculation |
| 160 | +``` |
| 161 | +new_timeout = max( |
| 162 | + (p95_latency * p95_margin_percent) / 100, |
| 163 | + baseline_ms |
| 164 | +) |
| 165 | +``` |
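A hedged sketch of how the formula and the minimum adjustment interval might fit together. `TimeoutSketch`, `compute`, and `adjust` are hypothetical names, and the example uses `std::time::Instant` rather than the `chrono` timestamps of `AdaptiveTimeout` so that it depends only on the standard library.

```rust
use std::time::{Duration, Instant};

/// Hypothetical, simplified counterpart of `AdaptiveTimeout`.
pub struct TimeoutSketch {
    pub baseline_ms: u32,
    pub current_timeout_ms: u32,
    pub p95_margin_percent: f32,
    pub min_interval: Duration,
    pub last_adjusted: Option<Instant>,
}

impl TimeoutSketch {
    /// Uses the documented defaults: 120% margin, 5 second interval.
    pub fn new(baseline_ms: u32) -> Self {
        Self {
            baseline_ms,
            current_timeout_ms: baseline_ms,
            p95_margin_percent: 120.0,
            min_interval: Duration::from_secs(5),
            last_adjusted: None,
        }
    }

    /// max(p95 * margin / 100, baseline), as in the formula above.
    pub fn compute(&self, p95_latency_ms: u32) -> u32 {
        let scaled = (p95_latency_ms as f32 * self.p95_margin_percent / 100.0) as u32;
        scaled.max(self.baseline_ms)
    }

    /// Applies the formula unless the previous adjustment was too recent.
    /// Returns true when the timeout was recomputed.
    pub fn adjust(&mut self, p95_latency_ms: u32, now: Instant) -> bool {
        if let Some(last) = self.last_adjusted {
            if now.duration_since(last) < self.min_interval {
                return false; // too soon: skip to avoid thrashing
            }
        }
        self.current_timeout_ms = self.compute(p95_latency_ms);
        self.last_adjusted = Some(now);
        true
    }
}
```

Under these assumptions, a p95 of 40000 ms against a 30000 ms baseline gives a 48000 ms timeout, while a p95 of 1000 ms leaves the timeout at the baseline.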
| 166 | + |
| 167 | +### Degradation Detection |
| 168 | +- Compares p95 latency of recent half vs older half |
| 169 | +- Flags if recent > older by 50%+ |
| 170 | +- Requires minimum 20 recorded latencies |
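The degradation check described above might look like the following sketch. It assumes the samples are ordered oldest to newest, that the window is split at its midpoint, and that "50%+" means the recent p95 is at least 1.5 times the older p95. The helper names `p95` and `is_degraded` are hypothetical.

```rust
/// p95 by sort-and-index; `None` for an empty slice.
fn p95(samples: &[u32]) -> Option<u32> {
    if samples.is_empty() {
        return None;
    }
    let mut sorted = samples.to_vec();
    sorted.sort_unstable();
    let idx = (95 * sorted.len() / 100).min(sorted.len() - 1);
    Some(sorted[idx])
}

/// Returns true when the recent half of the window is at least 50% slower
/// (by p95) than the older half. `latencies` is ordered oldest to newest.
pub fn is_degraded(latencies: &[u32]) -> bool {
    const MIN_SAMPLES: usize = 20;
    if latencies.len() < MIN_SAMPLES {
        return false; // not enough data to compare the two halves
    }
    let (older, recent) = latencies.split_at(latencies.len() / 2);
    match (p95(older), p95(recent)) {
        (Some(old), Some(new)) => new as f64 >= old as f64 * 1.5,
        _ => false,
    }
}
```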
| 171 | + |
| 172 | +### Adapter Recommendation Score |
| 173 | +``` |
| 174 | +score = success_rate - (average_latency / 100) |
| 175 | +``` |
| 176 | +Recommends adapter with highest score. |
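A minimal sketch of the scoring rule, assuming `success_rate` is a percentage and `average_latency` is in milliseconds. `AdapterStats`, `score`, and `recommend` are hypothetical names used only for illustration.

```rust
/// Hypothetical per-adapter summary used only by this sketch.
pub struct AdapterStats {
    pub name: String,
    pub success_rate: f32,       // percent, 0.0..=100.0
    pub average_latency_ms: f32, // milliseconds
}

/// score = success_rate - (average_latency / 100)
pub fn score(stats: &AdapterStats) -> f32 {
    stats.success_rate - stats.average_latency_ms / 100.0
}

/// Name of the adapter with the highest score, or `None` if the slice is empty.
pub fn recommend(adapters: &[AdapterStats]) -> Option<&str> {
    adapters
        .iter()
        .max_by(|a, b| score(a).total_cmp(&score(b)))
        .map(|a| a.name.as_str())
}
```

With this weighting, every 100 ms of average latency costs the same as one percentage point of success rate, so a slow but perfectly reliable adapter can still lose to a faster, slightly less reliable one.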
| 177 | + |
| 178 | +### System Health Status |
| 179 | +- **Healthy**: All adapters healthy, success rates > 50% |
| 180 | +- **Degraded**: More than 50% of adapters degraded |
| 181 | +- **Critical**: Any adapter with success rate < 50% |
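One way to read the three rules above as code. The rules do not state a precedence, so this sketch assumes `Critical` is checked first, then `Degraded`, and that everything else (including an empty adapter list) counts as `Healthy`. `AdapterStatus` and `overall_health` are hypothetical, and the enum is redeclared here only to keep the example self-contained.

```rust
#[derive(Debug, PartialEq)]
pub enum SystemHealth {
    Healthy,
    Degraded,
    Critical,
}

/// Hypothetical per-adapter summary used only by this sketch.
pub struct AdapterStatus {
    pub success_rate: f32, // percent
    pub degraded: bool,    // result of the degradation check
}

pub fn overall_health(adapters: &[AdapterStatus]) -> SystemHealth {
    // Critical: any adapter with a success rate below 50%.
    if adapters.iter().any(|a| a.success_rate < 50.0) {
        return SystemHealth::Critical;
    }
    // Degraded: strictly more than half of the adapters are degraded.
    let degraded = adapters.iter().filter(|a| a.degraded).count();
    if degraded * 2 > adapters.len() {
        return SystemHealth::Degraded;
    }
    SystemHealth::Healthy
}
```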
| 182 | + |
| 183 | +## Data Flow |
| 184 | + |
| 185 | +``` |
| 186 | +User Request |
| 187 | + ↓ |
| 188 | +Record Success/Failure |
| 189 | + ↓ |
| 190 | +Update Metrics (latency, counts) |
| 191 | + ↓ |
| 192 | +(Every 5+ seconds) Adjust Timeout based on p95 |
| 193 | + ↓ |
| 194 | +(Query) Get Performance Report |
| 195 | + ↓ |
| 196 | +Recommend Best Adapter |
| 197 | + ↓ |
| 198 | +Return Metrics |
| 199 | +``` |
| 200 | + |
| 201 | +## Test Coverage |
| 202 | + |
| 203 | +**Unit Tests** (30): |
| 204 | +- PerformanceMetrics: 10 tests |
| 205 | +- AdaptiveTimeout: 8 tests |
| 206 | +- PerformanceTracker: 12 tests |
| 207 | + |
| 208 | +**Integration Tests** (17): |
| 209 | +- Registry + Performance Tracker: 5 tests |
| 210 | +- Factory + Performance Tracker: 3 tests |
| 211 | + |
| 212 | +**Full Integration Tests** (9): |
| 213 | +- End-to-end flows with all components |
| 214 | + |
| 215 | +**Total: 56+ tests** |
| 216 | + |
| 217 | +## Configuration |
| 218 | + |
| 219 | +### PerformanceMetrics |
| 220 | +- Latency ring buffer: 100 entries |
| 221 | +- Percentiles calculated on demand |
| 222 | + |
| 223 | +### AdaptiveTimeout |
| 224 | +- Baseline timeout: configurable per adapter |
| 225 | +- P95 margin: 120% (configurable) |
| 226 | +- Min adjustment interval: 5 seconds |
| 227 | +- Never goes below baseline |
| 228 | + |
| 229 | +### PerformanceTracker |
| 230 | +- Auto-cleanup on `update_all_timeouts()` |
| 231 | +- Recommends the adapter with the highest score (`success_rate - average_latency / 100`) |
| 232 | + |
| 233 | +## Error Handling |
| 234 | + |
| 235 | +All operations return `AdapterResult<T>`: |
| 236 | +- `AdapterNotFound`: Adapter not registered |
| 237 | +- `AllAdaptersFailed`: No adapters available / tracker not initialized |
| 238 | + |
| 239 | +## Future Enhancements |
| 240 | + |
| 241 | +1. **Persistence**: Save metrics to storage between sessions |
| 242 | +2. **Thresholds**: Configurable degradation detection thresholds |
| 243 | +3. **Alerts**: Notify when adapter performance drops below threshold |
| 244 | +4. **Metrics Export**: Prometheus/Grafana integration |
| 245 | +5. **Per-Model Metrics**: Track performance per model type |
| 246 | +6. **Cost Tracking**: Integrate with PerformanceTracker for cost optimization |
| 247 | + |
| 248 | +## Related Milestones |
| 249 | + |
| 250 | +- **M1.5.1-M1.5.4**: AI adapter infrastructure |
| 251 | +- **M1.6**: CLI integration of adapters with monitoring |