| 3 | Adapter parity + transport dedup | ✅ Done | `BaseBenchmarkClient` extracted to `packages/benchmarks/lib/`. All three adapters subclass + override `_send`. **Eliza $0 cost bug fixed** (server.ts propagates `usage` to HTTP response). 164 tests pass. Commit `6e1d8b200d`. |
| 4 | Voice benchmarks (VoiceBench, MMAU, VoiceAgentBench) | 🟡 2 of 3 done | **VoiceBench** at `b82b302ccc` (`voicebench-quality/`, 25 tests). **MMAU** at `dc24e51b4c` (`packages/benchmarks/mmau/`, 53 tests). **VoiceAgentBench** still in flight (agent `a58604a002051bfe7`). |
| 5 | Registry split + stub purge | ⛔ Blocked | Wait for VoiceAgentBench to land its registry entry. Then split `registry.py` (now ~44 entries) by domain. |
| P0-6 | Inline LIFE_CREATE wire shape into `_TOOL_DESCRIPTIONS` | 🟡 In flight | Agent `aaeae1fd70b8363d0`. |
| P0-7 | Bench-server role seeding for `scope_global_vs_user` | ⛔ Not started | Holding: needs a design pass on runner cooperation. To start after the P0 batch lands. |
| P0-8 | Stop read-only ops gifting `state_hash_match` | ⛔ Not started | Holding: coordinate with the P0-1 measurement so the lift attribution is clean. |