Summary
Add per-run cost and duration tracking to Orchestra. Every LLM invocation will record token counts, cost (USD), duration, and phase metadata. Aggregated metrics are surfaced per thread, with configurable budget caps that integrate with the existing HITL interrupt system. The frontend displays real-time cost badges and a detailed breakdown panel.
Key Deliverables
- Per-turn metrics capture: model, input/output tokens, cost, duration, phase (planning, execution, reflection, tool_call)
- Aggregate metrics: per-thread and per-phase cost summaries
- Provider pricing tables: hardcoded defaults for OpenAI, Anthropic, Google models; admin-configurable overrides via API
- Real-time cost display: running cost badge in chat header, updated via SSE streaming events
- Budget caps: per-thread and per-assistant budget limits with configurable actions (pause via HITL interrupt, warn, stop)
- Historical cost data: per-turn breakdown and cost analytics accessible through API and frontend panel
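A minimal sketch of how a per-turn cost could be derived from a pricing table with admin overrides. The function name turn_cost_usd, the table layout, the model names, and the per-million-token prices are all hypothetical placeholders, not values from the spec or from any provider.

```python
from decimal import Decimal

# Hypothetical defaults: USD per one million tokens. Placeholder numbers only,
# not real provider prices.
DEFAULT_PRICING = {
    "example-model-small": {"input": Decimal("1.00"), "output": Decimal("2.00")},
    "example-model-large": {"input": Decimal("10.00"), "output": Decimal("30.00")},
}

MILLION = Decimal(1_000_000)


def turn_cost_usd(model, input_tokens, output_tokens, overrides=None):
    """Return the USD cost of one LLM invocation.

    `overrides` has the same shape as DEFAULT_PRICING and replaces the
    default entry for any model it names (assumed override semantics).
    """
    pricing = {**DEFAULT_PRICING, **(overrides or {})}
    if model not in pricing:
        raise KeyError(f"no pricing configured for model {model!r}")
    rates = pricing[model]
    total = (
        Decimal(input_tokens) * rates["input"]
        + Decimal(output_tokens) * rates["output"]
    )
    return total / MILLION
```

Decimal is used here rather than float so that small per-turn amounts add up without rounding drift when aggregated per thread.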
Implementation Scope
Backend
- New schemas: TurnMetrics, ThreadCostSummary, RunBudget in backend/src/schemas/entities/metrics.py
- Pricing constants in backend/src/constants/pricing.py
- Metrics capture middleware in backend/src/utils/metrics.py
- Budget enforcement service with HITL integration
- REST API endpoints for metrics, pricing CRUD, and budget management
- Alembic migrations for turn_metrics, run_budgets, and pricing_overrides tables
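To illustrate the per-thread and per-phase aggregation, here is a rough sketch using plain dataclasses. The schema names TurnMetrics and ThreadCostSummary come from this issue, but every field name and the helper summarize_thread are assumptions; the real schemas may use a different base class and different fields.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from decimal import Decimal

# Phase values as listed in this issue.
PHASES = ("planning", "execution", "reflection", "tool_call")


@dataclass(frozen=True)
class TurnMetrics:
    # Field names are illustrative assumptions.
    thread_id: str
    model: str
    input_tokens: int
    output_tokens: int
    cost_usd: Decimal
    duration_ms: int
    phase: str

    def __post_init__(self):
        if self.phase not in PHASES:
            raise ValueError(f"unknown phase: {self.phase!r}")


@dataclass
class ThreadCostSummary:
    thread_id: str
    total_cost_usd: Decimal = Decimal("0")
    total_duration_ms: int = 0
    cost_by_phase: dict = field(default_factory=dict)


def summarize_thread(thread_id, turns):
    """Aggregate the turns that belong to `thread_id` into one summary."""
    summary = ThreadCostSummary(thread_id=thread_id)
    by_phase = defaultdict(Decimal)  # Decimal() defaults to 0
    for turn in turns:
        if turn.thread_id != thread_id:
            continue
        summary.total_cost_usd += turn.cost_usd
        summary.total_duration_ms += turn.duration_ms
        by_phase[turn.phase] += turn.cost_usd
    summary.cost_by_phase = dict(by_phase)
    return summary
```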
Frontend
- Running cost badge in chat header
- Cost breakdown panel (per-turn table, per-phase summary)
- Budget configuration dialog
- Warning indicators at 75%/100% thresholds
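The 75%/100% thresholds could be evaluated on the backend and pushed to the frontend, for example like this. This is only a sketch: the RunBudget fields, the status labels, and the budget_status helper are hypothetical, and it assumes the configured action (pause, warn, or stop) fires only once the limit is reached.

```python
from dataclasses import dataclass
from decimal import Decimal

WARN_RATIO = Decimal("0.75")   # warning indicator threshold
LIMIT_RATIO = Decimal("1.00")  # budget exhausted
ACTIONS = ("pause", "warn", "stop")


@dataclass(frozen=True)
class RunBudget:
    # Field names are illustrative assumptions.
    limit_usd: Decimal
    action: str = "pause"  # "pause" would raise a HITL interrupt


def budget_status(spent_usd, budget):
    """Return (status, action) for the current spend against the budget.

    status is "ok", "warning", or "exceeded"; action is None unless the
    limit has been reached.
    """
    if budget.limit_usd <= 0:
        raise ValueError("budget limit must be positive")
    if budget.action not in ACTIONS:
        raise ValueError(f"unknown budget action: {budget.action!r}")
    ratio = spent_usd / budget.limit_usd
    if ratio >= LIMIT_RATIO:
        return ("exceeded", budget.action)
    if ratio >= WARN_RATIO:
        return ("warning", None)
    return ("ok", None)
```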
Testing
- Unit tests for pricing calculation, budget thresholds, metrics aggregation
- Integration tests for end-to-end metrics capture and budget enforcement
- Frontend component tests
Spec
Full specification: .claude/specs/cost-tracking.md
Implementation plan: .claude/plans/cost-tracking.md
Milestone
v0.9.0 -- Harness Design