Problem
There's no way to track token consumption per agent or model. The system can waste tokens silently (e.g., context carryover on drain-ack, fixed in PR #98) with no visibility into actual costs.
Existing infrastructure
internal/sessionlog/tail.go — already extracts input_tokens, cache_read_input_tokens, cache_creation_input_tokens from session JSONL
internal/api/handler_agents.go — API exposes model, context_pct per session
internal/telemetry/recorder.go — 17 OTel metrics exist but none for tokens
- Subagent JSONL files tracked at
{slug}/{session-uuid}/subagents/
Proposed solution
Add OTel counters to the telemetry recorder:
agent.tokens.input (counter, attributes: agent_name, model)
agent.tokens.output (counter, attributes: agent_name, model)
- Optionally:
agent.tokens.cache_read, agent.tokens.cache_creation
Record from session log data at session stop or drain-ack time. Include subagent token usage in the parent agent's totals.
Why
Cost visibility is the biggest remaining gap after the session reset fixes. Operators need to identify which agents/models are burning tokens to optimize configuration.
Problem
There's no way to track token consumption per agent or model. The system can waste tokens silently (e.g., context carryover on drain-ack, fixed in PR #98) with no visibility into actual costs.
Existing infrastructure
internal/sessionlog/tail.go— already extractsinput_tokens,cache_read_input_tokens,cache_creation_input_tokensfrom session JSONLinternal/api/handler_agents.go— API exposesmodel,context_pctper sessioninternal/telemetry/recorder.go— 17 OTel metrics exist but none for tokens{slug}/{session-uuid}/subagents/Proposed solution
Add OTel counters to the telemetry recorder:
agent.tokens.input(counter, attributes: agent_name, model)agent.tokens.output(counter, attributes: agent_name, model)agent.tokens.cache_read,agent.tokens.cache_creationRecord from session log data at session stop or drain-ack time. Include subagent token usage in the parent agent's totals.
Why
Cost visibility is the biggest remaining gap after the session reset fixes. Operators need to identify which agents/models are burning tokens to optimize configuration.