feat(chat): surface per-session token usage and cost in the chat UI#1193

Open
srtab wants to merge 12 commits into main from feat/chat-session-usage-cost
srtab commented May 9, 2026

Owner

Summary

Surfaces per-session token usage and USD cost on the chat thread page. Persists denormalized cumulative totals on ChatThread, with a shared TokenUsageRecord abstract base reused by Activity. Live updates flow through a new daiv.usage_summary AG-UI CustomEvent emitted at end-of-run; initial render reads from the denormalized columns.

  • Schema: core.TokenUsageRecord abstract base with apply_usage_snapshot. Activity inherits (no-op migration). ChatThread inherits and adds wider cumulative columns + cache_read_tokens, cache_write_tokens, last_input_tokens, last_model_name, cost_priced, plus an apply_usage_delta method with sticky-null cost semantics.
  • Agent: New LastCallUsageMetadataCallbackHandler tracks the most recent LLM call. track_usage_metadata(handler_class=...) opts into richer subclasses; the return type is now generic via TypeVar.
  • Streamer: Wraps the run with track_usage_metadata, applies the delta to the thread, and emits a daiv.usage_summary CustomEvent carrying cumulative totals. A failure during the apply/emit phase is logged but does not turn a clean run into RUN_ERROR.
  • View / template: ChatThreadDetailView exposes usage_summary for hydration; chat-stream.js handles the new CUSTOM event. Rail block + responsive summary chip render cost / tokens / context-window %.
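
The "sticky-null cost semantics" of apply_usage_delta can be illustrated with a small sketch. It assumes that "sticky null" means one unpriced run turns the cumulative cost into null and later priced runs do not revive it; the `ThreadUsage` dataclass and its reduced field set are hypothetical stand-ins for the ChatThread columns, not the code in this PR.

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import Optional


@dataclass
class ThreadUsage:
    """Hypothetical stand-in for the cumulative usage columns on ChatThread."""

    input_tokens: int = 0
    output_tokens: int = 0
    cost_usd: Optional[Decimal] = Decimal("0")
    cost_priced: bool = True

    def apply_usage_delta(
        self, input_tokens: int, output_tokens: int, cost_usd: Optional[Decimal]
    ) -> None:
        # Token counters always accumulate, priced or not.
        self.input_tokens += input_tokens
        self.output_tokens += output_tokens
        if cost_usd is None or not self.cost_priced:
            # Assumed semantics: once any run is unpriced, the total is unknown
            # and stays unknown for the rest of the thread.
            self.cost_usd = None
            self.cost_priced = False
        else:
            self.cost_usd += cost_usd


usage = ThreadUsage()
usage.apply_usage_delta(100, 20, Decimal("0.0012"))  # priced run
usage.apply_usage_delta(50, 10, None)                # run on an unpriced model
usage.apply_usage_delta(30, 5, Decimal("0.0004"))    # priced again, total stays null
```

Under this reading the UI can still show token totals for the thread while rendering the cost as unavailable instead of a misleading partial sum.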

Test Plan

  • make test — 1842 tests passing
  • make lint-fix — clean
  • Manual smoke: open an existing thread → confirm rail block hydrates from server-rendered usage_summary
  • Manual smoke: send a new message → confirm rail block updates after RUN_FINISHED via the daiv.usage_summary CustomEvent
  • Manual smoke: reload the page → confirm post-run totals persist (denormalization)
  • Manual smoke: check responsive summary chip appears below 1100px viewport width

Notes

  • Plan: docs/superpowers/plans/2026-05-09-chat-session-usage-cost.md (gitignored)
  • The Activity migration is state-only — field declarations are byte-identical via the mixin, so Django reports "No changes detected".
  • The ChatThread migration adds the new columns at full width (PositiveBigIntegerField, DecimalField(12,6)).

srtab added 12 commits May 9, 2026 14:32
Collapse three hand-rolled usage_summary dicts (views, streaming,
empty-state) into a single build_thread_usage_payload helper so the
wire shape has one source of truth. Lift the SSE event name into a
shared USAGE_SUMMARY_EVENT_NAME constant.

Tighten apply_usage_delta:
- wrap Decimal coercion in try/except for both cumulative and
  per-model paths; degrade to unpriced with logger.error/warning
  including thread_id, model, and run totals
- skip usage_by_model writes (and the changed entry) when the
  delta carries no per-model data
- guard cost_priced/cost_usd appends so the except branch never
  emits redundant update_fields entries
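
A minimal sketch of the guarded Decimal coercion described above, assuming the raw cost arrives as a string or an arbitrary object. The helper name `coerce_cost` and its parameters are hypothetical; only the two exception types and the degrade-to-unpriced behaviour come from this commit message.

```python
import logging
from decimal import Decimal, InvalidOperation
from typing import Any, Optional

logger = logging.getLogger(__name__)


def coerce_cost(raw: Any, *, thread_id: str, model: str) -> Optional[Decimal]:
    """Return raw as a Decimal, or None (treated as unpriced) if coercion fails."""
    try:
        return Decimal(raw)
    except (InvalidOperation, TypeError):
        # InvalidOperation: a malformed numeric string such as "n/a".
        # TypeError: an unsupported type such as None or a dict.
        logger.warning(
            "Unpriced usage for thread %s (model %s): cannot coerce cost %r",
            thread_id,
            model,
            raw,
        )
        return None
```

Returning None instead of raising lets the caller record the token counts and mark the run as unpriced.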

Replace lru_cache on get_context_window with a manual dict that
caches only successful int lookups, so a transient registry blip
no longer turns into a permanent UI degradation for the worker's
lifetime. Extracts only the int from BaseAgent.get_model so the
full BaseChatModel client isn't retained.

Trim apply_usage_delta_to_thread to .only(...) the columns it
touches.

JS: replace the field-by-field equality check with a JSON.stringify
structural compare so future wire-shape additions don't get
silently dropped.

Tests: cover the Decimal error paths (InvalidOperation + TypeError
arms, both cumulative and per-model), the empty by_model skip,
get_context_window cache hit/miss/no-cache-on-failure, and the
build_thread_usage_payload contract. Mock BaseAgent.get_model so
the suite no longer depends on the live LangChain registry.