feat(chat): surface per-session token usage and cost in the chat UI #1193
Open
srtab wants to merge 12 commits into
Collapse three hand-rolled usage_summary dicts (views, streaming, empty-state) into a single build_thread_usage_payload helper so the wire shape has one source of truth. Lift the SSE event name into a shared USAGE_SUMMARY_EVENT_NAME constant.

Tighten apply_usage_delta:
- wrap Decimal coercion in try/except for both cumulative and per-model paths; degrade to unpriced with logger.error/warning including thread_id, model, and run totals
- skip usage_by_model writes (and the changed entry) when the delta carries no per-model data
- guard cost_priced/cost_usd appends so the except branch never emits redundant update_fields entries

Replace lru_cache on get_context_window with a manual dict that caches only successful int lookups, so a transient registry blip no longer turns into a permanent UI degradation for the worker's lifetime. Extracts only the int from BaseAgent.get_model so the full BaseChatModel client isn't retained.

Trim apply_usage_delta_to_thread to .only(...) the columns it touches.

JS: replace the field-by-field equality check with a JSON.stringify structural compare so future wire-shape additions don't get silently dropped.

Tests: cover the Decimal error paths (InvalidOperation + TypeError arms, both cumulative and per-model), the empty by_model skip, get_context_window cache hit/miss/no-cache-on-failure, and the build_thread_usage_payload contract. Mock BaseAgent.get_model so the suite no longer depends on the live LangChain registry.
Summary
Surfaces per-session token usage and USD cost on the chat thread page. Persists denormalized cumulative totals on `ChatThread`, with a shared `TokenUsageRecord` abstract base reused by `Activity`. Live updates flow through a new `daiv.usage_summary` AG-UI `CustomEvent` emitted at end-of-run; initial render reads from the denormalized columns.

- `core.TokenUsageRecord` abstract base with `apply_usage_snapshot`. `Activity` inherits (no-op migration). `ChatThread` inherits and adds wider cumulative columns + `cache_read_tokens`, `cache_write_tokens`, `last_input_tokens`, `last_model_name`, `cost_priced`, plus an `apply_usage_delta` method with sticky-null cost semantics.
- `LastCallUsageMetadataCallbackHandler` tracks the most recent LLM call. `track_usage_metadata(handler_class=...)` opts into richer subclasses; the return type is now generic via TypeVar.
- `track_usage_metadata`, applies the delta to the thread, and emits a `daiv.usage_summary` CustomEvent carrying cumulative totals. A failure during the apply/emit phase is logged but does not turn a clean run into RUN_ERROR.
- `ChatThreadDetailView` exposes `usage_summary` for hydration; chat-stream.js handles the new CUSTOM event. Rail block + responsive summary chip render cost / tokens / context-window %.

Test Plan
- `make test`: 1842 tests passing
- `make lint-fix`: clean

Notes
- `docs/superpowers/plans/2026-05-09-chat-session-usage-cost.md` (gitignored)
- `Activity` migration is state-only: field declarations are byte-identical via the mixin, so Django reports "No changes detected".
- `ChatThread` migration adds the new columns at full width (`PositiveBigIntegerField`, `DecimalField(12,6)`).
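As an illustration of the sticky-null cost semantics mentioned in the Summary, here is a rough, framework-free sketch. It assumes "sticky" means that once any run's cost cannot be priced, the cumulative cost becomes null and stays null for the thread; the `ThreadUsage` dataclass and the helper names are hypothetical and stand in for the real model fields, so treat this as one possible reading rather than the PR's implementation.

```python
from dataclasses import dataclass
from decimal import Decimal, InvalidOperation
from typing import Optional


@dataclass
class ThreadUsage:
    """Hypothetical stand-in for the cost columns on the thread."""
    cost_usd: Optional[Decimal] = Decimal("0")
    cost_priced: bool = True


def coerce_cost(raw: object) -> Optional[Decimal]:
    """Coerce a raw cost value to Decimal, returning None if it is unusable."""
    if raw is None:
        return None
    try:
        return Decimal(raw)  # str, int, and Decimal inputs convert exactly
    except (InvalidOperation, TypeError, ValueError):
        # Malformed strings raise InvalidOperation; unsupported types raise TypeError.
        return None


def apply_cost_delta(usage: ThreadUsage, raw_delta: object) -> None:
    """Add a run's cost to the cumulative total, degrading to unpriced on failure."""
    delta = coerce_cost(raw_delta)
    if not usage.cost_priced or delta is None:
        # Assumed sticky-null rule: once unpriced, the total stays null.
        usage.cost_priced = False
        usage.cost_usd = None
        return
    usage.cost_usd += delta
```

Under this assumption, a UI can check `cost_priced` to decide whether to show a dollar figure at all, instead of showing a total that silently omits unpriced runs.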