fix: use current model's context window for usage_update size#412

Open
timvisher-dd wants to merge 2 commits into zed-industries:main from timvisher-dd:timvisher/fix/usage-update-computation
Conversation


@timvisher-dd timvisher-dd commented Mar 12, 2026

The usage_update notification reports size: 200000 even when the active model has a 1M context window (e.g. opus[1m]), causing clients to display incorrect context utilization (e.g. 689k/200k (344.3%) instead of 689k/1000k (68.9%)).

Four bugs fixed:

  • Min across all models: The original code used Math.min across all modelUsage entries, so subagent models (Sonnet/Haiku with 200k windows) dragged down the reported size for the main Opus 1M model. Now tracks the top-level assistant model and looks up its context window specifically.
  • Model name mismatch: The SDK's streaming path keys modelUsage by the requested model alias (e.g. claude-opus-4-6) while BetaMessage.model on assistant messages has the resolved API response model (e.g. claude-opus-4-6-20250514). The exact-match lookup always missed, falling back to the hardcoded 200k default. Now falls back to prefix matching, preferring the longest/most-specific match.
  • Synthetic messages corrupt model tracking: /compact and similar commands emit assistant messages with model: "<synthetic>". These were updating lastAssistantModel, causing the next usage_update to miss the modelUsage lookup and fall back to the 200k default. Now filters out <synthetic> models.
  • Stale usage after compaction: No usage_update was sent on compact_boundary, so clients kept showing the pre-compaction context size (e.g. 944k/1m) right after "Compacting completed" until the next full turn. Now sends used: 0 immediately on compaction. This is a deliberate approximation — the exact post-compaction size isn't known until the SDK's next API call, which replaces it within seconds. The alternative (no update) is worse UX: showing a full context bar right after compaction.
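The context-window lookup described in the second bullet can be sketched roughly as below. This is an illustrative reconstruction, not the PR's actual code: the names `lookupContextWindow`, `ModelUsage`, and `DEFAULT_CONTEXT_WINDOW` are assumptions, and the synthetic-model filter is folded in as a small helper.

```typescript
interface ModelUsage {
  contextWindow: number;
}

const DEFAULT_CONTEXT_WINDOW = 200_000;

// Resolve the context window for the tracked assistant model. Try an exact
// key match first; otherwise prefix-match in both directions so the alias
// ("claude-opus-4-6") and the resolved API model ID
// ("claude-opus-4-6-20250514") find each other, preferring the longest
// (most specific) matching key.
function lookupContextWindow(
  modelUsage: Record<string, ModelUsage>,
  model: string,
): number {
  const exact = modelUsage[model];
  if (exact) return exact.contextWindow;

  let best: string | undefined;
  for (const key of Object.keys(modelUsage)) {
    if (model.startsWith(key) || key.startsWith(model)) {
      if (best === undefined || key.length > best.length) best = key;
    }
  }
  return best !== undefined
    ? modelUsage[best].contextWindow
    : DEFAULT_CONTEXT_WINDOW;
}

// Ignore "<synthetic>" assistant messages (e.g. from /compact) so they
// don't clobber the tracked top-level model.
function trackAssistantModel(
  current: string | undefined,
  incoming: string,
): string | undefined {
  return incoming === "<synthetic>" ? current : incoming;
}
```

Note that falling back to the longest matching key, rather than the first, matters when both a bare alias and a dated variant appear in `modelUsage`.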

Eight new tests cover: token sum correctness, current-model context window lookup, model switching, subagent isolation, prefix matching in both directions, and synthetic message filtering.

Note: bin/test (local CI validation script) is cherry-picked from #353.

Would fix agent-shell's usage indicator which currently has to defend against this broken math: xenodium/agent-shell#364

Testing

  • Ran with it manually for a while.

@cla-bot cla-bot bot added the cla-signed label Mar 12, 2026
@timvisher-dd timvisher-dd force-pushed the timvisher/fix/usage-update-computation branch from 0962531 to 3abd1a0 Compare March 12, 2026 18:16
@timvisher-dd timvisher-dd marked this pull request as ready for review March 12, 2026 18:17
@timvisher-dd timvisher-dd force-pushed the timvisher/fix/usage-update-computation branch 2 times, most recently from 36c524a to ac1745a Compare March 13, 2026 19:39
timvisher-dd and others added 2 commits March 13, 2026 16:03
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Previously usage_update.size used the minimum context window across all
models in modelUsage. Now it looks up the context window for the model
from the most recent top-level assistant message, with prefix matching
to handle alias/versioned model ID mismatches.

Also skips <synthetic> assistant messages (e.g. from /compact) so they
don't corrupt the tracked model.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@timvisher-dd timvisher-dd force-pushed the timvisher/fix/usage-update-computation branch from ac1745a to acdb28f Compare March 13, 2026 20:11