
fix: byok context window usage #4982

Open

abwuge wants to merge 4 commits into microsoft:main from abwuge:fix/byok-context-window-usage

Conversation


@abwuge abwuge commented Apr 5, 2026

fix: Pass real usage data from BYOK providers to context window widget

Problem

BYOK (Bring Your Own Key) models registered via the VS Code Language Model API always show an empty context window usage circle in the chat panel. This happens because the usage data from BYOK providers is silently discarded during the stream bridging process.

Root Cause

The data flow for BYOK models is:

BYOK Provider (e.g. OpenAI, Anthropic via custom key)
  → chatMLFetcher (gets real usage from API ✅)
    → CopilotLanguageModelWrapper._provideLanguageModelResponse() (discards usage ❌)
      → VS Code LM API stream (no usage part type available ❌)
        → ExtensionContributedChatEndpoint (hardcodes usage: {0,0,0} ❌)
          → toolCallingLoop → context window widget (shows empty ❌)

Specifically:

  1. _provideLanguageModelResponse() returned void, discarding the APIUsage from the fetcher result
  2. The VS Code LM API's LanguageModelResponsePart2 has no native usage part type
  3. ExtensionContributedChatEndpoint hardcoded usage: { prompt_tokens: 0, completion_tokens: 0, total_tokens: 0 } because it had no way to receive actual usage
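
The pre-fix shape can be sketched as below. This is a minimal stand-in, not the real source: the APIUsage field names come from the hardcoded object quoted above, while FetchResult and the function signature are illustrative only.

```typescript
// Minimal sketch of the pre-fix behavior. APIUsage's shape is taken from the
// hardcoded object in ExtensionContributedChatEndpoint; FetchResult and the
// function below are illustrative stand-ins.
interface APIUsage {
	prompt_tokens: number;
	completion_tokens: number;
	total_tokens: number;
}

interface FetchResult {
	text: string;
	usage?: APIUsage;
}

// Before the fix: the wrapper resolved to void, so result.usage was dropped here.
async function provideLanguageModelResponseBefore(result: FetchResult): Promise<void> {
	// ...stream text and tool-call parts to the VS Code LM API...
	// result.usage is never returned or reported
}

// The provider did have real usage, but it went nowhere:
void provideLanguageModelResponseBefore({
	text: 'hi',
	usage: { prompt_tokens: 120, completion_tokens: 30, total_tokens: 150 },
});

// Downstream, the endpoint had nothing to read, so it hardcoded zeros,
// which the context window widget renders as an empty circle:
const usage: APIUsage = { prompt_tokens: 0, completion_tokens: 0, total_tokens: 0 };
console.log(usage.total_tokens); // 0
```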

Solution

Use the existing LanguageModelDataPart pattern (already used for StatefulMarker, ContextManagement, ThinkingData, and PhaseData) to bridge usage data through the VS Code LM API stream.

Data flow after fix:

BYOK Provider
  → chatMLFetcher (gets real usage ✅)
    → CopilotLanguageModelWrapper._provideLanguageModelResponse() (returns APIUsage ✅)
      → provideLanguageModelResponse() reports LanguageModelDataPart('usage') ✅
        → VS Code LM API stream (carries usage as DataPart ✅)
          → ExtensionContributedChatEndpoint (extracts usage from DataPart ✅)
            → toolCallingLoop → context window widget (shows real usage ✅)
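
The producer side of this bridge might look like the sketch below. LanguageModelDataPart is stubbed locally (the real class comes from the VS Code API, whose exact factory methods are not shown in this PR), and the 'usage' string mirrors the new CustomDataPartMimeTypes.Usage constant.

```typescript
// Local stand-in for vscode's LanguageModelDataPart; the real type lives in the
// VS Code API and may expose different constructors/factories.
class LanguageModelDataPart {
	constructor(readonly data: Uint8Array, readonly mimeType: string) { }
	static json(value: unknown, mimeType: string): LanguageModelDataPart {
		return new LanguageModelDataPart(new TextEncoder().encode(JSON.stringify(value)), mimeType);
	}
}

const UsageMime = 'usage'; // mirrors CustomDataPartMimeTypes.Usage from this PR

interface APIUsage { prompt_tokens: number; completion_tokens: number; total_tokens: number }

// After _provideLanguageModelResponse resolves with the fetch result's usage,
// provideLanguageModelResponse reports it as one more part on the LM stream.
function reportUsage(report: (part: LanguageModelDataPart) => void, usage: APIUsage | undefined): void {
	if (usage) {
		report(LanguageModelDataPart.json(usage, UsageMime));
	}
}

// Example: capture what a consumer would see on the stream.
const parts: LanguageModelDataPart[] = [];
reportUsage(p => parts.push(p), { prompt_tokens: 120, completion_tokens: 30, total_tokens: 150 });
reportUsage(p => parts.push(p), undefined); // no usage from the fetch: nothing emitted
const decoded = JSON.parse(new TextDecoder().decode(parts[0].data));
console.log(parts[0].mimeType, decoded.total_tokens); // usage 150
```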

Changes

src/platform/endpoint/common/endpointTypes.ts

  • Added Usage = 'usage' to CustomDataPartMimeTypes namespace
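
A sketch of that addition is below. The neighboring member names follow the existing patterns this PR cites (StatefulMarker, ThinkingData, ...), but their exact string values are assumptions; only the Usage entry is the change described.

```typescript
// endpointTypes.ts sketch: pre-existing members are assumed names/values based
// on the patterns cited above; Usage is the actual addition described.
export namespace CustomDataPartMimeTypes {
	export const StatefulMarker = 'statefulMarker'; // pre-existing (value assumed)
	export const ThinkingData = 'thinkingData';     // pre-existing (value assumed)
	export const Usage = 'usage';                   // added by this PR
}
```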

src/extension/conversation/vscode-node/languageModelAccess.ts

  • Changed _provideLanguageModelResponse() return type from Promise<void> to Promise<APIUsage | undefined>
  • Added return result.usage at the end of the success path
  • In provideLanguageModelResponse(), captured the returned usage and reported it via LanguageModelDataPart with the custom usage mime type

src/platform/endpoint/vscode-node/extChatEndpoint.ts

  • Added isApiUsage import for runtime validation
  • Added reportedUsage variable to capture usage from the stream
  • Added handler for CustomDataPartMimeTypes.Usage DataPart with:
    • JSON parsing in try/catch (handles malformed data)
    • isApiUsage() runtime validation (rejects invalid field types)
    • Last-write-wins semantics for multiple DataParts
  • Changed hardcoded {0,0,0} to reportedUsage ?? {0,0,0} fallback
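
The handler described above can be sketched as pure logic (shapes assumed; the real code reads VS Code stream parts and imports isApiUsage rather than defining it inline):

```typescript
interface APIUsage { prompt_tokens: number; completion_tokens: number; total_tokens: number }

// Assumed shape of the isApiUsage guard: every field must be a number.
function isApiUsage(value: unknown): value is APIUsage {
	const v = value as Partial<APIUsage> | null;
	return !!v
		&& typeof v.prompt_tokens === 'number'
		&& typeof v.completion_tokens === 'number'
		&& typeof v.total_tokens === 'number';
}

interface DataPartLike { mimeType: string; data: Uint8Array }

// Last-write-wins over all 'usage' DataParts; malformed JSON and invalid
// shapes are skipped, and the zero object remains the fallback.
function extractUsage(parts: DataPartLike[]): APIUsage {
	let reportedUsage: APIUsage | undefined;
	for (const part of parts) {
		if (part.mimeType !== 'usage') { continue; }
		try {
			const parsed: unknown = JSON.parse(new TextDecoder().decode(part.data));
			if (isApiUsage(parsed)) {
				reportedUsage = parsed; // a later valid part overwrites an earlier one
			}
		} catch {
			// malformed JSON: ignore this part
		}
	}
	return reportedUsage ?? { prompt_tokens: 0, completion_tokens: 0, total_tokens: 0 };
}

const enc = (v: unknown) => new TextEncoder().encode(JSON.stringify(v));
console.log(extractUsage([
	{ mimeType: 'usage', data: enc({ prompt_tokens: 1, completion_tokens: 1, total_tokens: 2 }) },
	{ mimeType: 'usage', data: enc({ prompt_tokens: 5, completion_tokens: 5, total_tokens: 10 }) },
]).total_tokens); // 10 (last write wins)
```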

src/platform/endpoint/vscode-node/test/extChatEndpoint.spec.ts (NEW)

  • First test file for ExtensionContributedChatEndpoint
  • 6 test cases:
    1. Extract usage from valid Usage DataPart
    2. Fall back to zero when no Usage DataPart present
    3. Fall back to zero for malformed JSON data
    4. Reject usage with invalid field types (string instead of number)
    5. Extract usage when DataPart arrives before text chunks
    6. Report usage correctly when finishedCb is provided
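
The real spec file drives ExtensionContributedChatEndpoint through Vitest; as a self-contained illustration of cases 2 and 5 (zero fallback, and why arrival order doesn't matter), assuming a simplified stream shape and plain assertions:

```typescript
// The consumer folds the whole stream into (text, usage), so a usage DataPart
// arriving before the first text chunk is captured just the same. All shapes
// here are illustrative, not the real endpoint types.
type StreamPart =
	| { kind: 'text'; value: string }
	| { kind: 'data'; mimeType: string; json: string };

function consume(stream: StreamPart[]): { text: string; totalTokens: number } {
	let text = '';
	let totalTokens = 0; // zero fallback when no valid usage part appears
	for (const part of stream) {
		if (part.kind === 'text') {
			text += part.value;
		} else if (part.mimeType === 'usage') {
			try {
				const u = JSON.parse(part.json);
				if (typeof u?.total_tokens === 'number') { totalTokens = u.total_tokens; }
			} catch { /* ignore malformed */ }
		}
	}
	return { text, totalTokens };
}

// Case 5: usage DataPart arrives before any text, and is still picked up.
const early = consume([
	{ kind: 'data', mimeType: 'usage', json: '{"prompt_tokens":10,"completion_tokens":5,"total_tokens":15}' },
	{ kind: 'text', value: 'hello' },
]);
console.log(early.text, early.totalTokens); // hello 15
```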

Stats

  • 4 files changed, +205 lines, -4 lines
  • 6 new tests, all passing

Testing

  • TypeScript compilation: 0 errors
  • Vitest: 6/6 tests passing
  • Local dev mode testing: context window usage circle shows real token counts for BYOK models

abwuge added 3 commits April 5, 2026 14:07

  • BYOK models registered via the VS Code LM API had their usage data discarded, causing the context window usage circle to always show empty. This fix uses LanguageModelDataPart with a custom 'usage' mime type to bridge usage data from the BYOK provider (CopilotLanguageModelWrapper) through the VS Code LM API stream to ExtensionContributedChatEndpoint.
  • Prevents malformed or invalid usage data (e.g. string fields instead of numbers) from being accepted. Adds a test for invalid field types.
  • Add clarifying comment for multiple Usage DataPart semantics.
  • Add tests: Usage DataPart arriving before text, and the finishedCb-provided scenario.
Copilot AI review requested due to automatic review settings April 5, 2026 06:34

Copilot AI left a comment


Pull request overview

Fixes BYOK (Bring Your Own Key) token usage reporting in the chat panel by bridging real APIUsage through the VS Code Language Model API stream using a custom LanguageModelDataPart, allowing the context window widget to display accurate usage.

Changes:

  • Add a new custom data-part mime type (CustomDataPartMimeTypes.Usage) for usage payloads.
  • Update CopilotLanguageModelWrapper to return fetch APIUsage and emit it as a LanguageModelDataPart('usage').
  • Update ExtensionContributedChatEndpoint to extract/validate usage from the stream and include it in the ChatResponse, plus add unit tests covering parsing/fallback behavior.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.

  • src/platform/endpoint/common/endpointTypes.ts: Adds the CustomDataPartMimeTypes.Usage constant to identify usage DataParts.
  • src/extension/conversation/vscode-node/languageModelAccess.ts: Returns APIUsage from the wrapper request path and emits it via LanguageModelDataPart on the VS Code LM stream.
  • src/platform/endpoint/vscode-node/extChatEndpoint.ts: Captures usage DataParts from the contributed LM stream and uses them in the returned ChatResponse (with fallback).
  • src/platform/endpoint/vscode-node/test/extChatEndpoint.spec.ts: Adds Vitest coverage ensuring usage extraction, validation, and fallback behavior work across stream orderings.
Comments suppressed due to low confidence (1)

src/extension/conversation/vscode-node/languageModelAccess.ts:32

  • APIUsage is only used as a type here. Please switch this to a type-only import (e.g., import type { APIUsage } ...) to match the surrounding import style and avoid accidentally introducing a runtime dependency if TS compiler settings change.
import { ILogService } from '../../../platform/log/common/logService';
import { isAnthropicToolSearchEnabled } from '../../../platform/networking/common/anthropic';
import { FinishedCallback, OpenAiFunctionTool, OptionalChatRequestParams } from '../../../platform/networking/common/fetch';
import { APIUsage } from '../../../platform/networking/common/openai';
import { IChatEndpoint, IEndpoint } from '../../../platform/networking/common/networking';
import { IOTelService, type OTelModelOptions } from '../../../platform/otel/common/otelService';
import { retrieveCapturingTokenByCorrelation, runWithCapturingToken } from '../../../platform/requestLogger/node/requestLogger';
import { IExperimentationService } from '../../../platform/telemetry/common/nullExperimentationService';

@abwuge abwuge changed the title Fix/byok context window usage fix: byok context window usage Apr 5, 2026
