
Conversation


@phact phact commented Jan 14, 2026

Token Usage Tracking Implementation

  1. Added Usage model to Properties schema (src/lfx/src/lfx/schema/properties.py and src/backend/base/langflow/schema/properties.py):
    - New Usage class with input_tokens, output_tokens, total_tokens fields
    - Added usage field to Properties class
  2. Added usage extraction in LCModelComponent (src/lfx/src/lfx/base/models/model.py):
    - New extract_usage() method to extract token usage from AIMessage response_metadata
    - Supports OpenAI format (token_usage.prompt_tokens/completion_tokens/total_tokens)
    - Supports Anthropic format (usage.input_tokens/output_tokens)
    - Modified _get_chat_result() to set usage on returned Message for non-streaming
  3. Added streaming usage support (src/lfx/src/lfx/custom/custom_component/component.py):
    - Modified _stream_message() to return usage data from final chunk
    - Modified send_message() to set usage on message properties after streaming
  4. Updated OpenAI Responses API endpoint (src/backend/base/langflow/api/v1/openai_responses.py):
    - Non-streaming: Extract usage from component_output.results Message properties
    - Streaming: Capture usage from add_message event with state=complete
    - Added response.completed event with usage in streaming responses
  5. Fixed integration tests (src/backend/tests/integration/test_openai_responses_integration.py):
    - Added programmatic model configuration for LanguageModelComponent
    - Set API key directly in template (workaround for global variable lookup issue in tests)
    - Added usage validation assertions

Summary by CodeRabbit

Release Notes

  • New Features
    • Added token usage metrics to API responses, displaying input tokens, output tokens, and total tokens consumed for each request
    • Token usage is now captured and available in both streaming and non-streaming response formats
    • Usage data is extracted from language model responses and included in all API response payloads



coderabbitai bot commented Jan 14, 2026

Review skipped

Auto incremental reviews are disabled on this repository. To trigger a single review, invoke the @coderabbitai review command.

Walkthrough

This pull request introduces per-response usage metrics (input_tokens, output_tokens, total_tokens) throughout the codebase. A new Usage model is added to schema definitions. Usage data is captured from OpenAI-like responses in both streaming and non-streaming paths and propagated through Message properties and API responses.

Changes

• Schema definitions (src/backend/base/langflow/schema/properties.py, src/lfx/src/lfx/schema/properties.py):
  New public Usage model with optional fields input_tokens, output_tokens, and total_tokens. New optional usage field added to the Properties model in both the backend and lfx packages.
• API response handling (src/backend/base/langflow/api/v1/openai_responses.py):
  Usage data extraction and propagation added to both the streaming and non-streaming paths. The streaming path captures usage from completed messages and emits a response.completed event; the non-streaming path extracts usage from result outputs. The public OpenAIResponsesResponse signature is updated to accept a usage field.
• LLM component integration (src/lfx/src/lfx/base/models/model.py):
  New public method extract_usage(message: AIMessage) parses token usage from OpenAI and Anthropic response metadata. Usage data is populated in the _get_chat_result flow and attached to Message properties.
• Custom component streaming (src/lfx/src/lfx/custom/custom_component/component.py):
  Stream-handling return types updated to tuple[str, dict | None] for _stream_message and _handle_async_iterator. Usage metadata is extracted across streaming chunks, with a fallback to response_metadata, and the captured usage is propagated to stored_message properties.
• Integration tests (src/backend/tests/integration/test_openai_responses_integration.py):
  Programmatic configuration of LanguageModelComponent added. The non-streaming test now validates the usage object and token fields; the streaming test validates the response.completed event containing usage data.

Sequence Diagram

sequenceDiagram
    participant Client
    participant OpenAIAPI as OpenAI API
    participant Handler as OpenAI Response Handler
    participant Schema as Message Properties
    participant Response as OpenAI Response Object

    Client->>OpenAIAPI: Send prompt (streaming or non-streaming)
    OpenAIAPI->>Handler: Return response with usage metadata
    
    alt Streaming Path
        loop For each chunk
            OpenAIAPI-->>Handler: Stream chunk with usage_metadata
            Handler->>Schema: Extract usage (input, output, total tokens)
            Handler->>Handler: Accumulate usage_data
        end
        Handler->>Handler: Message reaches "complete" state
        Handler->>Response: Emit completion event with usage_data
        Response-->>Client: Send response.completed with usage
    else Non-Streaming Path
        Handler->>Schema: Extract usage from result outputs
        Handler->>Handler: Construct usage_data dict
        Handler->>Response: Populate OpenAIResponsesResponse with usage field
        Response-->>Client: Send response with usage_data
    end
    
    Response-->>Client: Final response includes usage metrics

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

🚥 Pre-merge checks | ✅ 5 | ❌ 2

❌ Failed checks (1 error, 1 warning)
  • Test Coverage For New Implementations — ❌ Error: the PR adds new Usage model classes and the extract_usage() method but lacks unit tests for critical implementations, including Usage model validation, extract_usage() parsing, the modified streaming methods, and the usage extraction logic. Resolution: add unit tests for Usage model schema validation, extract_usage() parsing, streaming method return types, and the openai_responses.py usage extraction logic to complement the existing integration tests.
  • Test Quality And Coverage — ⚠️ Warning: test coverage for token usage tracking is incomplete; there are no unit tests for the Usage schema, extract_usage(), or the streaming usage modifications, despite the added integration tests. Resolution: add unit tests for the Usage model, extract_usage(), and streaming usage; expand the integration tests to cover error scenarios, edge cases, and async error handling.

✅ Passed checks (5 passed)
  • Description Check — ✅ Check skipped; CodeRabbit's high-level summary is enabled.
  • Docstring Coverage — ✅ Docstring coverage is 80.00%, which meets the required 80.00% threshold.
  • Test File Naming And Structure — ✅ The test file follows all specified patterns: proper naming convention, pytest structure with decorators, descriptive test function names, logical organization with helper functions, and coverage of positive/negative scenarios and edge cases.
  • Excessive Mock Usage Warning — ✅ Integration tests use real component configuration and API interactions without evidence of excessive mock usage.
  • Title check — ✅ The title 'feat: token usage tracking in responses api' clearly and concisely describes the main change, is specific, and accurately summarizes the primary objective of the changeset.




codecov bot commented Jan 14, 2026

Codecov Report

❌ Patch coverage is 27.27273% with 56 lines in your changes missing coverage. Please review.
✅ Project coverage is 32.67%. Comparing base (0e03376) to head (0a776e7).
⚠️ Report is 5 commits behind head on main.

Files with missing lines Patch % Lines
...c/lfx/src/lfx/custom/custom_component/component.py 16.66% 22 Missing and 3 partials ⚠️
src/lfx/src/lfx/base/models/model.py 9.09% 20 Missing ⚠️
...c/backend/base/langflow/api/v1/openai_responses.py 26.66% 11 Missing ⚠️
Additional details and impacted files


@@            Coverage Diff             @@
##             main   #11302      +/-   ##
==========================================
- Coverage   34.25%   32.67%   -1.58%     
==========================================
  Files        1409     1409              
  Lines       66892    66969      +77     
  Branches     9860     9881      +21     
==========================================
- Hits        22912    21884    -1028     
- Misses      42787    43889    +1102     
- Partials     1193     1196       +3     
Flag Coverage Δ
backend 47.73% <45.00%> (-5.87%) ⬇️
frontend 15.99% <ø> (-0.01%) ⬇️
lfx 40.75% <21.05%> (-0.06%) ⬇️

Flags with carried forward coverage won't be shown.

Files with missing lines Coverage Δ
src/backend/base/langflow/schema/properties.py 91.17% <100.00%> (+1.52%) ⬆️
src/lfx/src/lfx/schema/properties.py 82.35% <100.00%> (+3.04%) ⬆️
...c/backend/base/langflow/api/v1/openai_responses.py 41.39% <26.66%> (+28.72%) ⬆️
src/lfx/src/lfx/base/models/model.py 20.00% <9.09%> (-1.31%) ⬇️
...c/lfx/src/lfx/custom/custom_component/component.py 57.81% <16.66%> (-1.38%) ⬇️

... and 78 files with indirect coverage changes



@edwinjosechittilappilly edwinjosechittilappilly left a comment


Tested in LF. LGTM

@github-actions github-actions bot added the lgtm This PR has been approved by a maintainer label Jan 16, 2026
@edwinjosechittilappilly edwinjosechittilappilly changed the title Feat: token usage tracking in responses api feat: token usage tracking in responses api Jan 16, 2026
@github-actions github-actions bot added enhancement New feature or request and removed enhancement New feature or request labels Jan 16, 2026
@github-actions

Frontend Unit Test Coverage Report

Coverage Summary

Lines: 17% · Statements: 17.44% (4979/28541) · Branches: 10.74% (2363/21992) · Functions: 11.55% (722/6249)

Unit Test Results

1999 tests · 0 skipped 💤 · 0 failures ❌ · 0 errors 🔥 · 27.155s ⏱️


Labels

enhancement New feature or request lgtm This PR has been approved by a maintainer
