feat: add per-model-request token usage tracking for LLM cost visibility #13726

Open
Kingsuperyzy wants to merge 19 commits into infiniflow:main from Kingsuperyzy:main

Conversation

@Kingsuperyzy (Contributor) commented Mar 20, 2026

What problem does this PR solve?

This PR adds token usage tracking for each model request in RAGFlow. It intercepts model calls at the middleware layer and persists token consumption data to the database, giving request-level visibility into LLM usage costs.
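
The interception idea described above can be illustrated with a minimal sketch. It is not the PR's implementation: `UsageStore`, `track_usage`, the `token_usage` table, and `fake_chat` are hypothetical names, the database is an in-memory SQLite instance, and the "model" simply counts whitespace-separated words as tokens.

```python
import functools
import sqlite3
from dataclasses import dataclass


@dataclass
class UsageRecord:
    """One row of token usage for a single model request (hypothetical schema)."""
    model: str
    prompt_tokens: int
    completion_tokens: int

    @property
    def total_tokens(self) -> int:
        return self.prompt_tokens + self.completion_tokens


class UsageStore:
    """Persists usage records; the table layout is an assumption for illustration."""

    def __init__(self, conn: sqlite3.Connection) -> None:
        self.conn = conn
        conn.execute(
            "CREATE TABLE IF NOT EXISTS token_usage ("
            "model TEXT, prompt_tokens INTEGER, "
            "completion_tokens INTEGER, total_tokens INTEGER)"
        )

    def save(self, record: UsageRecord) -> None:
        self.conn.execute(
            "INSERT INTO token_usage VALUES (?, ?, ?, ?)",
            (record.model, record.prompt_tokens,
             record.completion_tokens, record.total_tokens),
        )
        self.conn.commit()

    def total_for(self, model: str) -> int:
        row = self.conn.execute(
            "SELECT COALESCE(SUM(total_tokens), 0) FROM token_usage WHERE model = ?",
            (model,),
        ).fetchone()
        return row[0]


def track_usage(store: UsageStore, model: str):
    """Wrap a model call that returns (text, prompt_tokens, completion_tokens)."""
    def decorator(call):
        @functools.wraps(call)
        def wrapper(*args, **kwargs):
            text, prompt_tokens, completion_tokens = call(*args, **kwargs)
            # Persist usage before handing the answer back to the caller.
            store.save(UsageRecord(model, prompt_tokens, completion_tokens))
            return text
        return wrapper
    return decorator


store = UsageStore(sqlite3.connect(":memory:"))


@track_usage(store, "demo-model")
def fake_chat(prompt: str):
    # Stand-in for a real model call; tokens are approximated by word count.
    answer = "ok " + prompt
    return answer, len(prompt.split()), len(answer.split())


reply = fake_chat("hello world")
```

Wrapping the call site keeps the usage bookkeeping out of the business logic: callers still receive only the model's answer, while each request leaves one row behind.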

Type of change

  • New Feature (non-breaking change which adds functionality)

@dosubot dosubot bot added size:XL This PR changes 500-999 lines, ignoring generated files. 🌈 python Pull requests that update Python code 💞 feature Feature request, pull request that fullfill a new feature. labels Mar 20, 2026
@dosubot dosubot bot added size:L This PR changes 100-499 lines, ignoring generated files. and removed size:XL This PR changes 500-999 lines, ignoring generated files. labels Mar 20, 2026
@Kingsuperyzy Kingsuperyzy marked this pull request as draft March 20, 2026 09:28
@Kingsuperyzy Kingsuperyzy marked this pull request as ready for review March 20, 2026 09:30
@yingfeng yingfeng requested a review from Lynn-Inf March 24, 2026 15:47
@yingfeng yingfeng added the ci Continue Integration label Mar 24, 2026
@yingfeng yingfeng marked this pull request as draft March 24, 2026 15:47
@yingfeng yingfeng marked this pull request as ready for review March 24, 2026 15:47
@codecov codecov bot commented Mar 24, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 96.52%. Comparing base (384fa6f) to head (f82dd96).

Additional details and impacted files
@@            Coverage Diff             @@
##             main   #13726      +/-   ##
==========================================
- Coverage   98.11%   96.52%   -1.60%     
==========================================
  Files          10       10              
  Lines         690      690              
  Branches      108      108              
==========================================
- Hits          677      666      -11     
- Misses          4        8       +4     
- Partials        9       16       +7     


@Lynn-Inf (Contributor) left a comment

I think it would be better to pass the biz_type and biz_id parameters when instantiating LLMBundle. Also, please make sure to use English for comments.

@dosubot dosubot bot removed the size:L This PR changes 100-499 lines, ignoring generated files. label Apr 2, 2026
@dosubot dosubot bot added the size:XL This PR changes 500-999 lines, ignoring generated files. label Apr 2, 2026
@Kingsuperyzy (Contributor, Author) commented Apr 2, 2026

@Lynn-Inf
I’ve made the following updates based on your suggestions.

  1. Pass biz_type, biz_id, and session_id parameters when instantiating LLMBundle across all production code paths.
  2. Ensure all comments in modified files are written in English.
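
As a rough illustration of the first point, the sketch below shows why attaching the business context at construction time is convenient: every usage row recorded through the object carries the same attribution without each call site repeating it. `TrackedBundle` is a hypothetical stand-in and does not reflect the real `LLMBundle` signature; only the parameter names `biz_type`, `biz_id`, and `session_id` come from this thread, and the example values are made up.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class TrackedBundle:
    """Hypothetical stand-in for a model wrapper; not RAGFlow's LLMBundle."""
    model: str
    biz_type: str                      # kind of business object, e.g. "chat"
    biz_id: str                        # identifier of that business object
    session_id: Optional[str] = None   # optional conversation/session scope
    usage_log: List[Dict[str, object]] = field(default_factory=list)

    def record(self, total_tokens: int) -> Dict[str, object]:
        """Store one usage row tagged with the context given at construction."""
        row = {
            "model": self.model,
            "biz_type": self.biz_type,
            "biz_id": self.biz_id,
            "session_id": self.session_id,
            "total_tokens": total_tokens,
        }
        self.usage_log.append(row)
        return row


# The context is supplied once, when the wrapper is created.
bundle = TrackedBundle(
    model="demo-model", biz_type="chat", biz_id="dialog-1", session_id="s-42"
)
row = bundle.record(128)
```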

Labels

ci Continue Integration 💞 feature Feature request, pull request that fullfill a new feature. 🌈 python Pull requests that update Python code size:XL This PR changes 500-999 lines, ignoring generated files.

4 participants