-
Notifications
You must be signed in to change notification settings - Fork 172
fix: litellm async calls with stream reports token usage and span status #2480
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
fix: litellm async calls with stream reports token usage and span status #2480
Conversation
|
CLA Assistant Lite bot All contributors have signed the CLA ✍️ ✅ |
|
I have read the CLA Document and I hereby sign the CLA |
|
@codefromthecrypt could I get some feedback for theses changes 😄 ? |
|
I don't have workflow approval, but I think the main step is to add unit test, or another row in the unit test, for this case. Main thing is you need to verify the changes are needed, esp the status thing. so you can look at litellm's existing there is some code in litellm for async, but basically always the openai code is in the best shape, as that's the most used. you can steal patterns from this python/instrumentation/openinference-instrumentation-openai/tests/openinference/instrumentation/openai/test_instrumentor.py then you need to run tests the same way they run in CI |
|
ps one tip is that to run the tests, enter the python directory and do |
8d4c769 to
78956e8
Compare
|
Thanks for the feedback @codefromthecrypt. I added tests 😄 and validated that are passing using |
|
@caroger would it be possible to get some feedback on these changes. Thanks in advance 😄 |
While testing locally Arize, I noticed that when using the
acompletionfunction withstream=True, stream_options={"include_usage": True}options, the reported traces did not included the count token, and the status was undefined.I chekced the code and noticed that the token count issue, was that we pass the
usage_statsobject to the_set_token_counts_from_usagedirectly. When the function checks if the object has theusageattribute, it return false and exit. Causing the span to not report token count.I modified the code so the object includes the
usageattribute, and called the_set_span_statusfunction.After those changes the token count and the status is reported:
Note
Ensure
acompletion/completionstreaming reports token counts and sets span status, and add tests covering async streaming and usage reporting.usagewithSimpleNamespace(usage=...)before calling_set_token_counts_from_usagein_finalize_sync_streaming_spanand_finalize_streaming_span._set_span_status(span, aggregated_output)after async streaming completes.acompletionvalidating output value, status OK, and token counts whenstream_options={"include_usage": True}.Written by Cursor Bugbot for commit 78956e8. This will update automatically on new commits. Configure here.