Description
- Package Name: azure-ai-inference
- Package Version: 1.0.0b9
- Operating System: Linux
- Python Version: 3.11
Describe the bug
Token count attributes are missing in the AI trace for streaming chat completion calls.
The token count attributes do appear if I change the same LLM call to non-streaming.
After consulting Jarno/Marko from the SDK team, their finding is that in the streaming case the token count is not provided by the OpenAI-based models they tested; a non-OpenAI model, by contrast, did report the token count.
To Reproduce
Steps to reproduce the behavior:
- Enable tracing
- Issue a streaming chat completion call to an OpenAI model (e.g. gpt-4o-mini)
- Check the Application Insights logs
- Observe that token count attributes are missing
Expected behavior
Token count attributes should be present.