Description
- Package Name: azure-ai-inference
- Package Version: 1.0.0b9
- Operating System: Linux
- Python Version: 3.11
Describe the bug
Token count attributes are missing in the AI trace for streaming chat completion calls.
The token count attributes do appear if I change the same LLM call to non-streaming.
After consulting Jarno/Marko from the SDK team, their finding is that in the streaming case the token count is not provided by the OpenAI-based models they tested; a non-OpenAI model, by contrast, did report the token count.
To Reproduce
Steps to reproduce the behavior:
- Enable tracing
- Issue a streaming chat completion call to an OpenAI model (e.g. gpt-4o-mini)
- Check the Application Insights logs
- Observe that token count attributes are missing
Expected behavior
Token count attributes should be present.