Add first_time_to_token attribute to LlmChatCompletionSummary #3581

@amychisholm03

Description

“Time to first token” is an important, required metric for measuring model performance. We need to add this metric to AIM via LlmChatCompletionSummary.first_time_to_token.

  • This metric should only be captured for streaming requests
  • The metric name should be captured as LlmChatCompletionSummary.time_to_first_token
  • The metric value should be stored as the number of milliseconds between when the request is issued and when the first token is received
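
The measurement described in the bullets above could be sketched roughly as follows. This is only an illustration, assuming the streaming response is exposed as an async iterable. The helper name `trackTimeToFirstToken` and its callback parameter are hypothetical and not part of any existing agent API.

```javascript
// Hypothetical helper for illustration; not part of any existing agent API.
const { performance } = require('node:perf_hooks')

/**
 * Wraps an async-iterable stream and reports the elapsed milliseconds
 * between `requestStart` and the arrival of the first chunk.
 *
 * @param {AsyncIterable<any>} stream streaming LLM response (assumed shape)
 * @param {number} requestStart performance.now() value taken when the request was issued
 * @param {(ms: number) => void} onFirstToken invoked once, with the elapsed milliseconds
 */
async function* trackTimeToFirstToken(stream, requestStart, onFirstToken) {
  let seenFirst = false
  for await (const chunk of stream) {
    if (!seenFirst) {
      seenFirst = true
      onFirstToken(performance.now() - requestStart)
    }
    // Pass every chunk through unchanged so the caller's stream handling is unaffected.
    yield chunk
  }
}
```

The callback fires at most once per stream. If the stream ends or throws before yielding anything, it never fires. That leaves the attribute unset, which is one possible answer to the open default-value question below.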

Pending Questions to Clarify

  • Is this for all LLMs or just a select few?
  • What is the default/error value? 0, undefined?

Acceptance Criteria

  • first_time_to_token is added to the LlmChatCompletionSummary class via this['response.first_time_to_token']
  • This attribute can be seen in the UI
  • Appropriate tests are created
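
As a rough sketch of the first criterion, the attribute could be attached as shown below. `LlmChatCompletionSummarySketch` is a simplified, hypothetical stand-in: the real LlmChatCompletionSummary class has other fields and a different constructor, and the option names `isStreaming`, `requestStart`, and `firstTokenTime` are assumptions. Omitting the attribute when no value is available is also an assumption, since the default/error value is still an open question.

```javascript
// Simplified, hypothetical stand-in; the real LlmChatCompletionSummary differs.
class LlmChatCompletionSummarySketch {
  /**
   * @param {object} [opts]
   * @param {boolean} [opts.isStreaming] whether the request was a streaming request
   * @param {number} [opts.requestStart] timestamp (ms) when the request was issued
   * @param {number} [opts.firstTokenTime] timestamp (ms) when the first token arrived
   */
  constructor({ isStreaming = false, requestStart, firstTokenTime } = {}) {
    const hasTimings =
      typeof requestStart === 'number' && typeof firstTokenTime === 'number'

    // Only streaming requests carry the attribute. It is omitted otherwise
    // (an assumption: the default/error value is not yet decided).
    if (isStreaming && hasTimings) {
      this['response.first_time_to_token'] = firstTokenTime - requestStart
    }
  }
}
```

Tests for this could cover at least three cases: a streaming request with both timestamps, a non-streaming request, and a streaming request that never received a token.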

Additional context

Jira link

Metadata

Assignees: No one assigned
Type: No type
Project status: Reviewed
Relationships: None yet
Development: No branches or pull requests