feat: add Prometheus & Grafana Monitoring for LLMs Using OpenTelemetry #414

@akotyla

Feature description

Integrate Prometheus and Grafana with our OpenTelemetry audit to track key performance and usage metrics. This will let us collect, visualize, and analyze model behavior in real time, improving observability and making debugging easier. The following metrics should be captured: prompt throughput, token throughput, time to first token, and input tokens.
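To make the four requested metrics concrete, here is a minimal standard-library sketch of how they could be derived from per-request timing data and rendered in the Prometheus text exposition format. It does not use the OpenTelemetry SDK or any exporter. The `RequestRecord` shape, the metric names, and the choice to count only output tokens toward token throughput are all assumptions made for illustration, not part of the existing audit code.

```python
from dataclasses import dataclass
from typing import Dict, Iterable


@dataclass(frozen=True)
class RequestRecord:
    """One completed LLM request (hypothetical record shape)."""
    start_s: float        # time the prompt was received, in seconds
    first_token_s: float  # time the first output token was emitted
    end_s: float          # time the response finished
    input_tokens: int
    output_tokens: int


def summarize(records: Iterable[RequestRecord]) -> Dict[str, float]:
    """Aggregate records into the four metrics named in this issue."""
    recs = list(records)
    if not recs:
        return {}
    # Observation window: first request start to last request end.
    window_s = max(r.end_s for r in recs) - min(r.start_s for r in recs)
    if window_s <= 0:
        raise ValueError("observation window must be positive")
    ttft = [r.first_token_s - r.start_s for r in recs]
    return {
        # Prompt throughput: completed requests per second.
        "llm_prompts_per_second": len(recs) / window_s,
        # Token throughput: assumed here to mean output tokens per second.
        "llm_output_tokens_per_second":
            sum(r.output_tokens for r in recs) / window_s,
        # Time to first token, averaged over the window.
        "llm_time_to_first_token_seconds_avg": sum(ttft) / len(ttft),
        # Input tokens consumed during the window.
        "llm_input_tokens_in_window": float(sum(r.input_tokens for r in recs)),
    }


def to_prometheus_text(metrics: Dict[str, float]) -> str:
    """Render metrics as gauges in the Prometheus text exposition format."""
    lines = []
    for name, value in sorted(metrics.items()):
        lines.append(f"# TYPE {name} gauge")
        lines.append(f"{name} {value}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    sample = [
        RequestRecord(0.0, 0.5, 2.0, input_tokens=10, output_tokens=40),
        RequestRecord(1.0, 1.25, 4.0, input_tokens=20, output_tokens=60),
    ]
    print(to_prometheus_text(summarize(sample)), end="")
```

In a real integration these values would more likely be recorded as OpenTelemetry counters and histograms and exported to Prometheus, with rates computed at query time in Grafana; the windowed gauges above are only meant to pin down what each metric measures.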

Motivation

Adding OpenTelemetry-based metrics collection and integrating it with Prometheus and Grafana lets us monitor request latency, token usage, and system throughput, and set up alerts on anomalies to improve system reliability.

Additional context

No response

Metadata

Labels

feature: New feature or request

Projects

Status

In review
