Add API response time monitoring per API client #7149

**Current situation**: OTP exposes performance metrics for the TransModel API (latency breakdown per layer in the routing engine).
While this is useful for investigating performance issues, it cannot be used for monitoring service level objectives (SLOs) per API consumer.

**Feature request**: a new Prometheus-compatible metric that monitors percentile response times per API client.
- The recording is performed at the HTTP request level and does not provide a breakdown per internal layer within OTP.
- API clients are identified by a (configurable) HTTP header, for example "x-client-name" or "et-client-name".
- To reduce the load on the metrics sub-system, metrics are exposed as (configurable) quantile buckets, to be aggregated on the Prometheus server side (Grafana, ...).
- To prevent cardinality explosion, only a (configurable) set of API clients is monitored individually. Unknown clients and requests that do not contain the client header are grouped under a common "other" category.
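
The behaviour described in the list above could be sketched roughly as follows. This is an illustration only, not OTP code: the class `ClientResponseTimeSketch`, its methods, and the bucket boundaries used in the example are all hypothetical, and exact, case-sensitive matching of client names is an assumption. It shows how a raw header value is mapped to a bounded set of tags, and how each request duration is recorded into cumulative buckets in the style of a Prometheus histogram.

```java
import java.util.Arrays;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLongArray;

/** Hypothetical sketch (not OTP code): response time buckets per API client. */
final class ClientResponseTimeSketch {

  /** Tag used for unknown clients and for requests without the client header. */
  static final String OTHER = "other";

  private final Set<String> monitoredClients;
  private final double[] upperBoundsSeconds; // sorted ascending, +Inf is implicit
  private final Map<String, AtomicLongArray> countsByClient = new ConcurrentHashMap<>();

  ClientResponseTimeSketch(Set<String> monitoredClients, double[] upperBoundsSeconds) {
    this.monitoredClients = Set.copyOf(monitoredClients);
    this.upperBoundsSeconds = upperBoundsSeconds.clone();
    Arrays.sort(this.upperBoundsSeconds);
  }

  /**
   * Maps the raw header value to a tag from a bounded set, so that the number of
   * time series cannot grow with arbitrary header values.
   * Assumption: names are matched exactly (case-sensitive) after trimming.
   */
  String resolveClientTag(String headerValue) {
    if (headerValue == null) {
      return OTHER;
    }
    String name = headerValue.trim();
    return monitoredClients.contains(name) ? name : OTHER;
  }

  /** Records the duration of one HTTP request, measured at the request level. */
  void record(String headerValue, double durationSeconds) {
    String tag = resolveClientTag(headerValue);
    AtomicLongArray counts = countsByClient.computeIfAbsent(
      tag, t -> new AtomicLongArray(upperBoundsSeconds.length + 1));
    // Buckets are cumulative: a request is counted in every bucket whose
    // upper bound is greater than or equal to its duration.
    for (int i = 0; i < upperBoundsSeconds.length; i++) {
      if (durationSeconds <= upperBoundsSeconds[i]) {
        counts.incrementAndGet(i);
      }
    }
    // The last slot plays the role of the +Inf bucket (total request count).
    counts.incrementAndGet(upperBoundsSeconds.length);
  }

  /** Returns the cumulative count for a client tag and bucket index (0 if never seen). */
  long cumulativeCount(String clientTag, int bucketIndex) {
    AtomicLongArray counts = countsByClient.get(clientTag);
    return counts == null ? 0 : counts.get(bucketIndex);
  }
}
```

With buckets exposed in this cumulative form, percentiles do not need to be computed inside OTP; they can be estimated on the server side, for example with the PromQL function `histogram_quantile`. In a real implementation the bucket bookkeeping would most likely be delegated to a metrics library rather than written by hand.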

