Commit ff40a95

Ngô Quang Hòa and claude committed
feat(observability): histogram buckets for ttft/tpot so they render as histograms

`smg_router_ttft_seconds` and `smg_router_tpot_seconds` are recorded via `histogram!()`, but the Prometheus exporter only renders a metric with `_bucket` series when buckets are registered for it. Buckets were registered only for the `duration_seconds` suffix (and the canary), so TTFT and TPOT fell back to summaries (quantile lines) — not heatmap-able in Grafana.

Register the existing request-latency buckets for the `ttft_seconds` and `tpot_seconds` suffixes too. They span 0.001-7200s, fine for both sub-second-to-seconds TTFT and tens-of-ms per-request mean TPOT.

No new buckets config, no behavior change to recording sites.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

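As a rough illustration of what the `_bucket` series carry, the sketch below counts observations into cumulative `le` buckets, which is the shape Prometheus expects for a histogram. The bucket bounds, the sample values, and the metric name `example_seconds` are all made up for the example; only the 0.001 and 7200 endpoints come from the commit message, and the real `duration_bucket` values are not shown in this commit.

```rust
/// Count observations into cumulative `le` buckets.
/// Returns one count per finite bound plus a final count for `le="+Inf"`.
fn cumulative_counts(bounds: &[f64], observations: &[f64]) -> Vec<u64> {
    let mut counts = vec![0u64; bounds.len() + 1];
    for &value in observations {
        for (i, &bound) in bounds.iter().enumerate() {
            // Cumulative: a value is counted in every bucket whose bound it does not exceed.
            if value <= bound {
                counts[i] += 1;
            }
        }
        // Every observation falls into the +Inf bucket.
        counts[bounds.len()] += 1;
    }
    counts
}

fn main() {
    // Hypothetical bounds in seconds; not the gateway's actual `duration_bucket`.
    let bounds = [0.001, 0.01, 0.05, 0.1, 1.0, 10.0, 7200.0];
    // Hypothetical samples: three TPOT-like values (tens of ms), two TTFT-like values.
    let observations = [0.02, 0.03, 0.04, 0.4, 2.5];

    let counts = cumulative_counts(&bounds, &observations);
    for (bound, count) in bounds.iter().zip(counts.iter()) {
        println!("example_seconds_bucket{{le=\"{}\"}} {}", bound, count);
    }
    println!("example_seconds_bucket{{le=\"+Inf\"}} {}", counts[bounds.len()]);
}
```

With bounds like these, values in the tens of milliseconds and values in the seconds range land in different buckets, which is what lets a heatmap panel separate them.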
1 parent 5193c12 commit ff40a95

1 file changed

Lines changed: 12 additions & 0 deletions

model_gateway/src/observability/metrics.rs

@@ -358,10 +358,22 @@ pub fn start_prometheus(config: PrometheusConfig) -> PrometheusHandle {
     // render it as a summary.
     let canary_matcher = Matcher::Full(super::runtime_metrics::EVENT_LOOP_DELAY_SECONDS.into());
 
+    // TTFT and TPOT (per-request mean inter-token latency) end in `_seconds`
+    // but NOT `duration_seconds`, so without explicit buckets the recorder
+    // renders them as summaries (quantile lines only) — not heatmap-able. Reuse
+    // the request-latency buckets: they span 0.001-7200s, fine for both the
+    // sub-second-to-seconds TTFT and the tens-of-ms TPOT.
+    let ttft_matcher = Matcher::Suffix(String::from("ttft_seconds"));
+    let tpot_matcher = Matcher::Suffix(String::from("tpot_seconds"));
+
     PrometheusBuilder::new()
         .upkeep_timeout(Duration::from_secs(UPKEEP_INTERVAL_SECS))
         .set_buckets_for_metric(duration_matcher, &duration_bucket)
         .expect("failed to set duration bucket")
+        .set_buckets_for_metric(ttft_matcher, &duration_bucket)
+        .expect("failed to set ttft bucket")
+        .set_buckets_for_metric(tpot_matcher, &duration_bucket)
+        .expect("failed to set tpot bucket")
         .set_buckets_for_metric(
             canary_matcher,
             super::runtime_metrics::EVENT_LOOP_DELAY_BUCKETS,
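
The suffix logic in this hunk can be modelled without the exporter crate. The following is a minimal sketch, not the exporter's implementation: `Matcher` here is a local stand-in that only mirrors the `Full` and `Suffix` variants used above, and `Rendering` and `rendering_for` are hypothetical names introduced for illustration. It assumes, as the commit message describes, that a metric with no matching bucket registration is rendered as a summary.

```rust
/// Local stand-in for the exporter's matcher type (illustrative only).
#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq)]
enum Matcher {
    Full(String),
    Suffix(String),
}

impl Matcher {
    /// True when this matcher selects the given metric name.
    fn matches(&self, name: &str) -> bool {
        match self {
            Matcher::Full(full) => name == full,
            Matcher::Suffix(suffix) => name.ends_with(suffix.as_str()),
        }
    }
}

/// How a metric recorded as a histogram would be exposed (hypothetical type).
#[derive(Debug, PartialEq)]
enum Rendering {
    Histogram, // `_bucket` series, usable for heatmaps
    Summary,   // quantile lines only
}

/// Assumption: buckets registered for a matching name yield a histogram,
/// otherwise the metric falls back to a summary.
fn rendering_for(name: &str, matchers: &[Matcher]) -> Rendering {
    if matchers.iter().any(|m| m.matches(name)) {
        Rendering::Histogram
    } else {
        Rendering::Summary
    }
}

fn main() {
    // Registration before the commit: only the `duration_seconds` suffix.
    let before = vec![Matcher::Suffix(String::from("duration_seconds"))];
    // Registration after the commit: the two new suffixes are added.
    let mut after = before.clone();
    after.push(Matcher::Suffix(String::from("ttft_seconds")));
    after.push(Matcher::Suffix(String::from("tpot_seconds")));

    for name in ["smg_router_ttft_seconds", "smg_router_tpot_seconds"].iter() {
        println!(
            "{}: before={:?} after={:?}",
            name,
            rendering_for(name, &before),
            rendering_for(name, &after)
        );
    }
}
```

Both metric names end in `_seconds` but not in `duration_seconds`, so in this model they only switch to `Histogram` once their own suffixes are registered.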
