[processor/spanpruning] Add histogram support for latency distribution on aggregated summary spans (#47604)
#### Description
The span pruning processor currently emits min/max/avg/total duration
statistics on summary spans, but these aggregates hide the shape of the
latency distribution. A flat average can mask bimodal distributions and
tail latency problems that are critical for debugging.
This PR adds configurable cumulative histogram buckets to summary spans,
giving users latency distribution visibility without requiring a
separate metrics pipeline. Bucket boundaries default to the
OpenTelemetry SDK explicit bucket histogram boundaries (`[5ms, 10ms,
25ms, 50ms, 100ms, 250ms, 500ms, 1s, 2.5s, 5s, 10s]`) and can be
customized or disabled entirely by setting
`aggregation_histogram_buckets` to an empty list.
Summary spans gain two new slice attributes:
- `<prefix>histogram_bucket_bounds_s` -- upper bounds in seconds
(excludes +Inf)
- `<prefix>histogram_bucket_counts` -- cumulative counts per bucket
(includes +Inf)
Cumulative semantics were chosen to match Prometheus/OpenTelemetry
histogram conventions, where each bucket count includes all observations
at or below that boundary.
#### Testing
- Unit tests for cumulative bucket counting logic and the
disabled-histogram path (`stats_test.go`)
- Config validation tests covering valid buckets, negative values,
unsorted values, and empty list (`config_test.go`)
- Integration tests exercising the full processor pipeline with
histogram attributes enabled and disabled, asserting exact bucket bounds
and counts on output summary spans (`processor_test.go`)
- YAML unmarshal test verifying custom bucket configuration round-trips
through `TestLoadConfig`
#### Documentation
- Added `aggregation_histogram_buckets` to the configuration table in
README
- Added `histogram_bucket_bounds_s` and `histogram_bucket_counts` to the
summary span attributes table
- Added a "Histogram Buckets" section with a worked example showing
cumulative bucket semantics
When `aggregation_histogram_buckets` is configured, summary spans include latency distribution data as cumulative histogram buckets. Cumulative means each bucket count includes all spans with duration less than or equal to that bucket boundary.
**Worked example** with buckets `[10ms, 50ms, 100ms]` and span durations `[5ms, 15ms, 25ms, 75ms, 150ms]`: