[processor/spanpruning] Add histogram support for latency distribution on aggregated summary spans #47604

Merged
andrzej-stencel merged 12 commits into open-telemetry:main from portertech:trace-span-pruning-histo on Apr 15, 2026
@portertech commented Apr 13, 2026

Description

The span pruning processor currently emits min/max/avg/total duration statistics on summary spans, but these aggregates hide the shape of the latency distribution. A flat average can mask bimodal distributions and tail latency problems that are critical for debugging.

This PR adds configurable cumulative histogram buckets to summary spans, giving users latency distribution visibility without requiring a separate metrics pipeline. Bucket boundaries default to the OpenTelemetry SDK explicit bucket histogram boundaries ([5ms, 10ms, 25ms, 50ms, 100ms, 250ms, 500ms, 1s, 2.5s, 5s, 10s]) and can be customized or disabled entirely by setting aggregation_histogram_buckets to an empty list.

Summary spans gain two new slice attributes:

  • <prefix>histogram_bucket_bounds_s -- upper bounds in seconds (excludes +Inf)
  • <prefix>histogram_bucket_counts -- cumulative counts per bucket (includes +Inf)

Cumulative semantics were chosen to match Prometheus/OpenTelemetry histogram conventions, where each bucket count includes all observations at or below that boundary.

Testing

  • Unit tests for cumulative bucket counting logic and the disabled-histogram path (stats_test.go)
  • Config validation tests covering valid buckets, negative values, unsorted values, and empty list (config_test.go)
  • Integration tests exercising the full processor pipeline with histogram attributes enabled and disabled, asserting exact bucket bounds and counts on output summary spans (processor_test.go)
  • YAML unmarshal test verifying custom bucket configuration round-trips through TestLoadConfig

Documentation

  • Added aggregation_histogram_buckets to the configuration table in README
  • Added histogram_bucket_bounds_s and histogram_bucket_counts to the summary span attributes table
  • Added a "Histogram Buckets" section with a worked example showing cumulative bucket semantics

@portertech marked this pull request as ready for review April 14, 2026 05:58
@portertech requested a review from a team as a code owner April 14, 2026 05:58
@portertech requested a review from ArthurSens April 14, 2026 05:58
@jmacd left a comment

You make me wish for exponential histogram support as an option! Not a blocker, but note that /pkg/expohisto now contains an implementation you could use for this.

@portertech

> You make me wish for exponential histogram support as an option! Not a blocker, but note that /pkg/expohisto now contains an implementation you could use for this.

Oh neat! I already have a branch coming together with it 😆

@andrzej-stencel merged commit 7d140d1 into open-telemetry:main Apr 15, 2026
196 checks passed
@portertech deleted the trace-span-pruning-histo branch April 15, 2026 15:45
5 participants