[processor/spanpruning] Add bytes metrics#49008

Open
portertech wants to merge 14 commits into
open-telemetry:mainfrom
portertech:bytes-metrics

@portertech portertech commented Jun 10, 2026

Add serialized bytes metrics to span pruning processor

Description

This PR adds opt-in serialized byte-size telemetry to the span pruning processor, enabling users to measure the data-volume impact of pruning rather than just span counts. It builds on the existing core pruning, parent recursion, and aggregation metrics.

Changes

New feature: Bytes Metrics

Span-count metrics show how many spans were pruned, but not how much serialized data that saved. This feature serializes trace data with ptrace.ProtoMarshaler and reports the byte sizes before and after pruning, for both the full batch and the matched subset of traces.

Visibility is provided through four counter metrics (unit By, monotonic, disabled by default):

  • processor_spanpruning_bytes_received - full batch size before pruning
  • processor_spanpruning_bytes_processed_input - matched subset size before pruning
  • processor_spanpruning_bytes_processed_output - matched subset size after pruning
  • processor_spanpruning_bytes_emitted - full batch size after pruning
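As a rough illustration of how the four counters could be read together, the sketch below derives a byte saving and a reduction ratio from one set of counter values. The `byteCounters` type and its methods are hypothetical names introduced for this example and are not part of the processor; the numbers are made up.

```go
package main

import "fmt"

// byteCounters is a hypothetical snapshot of the four counters described
// above. It is not a type from the processor.
type byteCounters struct {
	Received        int64 // processor_spanpruning_bytes_received
	ProcessedInput  int64 // processor_spanpruning_bytes_processed_input
	ProcessedOutput int64 // processor_spanpruning_bytes_processed_output
	Emitted         int64 // processor_spanpruning_bytes_emitted
}

// savedBytes returns how many serialized bytes pruning removed from the
// full batch (received minus emitted).
func (c byteCounters) savedBytes() int64 {
	return c.Received - c.Emitted
}

// matchedReduction returns the fraction of the matched subset that pruning
// removed. It returns 0 when nothing was processed, and it can be negative
// if the output is larger than the input.
func (c byteCounters) matchedReduction() float64 {
	if c.ProcessedInput == 0 {
		return 0
	}
	return float64(c.ProcessedInput-c.ProcessedOutput) / float64(c.ProcessedInput)
}

func main() {
	// Illustrative values only.
	c := byteCounters{Received: 4000, ProcessedInput: 4000, ProcessedOutput: 3000, Emitted: 3000}
	fmt.Printf("saved=%d reduction=%.2f\n", c.savedBytes(), c.matchedReduction())
}
```

Because the counters are monotonic, a dashboard would typically apply the same arithmetic to rates or deltas over a time window rather than to raw totals.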

Implementation notes:

  • The feature is opt-in because measuring it serializes each batch (full batch plus the matched subset before/after pruning), which adds CPU overhead and is expensive for large batches.
  • spanInfo now tracks its ResourceSpans (in addition to ScopeSpans) so the new getBytes helper can reconstruct the original RS/SS hierarchy when sizing a subset of traces.
  • A matchedTraces set is wired into processTraces as a stand-in for future OTTL conditions filtering. Until conditions are supported in contrib, every trace is treated as matched, so bytes_processed_input == bytes_received and bytes_processed_output == bytes_emitted. The split is in place so the matched-subset metrics become meaningful once condition filtering lands.
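The relationship between the full-batch and matched-subset measurements can be sketched with a toy model. This is a simplification under stated assumptions: the `span`, `sizes`, and `measure` names are hypothetical, each span carries a pretend size instead of being serialized, and the ResourceSpans/ScopeSpans overhead that the real getBytes helper accounts for is ignored.

```go
package main

import "fmt"

// span is a stand-in for a real span; size is a pretend serialized size.
type span struct {
	traceID string
	size    int
}

// sizes holds the four measurements for one batch.
type sizes struct {
	received, processedInput, processedOutput, emitted int
}

// sum adds up the sizes of all spans whose trace ID passes keep.
func sum(spans []span, keep func(string) bool) int {
	total := 0
	for _, s := range spans {
		if keep(s.traceID) {
			total += s.size
		}
	}
	return total
}

// measure sizes the full batch and the matched subset, before and after
// pruning. matched plays the role of the matchedTraces set.
func measure(before, after []span, matched map[string]struct{}) sizes {
	all := func(string) bool { return true }
	inMatched := func(id string) bool {
		_, ok := matched[id]
		return ok
	}
	return sizes{
		received:        sum(before, all),
		processedInput:  sum(before, inMatched),
		processedOutput: sum(after, inMatched),
		emitted:         sum(after, all),
	}
}

func main() {
	before := []span{{"a", 100}, {"a", 100}, {"a", 100}, {"b", 50}}
	// Trace "a" had two spans replaced by one slightly larger summary span.
	after := []span{{"a", 100}, {"a", 120}, {"b", 50}}

	// Current behavior: every trace is treated as matched.
	everything := map[string]struct{}{"a": {}, "b": {}}
	fmt.Printf("%+v\n", measure(before, after, everything))

	// Hypothetical future behavior: only trace "a" matches a condition.
	onlyA := map[string]struct{}{"a": {}}
	fmt.Printf("%+v\n", measure(before, after, onlyA))
}
```

With every trace matched, the subset figures equal the full-batch figures, which mirrors the equalities stated above. Once only some traces match, the two pairs diverge.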

Configuration options:

processors:
  spanpruning:
    # Enable measurement of serialized trace sizes before and after pruning.
    # Serializes each batch (full batch plus matched subset) and is expensive
    # for large batches.
    enable_bytes_metrics: true   # default: false

When enable_bytes_metrics is false, behavior is unchanged and no bytes metrics are emitted.
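A minimal sketch of how such a flag might gate the expensive sizing step is shown below. The `Config` struct, its `mapstructure` tag, and the `recorder` type are assumptions made for illustration; only the option name and its default come from this PR.

```go
package main

import "fmt"

// Config is a hypothetical config struct. The tag name matches the YAML key
// above; the struct itself is an assumption, not the processor's code.
type Config struct {
	EnableBytesMetrics bool `mapstructure:"enable_bytes_metrics"`
}

// defaultConfig returns the config with bytes metrics disabled.
func defaultConfig() Config {
	return Config{EnableBytesMetrics: false}
}

// recorder accumulates a received-bytes total and counts how often the
// sizing function was invoked.
type recorder struct {
	cfg       Config
	received  int64
	sizeCalls int
}

// observe sizes the batch only when the feature is enabled, so the disabled
// path never pays the serialization cost.
func (r *recorder) observe(batch []byte, size func([]byte) int) {
	if !r.cfg.EnableBytesMetrics {
		return
	}
	r.sizeCalls++
	r.received += int64(size(batch))
}

func main() {
	size := func(b []byte) int { return len(b) } // stand-in for real serialization

	off := &recorder{cfg: defaultConfig()}
	off.observe([]byte("hello"), size)

	on := &recorder{cfg: Config{EnableBytesMetrics: true}}
	on.observe([]byte("hello"), size)

	fmt.Println(off.sizeCalls, off.received, on.sizeCalls, on.received)
}
```

Checking the flag before calling the sizing function, rather than sizing first and discarding the result, is what keeps the disabled path free of overhead.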

Testing

  • Processor tests asserting the bytes metrics are recorded only when enabled and absent when disabled
  • Tests verifying bytes_processed_input == bytes_received and bytes_processed_output == bytes_emitted when no conditions are configured
  • Tests verifying input/output bytes match when no aggregation runs, and that the recorded output byte values are lower than the input values when pruning reduces serialized size
  • Updated the generated telemetry tests (metadatatest)

Documentation

Updated the README with the enable_bytes_metrics option, the four new metrics, and a "Byte metrics semantics" section. That section explains which comparisons are valid today and how the metrics will behave once conditions support is added, including the case where aggregation summaries can make bytes_processed_output exceed bytes_processed_input. Also added the four counters to metadata.yaml and regenerated the telemetry code.

@portertech portertech marked this pull request as ready for review June 11, 2026 13:49
@portertech portertech requested a review from a team as a code owner June 11, 2026 13:49
@portertech portertech requested a review from VihasMakwana June 11, 2026 13:49
3 participants