Component donation: Span Pruning Processor #45654

### Building and distributing outside this repository

- [x] I understand I can use and distribute my component outside of this repository.

### Existing component implementation

https://github.com/grafana/opentelemetry-collector-extras

### Components covering similar use cases

| Component | Relationship |
|-----------|--------------|
| [tailsamplingprocessor](https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/processor/tailsamplingprocessor) | Reduces trace volume by sampling complete traces based on policies. Span pruning is complementary—it reduces volume *within* traces by aggregating repetitive spans rather than dropping entire traces. |
| [filterprocessor](https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/processor/filterprocessor) | Drops entire spans/metrics/logs matching criteria. Span pruning preserves information by aggregating rather than dropping. |
| [groupbytraceprocessor](https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/processor/groupbytraceprocessor) | Groups spans by trace ID (often used upstream of tail sampling). Span pruning operates on complete traces and would typically be placed after groupbytrace. |
| [transformprocessor](https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/processor/transformprocessor) | Modifies telemetry using OTTL. Could theoretically implement some aggregation logic, but span pruning provides purpose-built aggregation with statistics. |

No direct equivalent exists for aggregating similar leaf spans within a trace while preserving statistical information.


### The purpose and use-cases of the component

#### Problem Statement

Modern distributed applications often generate traces with many repetitive leaf spans, such as N+1 database queries, batch HTTP calls, and fan-out operations. These patterns create:

- **High storage costs** from redundant span data
- **Noisy trace views** that obscure the actual request flow
- **Cardinality pressure** on backends

#### Solution

The **spanpruningprocessor** identifies similar leaf spans within a trace and replaces each group with a single summary span containing aggregated statistics. This reduces trace volume while preserving meaningful operational information.
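To make the idea concrete, here is a minimal sketch of the grouping step in Go. It is not the processor's actual code: the `Span`, `groupKey`, and `Summary` types and the `SummarizeLeaves` function are hypothetical stand-ins (the real component works on the collector's pdata types), and grouping by configurable attributes is left out for brevity.

```go
package main

import (
	"fmt"
	"time"
)

// Span is a simplified, hypothetical stand-in for a trace span.
type Span struct {
	ID       string
	ParentID string
	Name     string
	Kind     string
	Status   string
	Duration time.Duration
}

// groupKey holds the fields that must match for two leaf spans to be
// treated as similar in this sketch.
type groupKey struct {
	ParentID, Name, Kind, Status string
}

// Summary carries the aggregated statistics for one group of leaf spans.
type Summary struct {
	Key             groupKey
	Count           int
	Min, Max, Total time.Duration
}

// Avg returns the mean duration of the group.
func (s Summary) Avg() time.Duration { return s.Total / time.Duration(s.Count) }

// SummarizeLeaves groups leaf spans by key and returns one summary for
// every group that has at least minSpans members.
func SummarizeLeaves(spans []Span, minSpans int) []Summary {
	// A span is a leaf if no other span names it as its parent.
	hasChild := make(map[string]bool)
	for _, s := range spans {
		hasChild[s.ParentID] = true
	}

	groups := make(map[groupKey]*Summary)
	var order []groupKey // first-seen order keeps the output deterministic
	for _, s := range spans {
		if hasChild[s.ID] {
			continue // not a leaf
		}
		k := groupKey{s.ParentID, s.Name, s.Kind, s.Status}
		g, ok := groups[k]
		if !ok {
			g = &Summary{Key: k, Min: s.Duration, Max: s.Duration}
			groups[k] = g
			order = append(order, k)
		}
		g.Count++
		g.Total += s.Duration
		if s.Duration < g.Min {
			g.Min = s.Duration
		}
		if s.Duration > g.Max {
			g.Max = s.Duration
		}
	}

	var out []Summary
	for _, k := range order {
		if g := groups[k]; g.Count >= minSpans {
			out = append(out, *g)
		}
	}
	return out
}

func main() {
	spans := []Span{{ID: "root", Name: "handler", Kind: "server", Status: "ok", Duration: 40 * time.Millisecond}}
	for i, ms := range []int{5, 6, 7, 5, 7} {
		spans = append(spans, Span{
			ID: fmt.Sprintf("s%d", i), ParentID: "root", Name: "SELECT",
			Kind: "client", Status: "ok", Duration: time.Duration(ms) * time.Millisecond,
		})
	}
	for _, s := range SummarizeLeaves(spans, 5) {
		fmt.Printf("%s x%d min=%v max=%v avg=%v\n", s.Key.Name, s.Count, s.Min, s.Max, s.Avg())
	}
}
```

In this sketch, groups smaller than the threshold are simply left out of the result, which mirrors the intent of a minimum group size: small groups stay as individual spans.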

#### Key Features

| Feature | Description |
|---------|-------------|
| **Leaf span aggregation** | Groups spans by name, kind, status, parent, and configurable attributes (glob patterns supported) |
| **Recursive parent aggregation** | When all children of a parent are aggregated, the parent can also be aggregated (configurable depth) |
| **Statistical summaries** | Summary spans include count, min/max/avg/total duration, and histogram buckets |
| **Outlier detection** | IQR or MAD-based detection identifies slow spans; optionally correlates outliers with attribute values |
| **Outlier preservation** | Keeps outlier spans as individuals for debugging while aggregating normal spans |
| **Attribute loss tracking** | Optional metrics and annotations showing what attribute diversity is lost during aggregation |
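
As an illustration of the IQR option in the table above, the following Go sketch flags slow spans whose duration exceeds Q3 + multiplier × IQR. The `SlowOutliers` and `quantile` helpers are hypothetical, and the linear-interpolation quantile is an assumption; the processor may compute quartiles differently.

```go
package main

import (
	"fmt"
	"sort"
)

// quantile returns the q-th quantile (0..1) of an ascending, non-empty
// slice using linear interpolation between neighbouring values.
// The interpolation method is an assumption made for this sketch.
func quantile(sorted []float64, q float64) float64 {
	pos := q * float64(len(sorted)-1)
	lo := int(pos)
	if lo >= len(sorted)-1 {
		return sorted[len(sorted)-1]
	}
	return sorted[lo] + (pos-float64(lo))*(sorted[lo+1]-sorted[lo])
}

// SlowOutliers returns the indexes of durations (in milliseconds) that lie
// above Q3 + multiplier*IQR. Only the slow side is flagged.
func SlowOutliers(durationsMs []float64, multiplier float64) []int {
	if len(durationsMs) == 0 {
		return nil
	}
	// Sort a copy so the caller's slice and its indexes stay untouched.
	sorted := append([]float64(nil), durationsMs...)
	sort.Float64s(sorted)

	q1, q3 := quantile(sorted, 0.25), quantile(sorted, 0.75)
	upper := q3 + multiplier*(q3-q1)

	var out []int
	for i, d := range durationsMs {
		if d > upper {
			out = append(out, i)
		}
	}
	return out
}

func main() {
	// Ten SELECT durations in milliseconds; one is far slower than the rest.
	durations := []float64{5, 6, 7, 500, 7, 7, 8, 9, 11, 12}
	fmt.Println(SlowOutliers(durations, 1.5))
}
```

With these inputs Q1 is 7 ms and Q3 is 10.5 ms, so the upper fence is 15.75 ms and only the 500 ms span (index 3) is flagged.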

#### Use Cases

1. **Database query optimization** - Aggregate N+1 query patterns into a single summary showing query count and latency distribution
2. **Batch operations** - Consolidate fan-out calls (parallel HTTP requests, message publishes) into representative summaries
3. **Cost reduction** - Reduce trace storage costs for repetitive workloads while retaining operational insight
4. **Debugging support** - Preserve outlier spans with correlated attributes to identify root causes of latency issues

#### Example

**Before** (10 similar SELECT spans):
```
handler
├── SELECT - 5ms
├── SELECT - 6ms
├── SELECT - 7ms
├── SELECT - 500ms (outlier, cache_hit=false)
└── ... (6 more)
```

**After** (with outlier preservation):
```
handler
├── SELECT (summary, span_count=9)
│   ├── aggregation.duration_avg_ns: 8000000
│   ├── aggregation.duration_median_ns: 7000000
│   ├── aggregation.outlier_correlated_attributes: "cache_hit=false(100%/0%)"
│   └── aggregation.histogram_bucket_counts: [3, 6, 8, 9]
└── SELECT - 500ms (preserved outlier)
    ├── aggregation.is_preserved_outlier: true
    └── cache_hit: false
```
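
The `cache_hit=false(100%/0%)` annotation above can be read as "present in 100% of outliers and 0% of normal spans". The Go sketch below shows one way such a correlation could be computed. That reading of the annotation, the `Correlation` type, and the `Correlate` function are assumptions for illustration, not the processor's confirmed logic.

```go
package main

import (
	"fmt"
	"sort"
)

// Correlation records how often a key=value pair occurs among outlier
// spans and among normal spans.
type Correlation struct {
	KeyValue    string
	OutlierFrac float64
	NormalFrac  float64
}

// String renders the correlation in the style used by the example above.
func (c Correlation) String() string {
	return fmt.Sprintf("%s(%.0f%%/%.0f%%)", c.KeyValue, c.OutlierFrac*100, c.NormalFrac*100)
}

// fraction returns the share of spans whose attributes contain key=value.
func fraction(spans []map[string]string, key, value string) float64 {
	if len(spans) == 0 {
		return 0
	}
	n := 0
	for _, attrs := range spans {
		if v, ok := attrs[key]; ok && v == value {
			n++
		}
	}
	return float64(n) / float64(len(spans))
}

// Correlate returns the key=value pairs that occur in at least minOutlier
// of the outliers and in at most maxNormal of the normal spans.
func Correlate(outliers, normal []map[string]string, minOutlier, maxNormal float64) []Correlation {
	seen := make(map[string]bool)
	var out []Correlation
	for _, attrs := range outliers {
		for k, v := range attrs {
			kv := k + "=" + v
			if seen[kv] {
				continue
			}
			seen[kv] = true
			of, nf := fraction(outliers, k, v), fraction(normal, k, v)
			if of >= minOutlier && nf <= maxNormal {
				out = append(out, Correlation{KeyValue: kv, OutlierFrac: of, NormalFrac: nf})
			}
		}
	}
	// Map iteration order is random, so sort for stable output.
	sort.Slice(out, func(i, j int) bool { return out[i].KeyValue < out[j].KeyValue })
	return out
}

func main() {
	outliers := []map[string]string{{"cache_hit": "false", "db.system": "postgresql"}}
	var normal []map[string]string
	for i := 0; i < 9; i++ {
		normal = append(normal, map[string]string{"cache_hit": "true", "db.system": "postgresql"})
	}
	for _, c := range Correlate(outliers, normal, 0.75, 0.25) {
		fmt.Println(c)
	}
}
```

Here `db.system=postgresql` is not reported because it is just as common among normal spans, while `cache_hit=false` appears only on the outlier.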

### Example configuration for the component

```yaml
processors:
  spanpruning:
    # Attribute patterns for grouping (glob syntax)
    group_by_attributes:
      - "db.*"
      - "http.method"
    
    # Minimum spans required before aggregation (default: 5)
    min_spans_to_aggregate: 5
    
    # Parent aggregation depth: 0=none, -1=unlimited (default: 1)
    max_parent_depth: 1
    
    # Prefix for aggregation attributes (default: "aggregation.")
    aggregation_attribute_prefix: "aggregation."
    
    # Histogram bucket bounds (default: 5ms to 10s)
    aggregation_histogram_buckets: [5ms, 10ms, 25ms, 50ms, 100ms, 250ms, 500ms, 1s, 2.5s, 5s, 10s]
    
    # Enable attribute loss analysis (default: false)
    enable_attribute_loss_analysis: false
    
    # Enable outlier detection (default: false)
    enable_outlier_analysis: false
    
    # Outlier analysis settings
    outlier_analysis:
      method: iqr                    # or "mad"
      iqr_multiplier: 1.5
      mad_multiplier: 3.0
      min_group_size: 7
      correlation_min_occurrence: 0.75
      correlation_max_normal_occurrence: 0.25
      max_correlated_attributes: 5
      preserve_outliers: false
      max_preserved_outliers: 2
      preserve_only_with_correlation: false
```
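
To illustrate how glob patterns such as `db.*` in `group_by_attributes` could select attribute keys, here is a small Go sketch built on the standard library's `path.Match`. The `MatchingKeys` helper is hypothetical, and using `path.Match` is an assumption; the processor may use a different glob implementation. Note that `*` in `path.Match` does not match `/`, which rarely matters for attribute keys.

```go
package main

import (
	"fmt"
	"path"
	"sort"
)

// MatchingKeys returns, in sorted order, the attribute keys that match at
// least one of the given glob patterns.
func MatchingKeys(attrs map[string]string, patterns []string) ([]string, error) {
	var keys []string
	for k := range attrs {
		for _, p := range patterns {
			ok, err := path.Match(p, k)
			if err != nil {
				return nil, fmt.Errorf("bad pattern %q: %w", p, err)
			}
			if ok {
				keys = append(keys, k)
				break // one matching pattern is enough
			}
		}
	}
	sort.Strings(keys) // map iteration order is random
	return keys, nil
}

func main() {
	attrs := map[string]string{
		"db.system":     "postgresql",
		"db.statement":  "SELECT 1",
		"http.method":   "GET",
		"net.peer.name": "db-primary",
	}
	keys, err := MatchingKeys(attrs, []string{"db.*", "http.method"})
	if err != nil {
		panic(err)
	}
	fmt.Println(keys)
}
```

Spans would then be grouped only when the values of all matched keys agree, in addition to name, kind, status, and parent.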

### Telemetry data types supported

- Traces (alpha stability)


### Code Owners

@portertech, @csmarchbanks, @sherinabr

### Sponsor

@jmacd 

### Additional context

- **Pull Request**: https://github.com/open-telemetry/opentelemetry-collector-contrib/pull/45617
- **Test Coverage**: >80% with unit tests, benchmarks, and integration tests
- **Documentation**: Comprehensive README with configuration reference and examples

### Tip

<sub>[React](https://github.blog/news-insights/product-news/add-reactions-to-pull-requests-issues-and-comments/) with 👍 to help prioritize this issue. Please use comments to provide useful context, avoiding `+1` or `me too`, to help us triage it. Learn more [here](https://opentelemetry.io/community/end-user/issue-participation/).</sub>
