[processor/spanpruning] Optimize executeAggregations by reusing trace tree (#47771)

csmarchbanks · web-flow · commit 77bff7151a4e · 2026-04-20T16:56:58.000-04:00
#### Description
Eliminate the parentReplacements and spansToRemove maps from
executeAggregations by leveraging the existing traceTree structure.
Parent replacement lookups now walk the tree's parent pointers via a new
replacementSpanID field on spanNode, and span removal uses the tree's
markedForRemoval flags in a single pass per ScopeSpans.
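
As a rough illustration of the two lookups described above, here is a minimal, self-contained sketch. It is not the processor's code: `node`, `spanID`, `summaryParent`, and `removeMarked` are hypothetical stand-ins for `spanNode`, `pcommon.SpanID`, and the logic inside `executeAggregations`, and a scope is modelled as a plain slice of IDs rather than a `ptrace.ScopeSpans`.

```go
package main

import "fmt"

// spanID is a stand-in for pcommon.SpanID (assumption for this sketch).
type spanID [8]byte

func (id spanID) isEmpty() bool { return id == spanID{} }

// node is a simplified, hypothetical version of spanNode.
type node struct {
	id                spanID
	parentID          spanID
	parent            *node
	replacementSpanID spanID // set once the node's group has a summary span
	markedForRemoval  bool   // set during analysis
}

// summaryParent returns the parent ID a summary span should use: the
// replacement recorded on the parent node if one exists, otherwise the
// original parent ID.
func summaryParent(n *node) spanID {
	if n.parent != nil && !n.parent.replacementSpanID.isEmpty() {
		return n.parent.replacementSpanID
	}
	return n.parentID
}

// removeMarked filters every scope in one pass, consulting the node index
// instead of a separate per-scope set of IDs to remove.
func removeMarked(scopes [][]spanID, byID map[spanID]*node) [][]spanID {
	out := make([][]spanID, len(scopes))
	for i, spans := range scopes {
		kept := make([]spanID, 0, len(spans))
		for _, id := range spans {
			if n, ok := byID[id]; ok && n.markedForRemoval {
				continue
			}
			kept = append(kept, id)
		}
		out[i] = kept
	}
	return out
}

func main() {
	root := &node{id: spanID{1}}
	parent := &node{id: spanID{2}, parentID: root.id, parent: root, markedForRemoval: true}
	child := &node{id: spanID{3}, parentID: parent.id, parent: parent}
	byID := map[spanID]*node{root.id: root, parent.id: parent, child.id: child}

	// The parent's group is aggregated first (top-down), recording its summary ID.
	parent.replacementSpanID = spanID{9}
	fmt.Println(summaryParent(child) == spanID{9}) // true

	scopes := removeMarked([][]spanID{{root.id, parent.id}, {child.id}}, byID)
	fmt.Println(len(scopes[0]), len(scopes[1])) // 1 1
}
```

The point of the sketch is that both questions ("what replaced my parent?" and "should this span be dropped?") are answered from state already stored on the tree nodes, so no auxiliary map keyed by span ID is needed.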

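There are two benefits to this change:
1. Fix OOMs when a trace is fragmented across many scope spans. Before
this change, the first time an aggregation group touched a scope span, a
map with capacity `len(spans-in-aggregation)` was pre-allocated and added
to `spansToRemove`, so a group spread over many scope spans pre-allocated
one full-size map per scope span. Removing `spansToRemove` eliminates
these allocations. Avoiding the pre-allocation alone would also have
fixed the OOM.
2. Using the existing tree instead of the two maps is also faster and
reduces the amount of data allocated to process each trace. In the
benchmarks below, `ExecuteAggregations` is about 20% faster and the
medium, large, and deep-trace cases are roughly 7-14% faster; the small
and sparse cases are within about 2% of the baseline.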
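
To make the first point concrete, the following sketch models how much map capacity the old code requested. It is an approximation under stated assumptions, not a measurement: `preallocatedSlots` is a hypothetical helper, scopes are represented by integer indices, and "slots" counts the capacity hints passed to `make`, not actual bytes.

```go
package main

import "fmt"

// preallocatedSlots models the old behaviour: the first time any group
// touches a scope with no entry yet, a set sized for the whole group is
// allocated. groups[i] lists the scope index of every span in group i.
func preallocatedSlots(groups [][]int) int {
	seen := make(map[int]bool)
	total := 0
	for _, group := range groups {
		for _, scope := range group {
			if !seen[scope] {
				seen[scope] = true
				// Mirrors make(map[pcommon.SpanID]struct{}, len(group.nodes)).
				total += len(group)
			}
		}
	}
	return total
}

func main() {
	const n = 1000

	// One group of n spans, all in the same scope.
	compact := make([]int, n)

	// One group of n spans, each in its own scope (a fragmented trace).
	fragmented := make([]int, n)
	for i := range fragmented {
		fragmented[i] = i
	}

	fmt.Println(preallocatedSlots([][]int{compact}))    // 1000
	fmt.Println(preallocatedSlots([][]int{fragmented})) // 1000000
}
```

Under this model the requested capacity grows from `n` to `n * n` when a group of `n` spans is spread over `n` scope spans, which is consistent with the OOM described above.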

```
                                 │    sec/op     │    sec/op     vs base                │
ProcessTrace_SmallTrace-8            8.219µ ± 7%   8.028µ ±  1%   -2.32% (p=0.000 n=10)
ProcessTrace_MediumTrace-8           72.36µ ± 7%   64.70µ ±  2%  -10.59% (p=0.000 n=10)
ProcessTrace_LargeTrace-8            739.1µ ± 1%   690.5µ ±  4%   -6.58% (p=0.000 n=10)
ProcessTrace_SparseAggregation-8     480.2µ ± 3%   488.4µ ±  1%   +1.72% (p=0.019 n=10)
DeepTrace_Depth1-8                   441.0µ ± 5%   389.1µ ±  4%  -11.75% (p=0.000 n=10)
DeepTrace_Depth5-8                   488.0µ ± 0%   428.6µ ±  4%  -12.18% (p=0.000 n=10)
DeepTrace_Depth10-8                  487.8µ ± 0%   417.2µ ±  8%  -14.48% (p=0.000 n=10)
ExecuteAggregations-8                16.97µ ± 2%   13.58µ ±  2%  -19.99% (p=0.000 n=10)

                                 │     B/op      │     B/op      vs base                  │
ProcessTrace_SmallTrace-8           13.92Ki ± 0%   13.80Ki ± 0%   -0.90% (p=0.000 n=10)
ProcessTrace_MediumTrace-8          119.9Ki ± 0%   113.5Ki ± 0%   -5.32% (p=0.000 n=10)
ProcessTrace_LargeTrace-8           1.184Mi ± 0%   1.075Mi ± 0%   -9.19% (p=0.000 n=10)
ProcessTrace_SparseAggregation-8    860.7Ki ± 0%   860.1Ki ± 0%   -0.07% (p=0.000 n=10)
DeepTrace_Depth1-8                  730.6Ki ± 0%   675.1Ki ± 0%   -7.59% (p=0.000 n=10)
DeepTrace_Depth5-8                  814.8Ki ± 0%   701.9Ki ± 0%  -13.86% (p=0.000 n=10)
DeepTrace_Depth10-8                 814.8Ki ± 0%   701.9Ki ± 0%  -13.86% (p=0.000 n=10)
ExecuteAggregations-8               20.24Ki ± 0%   18.97Ki ± 0%   -6.26% (p=0.000 n=10)

                                 │   allocs/op   │  allocs/op   vs base                 │
ProcessTrace_SmallTrace-8             204.0 ± 0%    202.0 ± 0%  -0.98% (p=0.000 n=10)
ProcessTrace_MediumTrace-8           1.508k ± 0%   1.493k ± 0%  -0.99% (p=0.000 n=10)
ProcessTrace_LargeTrace-8            13.80k ± 0%   13.77k ± 0%  -0.24% (p=0.000 n=10)
ProcessTrace_SparseAggregation-8     10.63k ± 0%   10.62k ± 0%  -0.07% (p=0.000 n=10)
DeepTrace_Depth1-8                   8.230k ± 0%   8.204k ± 0%  -0.32% (p=0.000 n=10)
DeepTrace_Depth5-8                   8.895k ± 0%   8.855k ± 0%  -0.45% (p=0.000 n=10)
DeepTrace_Depth10-8                  8.895k ± 0%   8.855k ± 0%  -0.45% (p=0.000 n=10)
ExecuteAggregations-8                 247.0 ± 0%    237.0 ± 0%  -4.05% (p=0.000 n=10)
```

#### Testing
All the existing tests pass, and we have verified the memory improvement
on a fragmented trace that was collected locally. With this PR, the
processor no longer OOMs when processing that trace.
diff --git a/.chloggen/spanpruning-optimize-execute-aggregations.yaml b/.chloggen/spanpruning-optimize-execute-aggregations.yaml
@@ -0,0 +1,27 @@
+# Use this changelog template to create an entry for release notes.
+
+# One of 'breaking', 'deprecation', 'new_component', 'enhancement', 'bug_fix'
+change_type: bug_fix
+
+# The name of the component, or a single word describing the area of concern, (e.g. receiver/filelog)
+component: processor/spanpruning
+
+# A brief description of the change.  Surround your text with quotes ("") if it needs to start with a backtick (`).
+note: Avoid excessive memory usage on large and fragmented traces
+
+# Mandatory: One or more tracking issues related to the change. You can use the PR number here if no issue exists.
+issues: [47771]
+
+# (Optional) One or more lines of additional information to render under the primary note.
+# These lines will be padded with 2 spaces and then inserted directly into the document.
+# Use pipe (|) for multiline entries.
+subtext:
+
+# If your change doesn't affect end users or the exported elements of any package,
+# you should instead start your pull request title with [chore] or use the "Skip Changelog" label.
+# Optional: The change log or logs in which this entry should be included.
+# e.g. '[user]' or '[user, api]'
+# Include 'user' if the change is relevant to end users.
+# Include 'api' if there is a change to a library API.
+# Default: '[user]'
+change_logs: [user]
diff --git a/processor/spanpruningprocessor/aggregation.go b/processor/spanpruningprocessor/aggregation.go
@@ -77,52 +77,47 @@ func (*spanPruningProcessor) buildAggregationPlan(groups map[string]aggregationG
 	return aggregationPlan{groups: groupSlice}
 }
 
-// executeAggregations performs the top-down creation of summary spans, batch
-// removes originals, and returns the number of pruned spans.
-func (p *spanPruningProcessor) executeAggregations(plan aggregationPlan) int {
-	// Track which parent SpanID should map to which summary SpanID
-	parentReplacements := make(map[pcommon.SpanID]pcommon.SpanID, len(plan.groups)*4)
-
-	// Track spans to remove per ScopeSpans for batch removal
-	spansToRemove := make(map[ptrace.ScopeSpans]map[pcommon.SpanID]struct{}, len(plan.groups))
+// executeAggregations performs the top-down creation of summary spans, removes
+// originals using the tree's markedForRemoval flags, and returns the number of
+// pruned spans.
+func (p *spanPruningProcessor) executeAggregations(plan aggregationPlan, tree *traceTree) int {
 	prunedCount := 0
 
 	for i := range plan.groups {
 		group := &plan.groups[i]
 		// Calculate statistics and time range in single pass
 		data := p.calculateAggregationData(group.nodes)
 
-		// Determine the parent SpanID for the summary span
-		// Use the first node's parent as template
-		originalParentID := group.nodes[0].span.ParentSpanID()
-
-		// Check if the parent is being replaced by a summary span
-		summaryParentID := originalParentID
-		if replacementID, exists := parentReplacements[originalParentID]; exists {
-			summaryParentID = replacementID
+		// Determine the parent SpanID for the summary span.
+		// Walk the tree: if the parent node was already replaced by a summary
+		// span (from a higher-depth group), use that replacement ID.
+		summaryParentID := group.nodes[0].span.ParentSpanID()
+		if parentNode := group.nodes[0].parent; parentNode != nil && !parentNode.replacementSpanID.IsEmpty() {
+			summaryParentID = parentNode.replacementSpanID
 		}
 
 		// Create summary span with correct parent
 		p.createSummarySpanWithParent(*group, data, summaryParentID)
 
-		// Record that these original span IDs should be replaced by the summary span ID
+		// Record replacement span ID on each node so child groups can find it
 		for _, node := range group.nodes {
-			spanID := node.span.SpanID()
-			parentReplacements[spanID] = group.summarySpanID
-			scopeSpans := node.scopeSpans
-			if spansToRemove[scopeSpans] == nil {
-				spansToRemove[scopeSpans] = make(map[pcommon.SpanID]struct{}, len(group.nodes))
-			}
-			spansToRemove[scopeSpans][spanID] = struct{}{}
+			node.replacementSpanID = group.summarySpanID
 		}
 		prunedCount += len(group.nodes)
 	}
 
-	// Batch remove all marked spans in a single pass per ScopeSpans
-	for scopeSpans, spanIDs := range spansToRemove {
+	// Collect unique ScopeSpans that contain marked nodes, then remove in a
+	// single pass per ScopeSpans using the tree's flags set during analysis.
+	seen := make(map[ptrace.ScopeSpans]struct{})
+	for _, node := range tree.nodeByID {
+		if node.markedForRemoval {
+			seen[node.scopeSpans] = struct{}{}
+		}
+	}
+	for scopeSpans := range seen {
 		scopeSpans.Spans().RemoveIf(func(span ptrace.Span) bool {
-			_, shouldRemove := spanIDs[span.SpanID()]
-			return shouldRemove
+			n, ok := tree.nodeByID[span.SpanID()]
+			return ok && n.markedForRemoval
 		})
 	}
 
diff --git a/processor/spanpruningprocessor/processor.go b/processor/spanpruningprocessor/processor.go
@@ -144,7 +144,7 @@ func (p *spanPruningProcessor) processTrace(ctx context.Context, spans []spanInf
 	plan := p.buildAggregationPlan(aggregationGroups)
 
 	// Phase 3: Execute aggregations (top-down) and record pruned spans
-	prunedCount := p.executeAggregations(plan)
+	prunedCount := p.executeAggregations(plan, tree)
 
 	// Record telemetry after aggregation is complete
 	p.telemetryBuilder.ProcessorSpanpruningSpansPruned.Add(ctx, int64(prunedCount))
diff --git a/processor/spanpruningprocessor/processor_benchmark_test.go b/processor/spanpruningprocessor/processor_benchmark_test.go
@@ -132,7 +132,7 @@ func BenchmarkExecuteAggregations(b *testing.B) {
 		plan := proc.buildAggregationPlan(groups)
 
 		b.StartTimer()
-		proc.executeAggregations(plan)
+		proc.executeAggregations(plan, tree)
 		b.StopTimer()
 	}
 }
diff --git a/processor/spanpruningprocessor/tree.go b/processor/spanpruningprocessor/tree.go
@@ -12,13 +12,14 @@ import (
 // spanNode models a span in the trace tree with cached relationships and
 // aggregation bookkeeping.
 type spanNode struct {
-	span             ptrace.Span
-	scopeSpans       ptrace.ScopeSpans
-	parent           *spanNode
-	children         []*spanNode
-	groupKey         string // cached group key for leaf spans
-	isLeaf           bool   // true if node has no children
-	markedForRemoval bool   // true if node will be aggregated
+	span              ptrace.Span
+	scopeSpans        ptrace.ScopeSpans
+	parent            *spanNode
+	children          []*spanNode
+	groupKey          string         // cached group key for leaf spans
+	replacementSpanID pcommon.SpanID // summary span ID that replaced this node's group
+	isLeaf            bool           // true if node has no children
+	markedForRemoval  bool           // true if node will be aggregated
 }
 
 // traceTree holds span nodes indexed by ID plus quick leaf/orphan lists for
