[exporter/loadbalancing] Reduce allocations on the traceID routing path #48983
paulojmdias wants to merge 7 commits into
…ting hot path Signed-off-by: Paulo Dias <paulodias.gm@gmail.com>
Pull request overview
Note
Copilot was unable to run its full agentic suite in this review.
Optimizes load balancing trace export for traceID routing by avoiding the SplitTraces + merge round-trip, and refactors backend export/telemetry handling.
Changes:
- Add fast-path span routing for `traceID` (`consumeTracesByID`) and refactor backend exporting into `exportToBackend`.
- Add `NumBackends()` to the load balancer to support pre-sizing backend maps.
- Enhance benchmark reporting by enabling allocation reporting.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| exporter/loadbalancingexporter/trace_exporter_test.go | Adds allocation reporting to an existing benchmark to better quantify performance changes. |
| exporter/loadbalancingexporter/trace_exporter.go | Introduces traceID fast-path routing, factors backend export+telemetry into a helper, and adjusts exporter selection logic. |
| exporter/loadbalancingexporter/loadbalancer.go | Adds a concurrency-safe NumBackends() accessor for backend count. |
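To make the "concurrency-safe accessor" idea concrete, here is a minimal sketch of how such a method could look. The `loadBalancer` struct, its `mu` and `exporters` fields, and the `setBackends` helper are hypothetical stand-ins for illustration, not the exporter's actual code; only the method name `NumBackends()` comes from the review summary.

```go
package main

import (
	"fmt"
	"sync"
)

// loadBalancer is a hypothetical stand-in, not the exporter's real type.
type loadBalancer struct {
	mu        sync.RWMutex
	exporters map[string]struct{} // endpoint -> placeholder backend value
}

// NumBackends returns the current backend count under a read lock, so it can
// be called while another goroutine replaces the backend set.
func (lb *loadBalancer) NumBackends() int {
	lb.mu.RLock()
	defer lb.mu.RUnlock()
	return len(lb.exporters)
}

// setBackends (hypothetical) swaps in a new backend set under the write lock.
func (lb *loadBalancer) setBackends(endpoints []string) {
	next := make(map[string]struct{}, len(endpoints))
	for _, e := range endpoints {
		next[e] = struct{}{}
	}
	lb.mu.Lock()
	lb.exporters = next
	lb.mu.Unlock()
}

func main() {
	lb := &loadBalancer{}
	lb.setBackends([]string{"backend-1:4317", "backend-2:4317", "backend-3:4317"})

	// Use the count as a capacity hint so the per-backend map does not have
	// to grow while spans are being routed.
	perBackend := make(map[string][]int, lb.NumBackends())
	fmt.Println("backends:", lb.NumBackends(), "entries so far:", len(perBackend))
}
```

The count is only a sizing hint here: the backend set may change between the call and the routing loop, so a caller should not treat it as an invariant.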
Signed-off-by: Paulo Dias <paulodias.gm@gmail.com>
…try-collector-contrib into perf/loadbalancer Signed-off-by: Paulo Dias <paulodias.gm@gmail.com>
        d.rsIdx, d.ssIdx, d.active = i, j, true
    }

    span.CopyTo(d.curSS.Spans().AppendEmpty())
Non-blocking: this preserves the routed spans per backend, but not the exact pdata grouping shape from the previous SplitTraces + mergeTraces path. Is this right? If so, multiple trace IDs for the same backend/source resource/scope can be appended into the same ScopeSpans, while the old path created separate ResourceSpans/ScopeSpans groups per trace ID before merging. That seems semantically fine, but the PR description should probably say "same per-backend span sets" rather than imply identical output shape.
That's right 👍
I changed the PR description to say "same per-backend span sets" rather than imply identical output shape.
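The grouping difference discussed in this thread can be illustrated with a small stdlib-only sketch. The field names `rsIdx`, `ssIdx`, and `active` are taken from the diff fragment above; every type and the `add` method are simplified assumptions, not the PR's implementation. A new group is opened only when the source resource or scope index changes, so spans from different trace IDs that share a source scope land in the same group.

```go
package main

import "fmt"

// span, scopeSpans and resourceSpans are hypothetical, simplified stand-ins
// for the pdata hierarchy.
type span struct {
	TraceID string
	Name    string
}
type scopeSpans struct{ Spans []span }
type resourceSpans struct{ Scopes []scopeSpans }

// dest accumulates the output for one backend. rsIdx and ssIdx remember which
// source resource/scope the currently open group was created for.
type dest struct {
	out          []resourceSpans
	rsIdx, ssIdx int
	active       bool
}

// add appends s, which came from source resource i and source scope j.
func (d *dest) add(i, j int, s span) {
	if !d.active || d.rsIdx != i {
		// New source resource: open a resource group with one scope group.
		d.out = append(d.out, resourceSpans{Scopes: []scopeSpans{{}}})
	} else if d.ssIdx != j {
		// Same resource, new source scope: open another scope group.
		rs := &d.out[len(d.out)-1]
		rs.Scopes = append(rs.Scopes, scopeSpans{})
	}
	d.rsIdx, d.ssIdx, d.active = i, j, true

	rs := &d.out[len(d.out)-1]
	ss := &rs.Scopes[len(rs.Scopes)-1]
	ss.Spans = append(ss.Spans, s)
}

func main() {
	d := &dest{}
	// Three different trace IDs from the same source resource and scope,
	// all assumed to be routed to this backend.
	d.add(0, 0, span{"t1", "a"})
	d.add(0, 0, span{"t2", "b"})
	d.add(0, 0, span{"t3", "c"})

	fmt.Println("resource groups:", len(d.out))
	fmt.Println("scope groups:", len(d.out[0].Scopes))
	fmt.Println("spans in group:", len(d.out[0].Scopes[0].Spans))
}
```

In this sketch the three trace IDs share one resource group and one scope group, whereas a split-per-trace-ID approach would yield one group per trace ID before merging. The span set sent to the backend is the same either way.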
Description
While discussing an issue in a high-span-load environment, I noticed the trace exporter's `traceID` routing path (the default) is hotter than I think it needs to be. Every `ConsumeTraces` call runs `batchpersignal.SplitTraces`, allocating one `ptrace.Traces` per trace ID and deep-copying every span before merging them back per backend.

So I'm proposing this PR (sorry for not opening an issue first, but I'm open to doing it for discussion), which adds a fast path for `traceID` routing that makes a single pass over the spans, routes each by its trace ID, and accumulates them directly into one `ptrace.Traces` per backend.

Routing decisions and per-backend span sets are identical to the previous path; only the pdata grouping differs (spans are regrouped per backend, so the exact ResourceSpans/ScopeSpans layout is not preserved). Each span is copied once and one `ptrace.Traces` is allocated per backend instead of one per trace.
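As a rough illustration of the single-pass idea, the sketch below routes each span by its trace ID and appends it directly to a per-backend batch. It uses only the standard library: the `span` type, `routeByTraceID`, and `pickBackend` are hypothetical names, and the modulo-based `pickBackend` is a deliberately naive placeholder for whatever backend selection the exporter actually performs.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// span is a hypothetical, simplified stand-in for a pdata span.
type span struct {
	TraceID [16]byte
	Name    string
}

// pickBackend is a placeholder for the real backend lookup. For this sketch
// it only needs to be deterministic per trace ID.
func pickBackend(id [16]byte, numBackends int) int {
	return int(binary.BigEndian.Uint64(id[8:]) % uint64(numBackends))
}

// routeByTraceID makes one pass over spans and appends each one directly to
// its backend's batch, so there is one batch per backend rather than one
// intermediate batch per trace ID.
func routeByTraceID(spans []span, numBackends int) map[int][]span {
	batches := make(map[int][]span, numBackends) // pre-sized by backend count
	for _, s := range spans {
		b := pickBackend(s.TraceID, numBackends)
		batches[b] = append(batches[b], s)
	}
	return batches
}

// traceID builds a test trace ID whose last byte is n.
func traceID(n byte) [16]byte {
	var id [16]byte
	id[15] = n
	return id
}

func main() {
	spans := []span{
		{traceID(1), "a"},
		{traceID(2), "b"},
		{traceID(1), "c"},
		{traceID(4), "d"},
	}
	batches := routeByTraceID(spans, 3)
	for backend := 0; backend < 3; backend++ {
		fmt.Printf("backend %d: %d span(s)\n", backend, len(batches[backend]))
	}
}
```

Spans that share a trace ID always land in the same batch, which is the property trace ID routing has to preserve. The number of batches is bounded by the backend count, not by the number of distinct trace IDs.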
Testing
Added `b.ReportAllocs()` to `BenchmarkConsumeTraces`, and below are the results with benchstat comparison:

goos: darwin
goarch: arm64
pkg: github.com/open-telemetry/opentelemetry-collector-contrib/exporter/loadbalancingexporter
cpu: Apple M3 Pro

| Benchmark | before_lb.txt sec/op | after_lb_2.txt sec/op | vs base |
|---|---|---|---|
| MergeTraces_X100-12 | 3.965n ± 2% | 4.353n ± 5% | +9.79% (p=0.000 n=10) |
| MergeTraces_X500-12 | 3.948n ± 2% | 4.271n ± 2% | +8.20% (p=0.000 n=10) |
| MergeTraces_X1000-12 | 3.946n ± 2% | 4.283n ± 4% | +8.51% (p=0.000 n=10) |
| ConsumeTraces_1E100T-12 | 51.69µ ± 2% | 22.55µ ± 13% | -56.37% (p=0.000 n=10) |
| ConsumeTraces_1E1000T-12 | 499.3µ ± 2% | 206.2µ ± 4% | -58.71% (p=0.000 n=10) |
| ConsumeTraces_5E100T-12 | 56.15µ ± 4% | 24.95µ ± 3% | -55.56% (p=0.000 n=10) |
| ConsumeTraces_5E500T-12 | 261.8µ ± 2% | 109.7µ ± 4% | -58.11% (p=0.000 n=10) |
| ConsumeTraces_5E1000T-12 | 514.1µ ± 2% | 214.0µ ± 9% | -58.37% (p=0.000 n=10) |
| ConsumeTraces_10E100T-12 | 56.92µ ± 2% | 28.32µ ± 10% | -50.24% (p=0.000 n=10) |
| ConsumeTraces_10E500T-12 | 262.3µ ± 1% | 123.3µ ± 9% | -52.98% (p=0.000 n=10) |
| ConsumeTraces_10E1000T-12 | 519.5µ ± 7% | 246.4µ ± 8% | -52.57% (p=0.000 n=10) |
| geomean | 9.939µ | 5.648µ | -43.17% |

| Benchmark | before_lb.txt B/op | after_lb_2.txt B/op | vs base |
|---|---|---|---|
| MergeTraces_X100-12 | 0.000 ± 0% | 0.000 ± 0% | ~ (p=1.000 n=10) ¹ |
| MergeTraces_X500-12 | 0.000 ± 0% | 0.000 ± 0% | ~ (p=1.000 n=10) ¹ |
| MergeTraces_X1000-12 | 0.000 ± 0% | 0.000 ± 0% | ~ (p=1.000 n=10) ¹ |
| ConsumeTraces_1E100T-12 | 87.16Ki ± 0% | 49.35Ki ± 0% | -43.38% (p=0.000 n=10) |
| ConsumeTraces_1E1000T-12 | 856.3Ki ± 0% | 486.2Ki ± 0% | -43.22% (p=0.000 n=10) |
| ConsumeTraces_5E100T-12 | 88.27Ki ± 0% | 50.88Ki ± 0% | -42.36% (p=0.000 n=10) |
| ConsumeTraces_5E500T-12 | 432.3Ki ± 0% | 246.6Ki ± 0% | -42.96% (p=0.000 n=10) |
| ConsumeTraces_5E1000T-12 | 862.2Ki ± 0% | 492.4Ki ± 0% | -42.89% (p=0.000 n=10) |
| ConsumeTraces_10E100T-12 | 88.31Ki ± 0% | 51.91Ki ± 0% | -41.21% (p=0.000 n=10) |
| ConsumeTraces_10E500T-12 | 433.6Ki ± 0% | 248.8Ki ± 0% | -42.63% (p=0.000 n=10) |
| ConsumeTraces_10E1000T-12 | 864.6Ki ± 0% | 495.8Ki ± 0% | -42.65% (p=0.000 n=10) |
| geomean | ² | | -33.27% ² |

| Benchmark | before_lb.txt allocs/op | after_lb_2.txt allocs/op | vs base |
|---|---|---|---|
| MergeTraces_X100-12 | 0.000 ± 0% | 0.000 ± 0% | ~ (p=1.000 n=10) ¹ |
| MergeTraces_X500-12 | 0.000 ± 0% | 0.000 ± 0% | ~ (p=1.000 n=10) ¹ |
| MergeTraces_X1000-12 | 0.000 ± 0% | 0.000 ± 0% | ~ (p=1.000 n=10) ¹ |
| ConsumeTraces_1E100T-12 | 1419.0 ± 0% | 617.0 ± 0% | -56.52% (p=0.000 n=10) |
| ConsumeTraces_1E1000T-12 | 14.025k ± 0% | 6.020k ± 0% | -57.08% (p=0.000 n=10) |
| ConsumeTraces_5E100T-12 | 1451.0 ± 0% | 663.0 ± 0% | -54.31% (p=0.000 n=10) |
| ConsumeTraces_5E500T-12 | 7.061k ± 0% | 3.071k ± 0% | -56.51% (p=0.000 n=10) |
| ConsumeTraces_5E1000T-12 | 14.066k ± 0% | 6.075k ± 0% | -56.81% (p=0.000 n=10) |
| ConsumeTraces_10E100T-12 | 1460.0 ± 0% | 690.0 ± 0% | -52.74% (p=0.000 n=10) |
| ConsumeTraces_10E500T-12 | 7.073k ± 0% | 3.101k ± 0% | -56.16% (p=0.000 n=10) |
| ConsumeTraces_10E1000T-12 | 14.079k ± 0% | 6.106k ± 0% | -56.63% (p=0.000 n=10) |
| geomean | ² | | -44.84% ² |

¹ all samples are equal
² summaries must be >0 to compute geomean

Authorship