
Conversation

@Arunodoy18

Description

This PR adds connection pooling support to the OTLP exporter to resolve performance issues in high-throughput and high-latency environments.

Motivation

As reported in the issue, users experience unreliability in the OTLP exporter with:

  • High throughput scenarios (10K+ spans/sec)
  • High-latency network connections (e.g., cross-region deployments)
  • AWS ALB limiting HTTP/2 streams to 128

The single gRPC connection becomes a bottleneck, causing queue overflow and dropped spans.

Changes:

Core Implementation:

  • Added connection_pool_size configuration parameter to the Config struct

    • Default: 0 (uses 1 connection for backward compatibility)
    • Range: 0-256 connections
    • Validated in Config.Validate()
  • Implemented a connection pool in baseExporter

    • Maintains multiple gRPC connections in a slice
    • Round-robin load balancing using an atomic counter
    • All data types (traces, metrics, logs, profiles) use the connection pool
  • Thread-safe round-robin distribution (see the sketch after this list)

    • getNextExporterIndex() method uses atomic.Uint32
    • Optimized for the single-connection case (no atomic ops)
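
A minimal Go sketch of the round-robin selection described above. The type and field names are illustrative assumptions, not the actual exporter code; it only shows an atomic round-robin index with the single-connection fast path.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// connectionPool stands in for the exporter's pool of gRPC connections.
// The conns slice uses strings here purely for illustration.
type connectionPool struct {
	conns []string
	next  atomic.Uint32
}

// getNextIndex returns the index of the connection to use for the next export.
func (p *connectionPool) getNextIndex() int {
	if len(p.conns) == 1 {
		// Single-connection case: skip the atomic operation entirely.
		return 0
	}
	// Round-robin: atomically increment the counter and wrap around the pool size.
	return int(p.next.Add(1)-1) % len(p.conns)
}

func main() {
	pool := &connectionPool{conns: []string{"conn-0", "conn-1", "conn-2"}}
	for i := 0; i < 5; i++ {
		fmt.Println(pool.conns[pool.getNextIndex()]) // conn-0, conn-1, conn-2, conn-0, conn-1
	}
}
```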

Documentation

  • Updated README.md with configuration details and examples
  • Added changelog entry in .chloggen/
  • Included high-throughput configuration example

Bug Fix

  • Also fixes an unrelated service.go issue: process metrics are no longer registered when service.telemetry.metrics.level is set to none (see the commit messages below)

Testing

  • ✅ All existing tests pass
  • ✅ No compilation errors
  • ✅ Configuration validation works correctly
  • ✅ Backward compatible

Usage Example

```yaml
exporters:
  otlp/high-throughput:
    endpoint: otel-gateway:443
    connection_pool_size: 5 # Creates 5 gRPC connections
    compression: snappy
    timeout: 20s
    sending_queue:
      num_consumers: 100
      queue_size: 2000
```

When service.telemetry.metrics.level is set to 'none', the collector
should skip registering process metrics to avoid errors on platforms
where gopsutil is not supported (such as AIX).

This change conditionally registers process metrics only when the
metrics level is not LevelNone, preventing the 'failed to register
process metrics: not implemented yet' error on unsupported platforms.

Fixes regression introduced in v0.136.0 where the check for metrics
level was removed.
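
As a rough illustration of the guard described in this commit message, here is a minimal Go sketch. The wrapper function and the way the registration callback is passed are assumptions, not the actual service.go code; only the LevelNone check and the error message come from the text above.

```go
package main

import (
	"fmt"

	"go.opentelemetry.io/collector/config/configtelemetry"
)

// registerProcessMetricsIfEnabled is a hypothetical wrapper: process metrics
// are registered only when the configured metrics level is not LevelNone.
func registerProcessMetricsIfEnabled(level configtelemetry.Level, register func() error) error {
	if level == configtelemetry.LevelNone {
		// Skip registration entirely, e.g. on platforms where gopsutil is not supported (AIX).
		return nil
	}
	if err := register(); err != nil {
		return fmt.Errorf("failed to register process metrics: %w", err)
	}
	return nil
}

func main() {
	// With LevelNone, the (possibly unsupported) registration function is never called.
	err := registerProcessMetricsIfEnabled(configtelemetry.LevelNone, func() error {
		return fmt.Errorf("not implemented yet")
	})
	fmt.Println(err) // <nil>
}
```
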
Similar to the resolution for pcommon.Value in previous changes, this update
ensures consistent documentation across all pdata types by clarifying that
calling functions on zero-initialized instances is invalid usage.

Changes:
- Updated template files (one_of_field.go, one_of_message_value.go) to generate
  improved comment wording
- Updated pcommon/value.go comments manually
- Updated all generated pdata files to use consistent wording:
  'is invalid and will cause a panic' instead of 'will cause a panic'

This makes it clearer that using zero-initialized instances is not just
dangerous but explicitly invalid usage, improving API documentation clarity.
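
For concreteness, a hypothetical pdata-style accessor carrying the new comment wording might look like this; the type and method are made up for illustration, and only the wording mirrors the change described above.

```go
// Package pdataexample is illustrative only; the type and method below are
// hypothetical and exist solely to show the revised comment wording.
package pdataexample

// Value is a stand-in for a pdata wrapper type backed by an internal pointer.
type Value struct{ orig *int64 }

// SetInt replaces the int64 value.
//
// Calling this function on a zero-initialized Value is invalid and will cause a panic.
func (v Value) SetInt(i int64) {
	*v.orig = i // panics when orig is nil, i.e. on a zero-initialized Value
}
```
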
…onfig file endpoints

Fixes open-telemetry#14286

When both OTEL_EXPORTER_OTLP_TRACES_ENDPOINT environment variable and
a configured endpoint in the config file are present, the URL scheme
from the environment variable was incorrectly overriding the scheme
from the config file, resulting in mixed endpoints (e.g., http scheme
from env var + path from config file).

This fix ensures that environment variables do not override explicitly
configured endpoints by temporarily unsetting the OTEL_EXPORTER_OTLP_*_ENDPOINT
environment variables before creating the SDK, then restoring them afterward.

According to the OpenTelemetry specification, explicit configuration
should take precedence over environment variables.

Changes:
- Modified sdk.go to temporarily unset OTEL_EXPORTER_OTLP_*_ENDPOINT
  environment variables before calling config.NewSDK()
- Added helper functions unsetOTLPEndpointEnvVars() and restoreEnvVars() (sketched below)
- Added comprehensive tests to verify env vars don't override config
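
A minimal sketch of the temporary unset-and-restore approach, under the assumption that the real helpers in sdk.go behave roughly like this (the helper names come from the list above; the bodies are assumed):

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// unsetOTLPEndpointEnvVars removes every OTEL_EXPORTER_OTLP_*ENDPOINT variable
// from the environment and returns the saved values for later restoration.
func unsetOTLPEndpointEnvVars() map[string]string {
	saved := map[string]string{}
	for _, kv := range os.Environ() {
		k, v, _ := strings.Cut(kv, "=")
		if strings.HasPrefix(k, "OTEL_EXPORTER_OTLP_") && strings.HasSuffix(k, "ENDPOINT") {
			saved[k] = v
			os.Unsetenv(k)
		}
	}
	return saved
}

// restoreEnvVars puts the previously saved variables back into the environment.
func restoreEnvVars(saved map[string]string) {
	for k, v := range saved {
		os.Setenv(k, v)
	}
}

func main() {
	os.Setenv("OTEL_EXPORTER_OTLP_TRACES_ENDPOINT", "http://env-endpoint:4318/v1/traces")
	saved := unsetOTLPEndpointEnvVars()
	// ... create the SDK here, so config-file endpoints take precedence ...
	restoreEnvVars(saved)
	fmt.Println(os.Getenv("OTEL_EXPORTER_OTLP_TRACES_ENDPOINT")) // restored afterwards
}
```
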
…cenarios

This enhancement adds a connection_pool_size configuration option to the OTLP
exporter, enabling multiple gRPC connections with round-robin load balancing.

Key changes:
- Add connection_pool_size config parameter (default: 0, uses 1 connection; see the validation sketch below)
- Implement round-robin load balancing across multiple connections
- Support for 1-256 concurrent gRPC connections
- Backward compatible: default behavior unchanged

This resolves performance issues in high-throughput environments (10K+ spans/sec)
and high-latency network scenarios where a single gRPC connection becomes a
bottleneck.
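
A pared-down sketch of how the range validation and the zero-means-one-connection default could look. The field and method names are assumptions; only the rule (0-256, with 0 defaulting to a single connection) comes from the text above.

```go
package main

import (
	"errors"
	"fmt"
)

// Config is a minimal stand-in for the exporter configuration, showing only
// the new connection_pool_size field and its validation rule.
type Config struct {
	ConnectionPoolSize int `mapstructure:"connection_pool_size"`
}

// Validate rejects values outside the documented 0-256 range.
func (c *Config) Validate() error {
	if c.ConnectionPoolSize < 0 || c.ConnectionPoolSize > 256 {
		return errors.New("connection_pool_size must be between 0 and 256")
	}
	return nil
}

// poolSize maps the configured value to the number of connections to open:
// 0 keeps the backward-compatible single connection.
func (c *Config) poolSize() int {
	if c.ConnectionPoolSize == 0 {
		return 1
	}
	return c.ConnectionPoolSize
}

func main() {
	cfg := &Config{ConnectionPoolSize: 5}
	fmt.Println(cfg.Validate(), cfg.poolSize()) // <nil> 5
}
```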

Also fixes unrelated service.go issue per contributor feedback on PR open-telemetry#14342.
@Arunodoy18 requested review from a team, bogdandrutu, and dmitryax as code owners on January 6, 2026, 07:23
@Arunodoy18
Author

I hope this works well, as was addressed in the issue. If any unrelated changes or anything wrong turns up, please do tell after the review.
Thank you

Member

@bogdandrutu left a comment

Can you show me some data that demonstrates this is needed? gRPC says that you don't need to do this and that it will automatically use multiple sockets, etc.

@tank-500m

It doesn’t seem general enough to justify inclusion in the core component.
Also, there appear to be viable alternatives (e.g., the loadbalancing exporter in opentelemetry-collector-contrib).
