Skip to content

Initial implementation of OTLP HTTP Exporter#2070

Merged
albertlockett merged 26 commits intoopen-telemetry:mainfrom
albertlockett:albert/1145
Feb 20, 2026
Merged

Initial implementation of OTLP HTTP Exporter#2070
albertlockett merged 26 commits intoopen-telemetry:mainfrom
albertlockett:albert/1145

Conversation

@albertlockett
Copy link
Copy Markdown
Member

Change Summary

This PR adds a new exporter for exporting telemetry via OTLP over HTTP.

Many of the same patterns from the OTLP gRPC exporter are followed, including the optimizations that were added in #1474.

This PR adds a new otlp_http::client_settings to the otap crate, which contains HttpClientSettings, a reusable configuration for an HTTP client, which can produce a configured reqwest::ClientBuilder. This is analogous to the otlp_grpc::client_settings::GrpcClientSettings.

The exporter uses the otlp_exporter:: InFlightExports type for managing a limited number of concurrent in flight requests and, and handling request completions.

The exporter uses a pool of request::Clients, implemented as the HttpClientPool type. The intention is to force requests to be distributed over multiple connections to the OTLP server. In our cast, there may be many servers listening on the same port using SOREUSEPORT (for example, one per instance of a pipeline, each on different cores). The intention is to hopefully have better load balancing. When using HTTP version 1, this is likely unnecessary as with a high enough concurrent request volume, multiple connections will be opened anyway by the internal reqwest Client connection pool. However, this will likely be needed when we add TLS because HTTP version 2 may be negotiated, in which case multiple requests can be multiplexed over the same connection, leading to the same issue we had in the OTLP gRPC exporter (which also uses a client pool).

There are a variety of additional tasks that I will handle in followup PRs:

  • TLS/mTLS support
  • Proxy support
  • Compression - compressed request bodies / accepting compressed responses
  • Allow endpoint overrides for each signal type (similar to Go collector implementation)
  • Unit test metrics produced by component

What issue does this PR close?

How are these changes tested?

A suite of unit tests covering successful requests both OTLP & OTAP pdatas, plus tests covering various errors including connection refused, non 200 response, OTLP service responses w/ partial success and malformed OTAP batches.

Are there any user-facing changes?

Users now have a new component available for their telemetry pipelines.

@github-actions github-actions Bot added the rust Pull requests that update Rust code label Feb 19, 2026
@codecov
Copy link
Copy Markdown

codecov Bot commented Feb 19, 2026

Codecov Report

❌ Patch coverage is 93.95307% with 67 lines in your changes missing coverage. Please review.
✅ Project coverage is 87.04%. Comparing base (3d35559) to head (d502fcb).
⚠️ Report is 4 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2070      +/-   ##
==========================================
+ Coverage   86.97%   87.04%   +0.06%     
==========================================
  Files         539      542       +3     
  Lines      173124   174232    +1108     
==========================================
+ Hits       150571   151653    +1082     
- Misses      22019    22045      +26     
  Partials      534      534              
Components Coverage Δ
otap-dataflow 89.18% <93.95%> (+0.07%) ⬆️
query_abstraction 80.61% <ø> (ø)
query_engine 90.33% <ø> (ø)
syslog_cef_receivers ∅ <ø> (∅)
otel-arrow-go 53.50% <ø> (ø)
quiver 91.73% <ø> (ø)
🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@albertlockett
Copy link
Copy Markdown
Member Author

Keeping this as Draft for now, since it seems to be breaking the pipeline perf tests

Copy link
Copy Markdown
Contributor

@lquerel lquerel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I’ve done a first pass on this draft. No big issue, just few suggestions here and there.

Comment thread rust/otap-dataflow/crates/otap/src/otlp_http_exporter/mod.rs Outdated
Comment on lines +426 to +431
Ok(service_resp) => service_resp.partial_success.map(|partial_success| {
format!(
"{} ({} rejected)",
partial_success.error_message, partial_success.rejected
)
}),
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This might be the right approach but I'm not sure.
I'm wondering if a partial success should alway result into a Nack (as in this impl). Do you know what the Go collector do in this case?

Copy link
Copy Markdown
Member

@lalitb lalitb Feb 20, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As per the specs -

If the request is only partially accepted (i.e. when the server accepts only parts of the data and rejects the rest), the server response MUST be the same ExportServiceResponse message as in the Full Success case.

Sending a Nack here will make the upstream retry the full batch.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops yes you're right - we shouldn't send the Nack here because we don't want to retry. the spec also says:
https://opentelemetry.io/docs/specs/otlp/#partial-success-1

The client MUST NOT retry the request when it receives a partial success response where the partial_success

Looks like there are also semantics for when to retry:

The server SHOULD use HTTP response status codes to indicate retryable and not-retryable errors for a particular erroneous situation. The client SHOULD honour HTTP response status codes as retryable or not-retryable.

Retryable Response Codes
The requests that receive a response status code listed in following table SHOULD be retried. All other 4xx or 5xx response status codes MUST NOT be retried.

HTTP response status code
429 Too Many Requests
502 Bad Gateway
503 Service Unavailable
504 Gateway Timeout

I'll fix this

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@lalitb & @lquerel I've updated the way we handle errors/produce Nacks in 3bb9a33

The behaviour is we receive a 200 response code w/ a partial request we emit a Nack, unless rejected signals == 0. This Nack will have permanent = true, meaning the payload should not be retried.

When partial success rejects 0 signals, we treat this as a success. According to the spec, the server send a partial_success even if it fully accepted the request:

servers MAY also use the partial_success field to convey warnings/suggestions to clients even when it fully accepts the request. In such cases, the rejected_ field MUST have a value of 0, and the error_message field MUST be non-empty.

In the case where we receive a 200 response, but couldn't decode the body, we also send a Nack with permanent=true. The assumption here is that either there was a success, or a partial success, but either way we're not supposed to retry the request.

When we receive a non-200 response, we now follow the suggestions in the spec vis-à-vis retries: we send a Nack, and IF the response status wasn't 429, 502, 503 or 504, we set permanent=true meaning don't retry it.

For non-http errors, the spec only gives some limited clarity about what to do:
https://opentelemetry.io/docs/specs/otlp/#all-other-responses

All other HTTP responses that are not explicitly listed in this document should be treated according to HTTP specifications.

If the server disconnects without returning a response, the client SHOULD retry and send the same request. The client SHOULD implement an exponential backoff strategy between retries to avoid overwhelming the server.

The behaviour I have for these is to retry (e.g. send Nack w/ permanent=false) if the error was a connection error or a TCP timeout. My thinking was that it's reasonable to retry in these cases which may be intermittent networking errors.

Comment thread rust/otap-dataflow/crates/otap/src/otlp_http_exporter/mod.rs Outdated
Comment thread rust/otap-dataflow/crates/otap/src/otlp_http_exporter/mod.rs Outdated
Comment thread rust/otap-dataflow/crates/otap/src/otlp_http_exporter/mod.rs Outdated
self.pdata_metrics.add_failed(signal_type, 1);
continue;
} else {
Bytes::copy_from_slice(proto_buffer.as_ref())
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The same apply here I think

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we can do something similar to what is suggest here.

Internally this proto_buffer has a Vec, not Bytes.

#[derive(Debug, Default)]
pub struct ProtoBuffer {
buffer: Vec<u8>,
}

We also clear and reuse this instance for each OTAP batch that we encode to proto.

Comment thread rust/otap-dataflow/crates/otap/Cargo.toml
&effect_handler,
&mut self.pdata_metrics,
)
.await;
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The drain loop ignores the deadline already in scope. With http.timeout = None (default), a slow/hung server will block shutdown indefinitely past the deadline Worth racing next_completion() against sleep_until(deadline) here?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I created #2099 to track this. OTLP gRPC exporter has the same behaviour

Comment thread rust/otap-dataflow/crates/otap/src/otlp_http_exporter/mod.rs
Comment thread rust/otap-dataflow/crates/otap/src/otlp_http_exporter/config.rs
Comment thread rust/otap-dataflow/crates/pdata/src/otlp/mod.rs Outdated
@lalitb
Copy link
Copy Markdown
Member

lalitb commented Feb 20, 2026

Nice work. High level looks good! Left a few inline comments. I believe worth addressing before merge would be the partial success handling for OTEL specs compliance.

Copy link
Copy Markdown
Contributor

@lquerel lquerel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@albertlockett albertlockett added this pull request to the merge queue Feb 20, 2026
@github-merge-queue github-merge-queue Bot removed this pull request from the merge queue due to failed status checks Feb 20, 2026
Comment thread rust/otap-dataflow/crates/otap/src/otlp_http_exporter/mod.rs Outdated
Comment thread rust/otap-dataflow/crates/otap/src/otlp_http/client_settings.rs Outdated
Comment thread rust/otap-dataflow/crates/otap/src/otlp_http_exporter/mod.rs Outdated
Copy link
Copy Markdown
Contributor

@utpilla utpilla left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left some suggestions.

albertlockett and others added 2 commits February 20, 2026 16:59
Co-authored-by: Utkarsh Umesan Pillai <66651184+utpilla@users.noreply.github.com>
// ensure exit success
result.unwrap();

// validate we received three Nacks
Copy link
Copy Markdown
Contributor

@utpilla utpilla Feb 20, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment can be removed/updated for these tests as well:

  • test_handles_connection_refused_errors
  • test_handles_response_body_too_large
  • test_handles_invalid_otap_payloads
  • test_nacks_for_otap_payloads_when_context_indicates_no_payload_return
  • test_nacks_for_otlp_payloads_when_context_indicates_no_payload_return

@albertlockett albertlockett added this pull request to the merge queue Feb 20, 2026
Merged via the queue into open-telemetry:main with commit 332dc5e Feb 20, 2026
60 checks passed
@albertlockett albertlockett deleted the albert/1145 branch February 20, 2026 23:02
github-merge-queue Bot pushed a commit that referenced this pull request Feb 22, 2026
# Change Summary

<!--
Replace with a brief summary of the change in this PR
-->

small followup from
#2070.

Adds new config options for each signal type to override the endpoint to
which the OTLP HTTP exporter sends data.

This is to aid with parity between this implementation and the analogous
Go collector component, which also has these options:
https://github.com/open-telemetry/opentelemetry-collector/tree/main/exporter/otlphttpexporter#otlp-http-exporter

## What issue does this PR close?

<!--
We highly recommend correlation of every PR to an issue
-->

* Part of #1145 

## How are these changes tested?

A new unit test is added.

## Are there any user-facing changes?

 <!-- If yes, provide further info below -->
 
 Users can configure the component with these new options.
cijothomas pushed a commit to cijothomas/otel-arrow that referenced this pull request Feb 23, 2026
…etry#2082)

# Change Summary

<!--
Replace with a brief summary of the change in this PR
-->

Collects the telemetry returned by the component in the exporter
`TestRuntime`.

A new function is added to manually invoke the telemetry loop one time
on `InternalCollector`. This is called if the exporter returned a
terminal state with some metrics snapshots.

## What issue does this PR close?

<!--
We highly recommend correlation of every PR to an issue
-->

* Closes open-telemetry#2081

## How are these changes tested?

I tested these manually in a unit test I had on a feature branch:

open-telemetry@2722031

More test coverage will be added later on: I plan to use this in a
followup to PR open-telemetry#2070 to
add a comprehensive suite of unit tests to the metrics reported by the
OTLP HTTP exporter

## Are there any user-facing changes?

 <!-- If yes, provide further info below -->
 
 No

---------

Co-authored-by: utpilla <utpilla@users.noreply.github.com>
cijothomas pushed a commit to cijothomas/otel-arrow that referenced this pull request Feb 23, 2026
# Change Summary

<!--
Replace with a brief summary of the change in this PR
-->

This PR adds a new exporter for exporting telemetry via OTLP over HTTP.

Many of the same patterns from the OTLP gRPC exporter are followed,
including the optimizations that were added in
open-telemetry#1474.

This PR adds a new `otlp_http::client_settings` to the otap crate, which
contains `HttpClientSettings`, a reusable configuration for an HTTP
client, which can produce a configured `reqwest::ClientBuilder`. This is
analogous to the `otlp_grpc::client_settings::GrpcClientSettings`.

The exporter uses the `otlp_exporter:: InFlightExports` type for
managing a limited number of concurrent in flight requests and, and
handling request completions.

The exporter uses a pool of `request::Clients`, implemented as the
`HttpClientPool` type. The intention is to force requests to be
distributed over multiple connections to the OTLP server. In our cast,
there may be many servers listening on the same port using SOREUSEPORT
(for example, one per instance of a pipeline, each on different cores).
The intention is to hopefully have better load balancing. When using
HTTP version 1, this is likely unnecessary as with a high enough
concurrent request volume, multiple connections will be opened anyway by
the internal reqwest Client connection pool. However, this will likely
be needed when we add TLS because HTTP version 2 may be negotiated, in
which case multiple requests can be multiplexed over the same
connection, leading to the same issue we had in the OTLP gRPC exporter
(which also uses a client pool).

There are a variety of additional tasks that I will handle in followup
PRs:
- [ ] TLS/mTLS support
- [ ] Proxy support
- [ ] Compression - compressed request bodies / accepting compressed
responses
- [ ] Allow endpoint overrides for each signal type (similar to Go
collector implementation)
- [ ] Unit test metrics produced by component

## What issue does this PR close?

<!--
We highly recommend correlation of every PR to an issue
-->

* Relates to open-telemetry#1145 

## How are these changes tested?

A suite of unit tests covering successful requests both OTLP & OTAP
pdatas, plus tests covering various errors including connection refused,
non 200 response, OTLP service responses w/ partial success and
malformed OTAP batches.

## Are there any user-facing changes?

Users now have a new component available for their telemetry pipelines.

---------

Co-authored-by: Lalit Kumar Bhasin <lalit_fin@yahoo.com>
Co-authored-by: Utkarsh Umesan Pillai <66651184+utpilla@users.noreply.github.com>
cijothomas pushed a commit to cijothomas/otel-arrow that referenced this pull request Feb 23, 2026
…emetry#2089)

# Change Summary

<!--
Replace with a brief summary of the change in this PR
-->

small followup from
open-telemetry#2070.

Adds new config options for each signal type to override the endpoint to
which the OTLP HTTP exporter sends data.

This is to aid with parity between this implementation and the analogous
Go collector component, which also has these options:
https://github.com/open-telemetry/opentelemetry-collector/tree/main/exporter/otlphttpexporter#otlp-http-exporter

## What issue does this PR close?

<!--
We highly recommend correlation of every PR to an issue
-->

* Part of open-telemetry#1145 

## How are these changes tested?

A new unit test is added.

## Are there any user-facing changes?

 <!-- If yes, provide further info below -->
 
 Users can configure the component with these new options.
github-merge-queue Bot pushed a commit that referenced this pull request Mar 9, 2026
## Summary

Renames the gRPC-based OTLP exporter module and URN to distinguish it
from the newly-added HTTP-based exporter (#2070).

**URN change:** `urn:otel:otlp:exporter` → `urn:otel:otlp_grpc:exporter`
**Module rename:** `otlp_exporter.rs` → `otlp_grpc_exporter.rs`

Fixes #2107

## Changes

### Rust Source (3 files)
- **Renamed** `otlp_exporter.rs` → `otlp_grpc_exporter.rs` and updated
URN constant value
- **Updated** `lib.rs` module declaration: `pub mod otlp_exporter` →
`pub mod otlp_grpc_exporter`
- **Updated** `urn.rs` test case URN reference

### Config Files (8 files)
All in `rust/otap-dataflow/configs/` — replaced `plugin_urn` from
`urn:otel:otlp:exporter` → `urn:otel:otlp_grpc:exporter`

### Perf Test Templates (9 files)
All in
`tools/pipeline_perf_test/test_suites/integration/templates/configs/` —
same URN replacement

### Documentation (3 files)
- `crates/quiver/ARCHITECTURE.md` — updated node names + URN in config
examples
- `docs/self_tracing_architecture.md` — updated node names in config
example
- `docs/telemetry/metrics-guide.md` — updated metric set name

## What Was NOT Changed (by design)
- **Test function names** (e.g. `otlp_exporter_connects_with_mtls`) —
describe behavior, not module path
- **Test file names** (`otlp_exporter_tls.rs`,
`otlp_exporter_proxy_tls.rs`) — no module imports depend on them
- **Telemetry crate** (`otlp_exporter_provider`,
`configure_grpc_otlp_exporter`) — separate OTel SDK, not the pipeline
exporter
- **Constant/struct names** (`OTLP_EXPORTER_URN`, `OTLPExporter`) — kept
per issue scope

## Verification
- ✅ `cargo build --workspace` — passed
- ✅ `cargo test --workspace` — all tests passed, zero failures
- ✅ `grep -r "urn:otel:otlp:exporter"` — zero matches remain

Co-authored-by: Drew Relmas <drewrelmas@gmail.com>
Co-authored-by: albertlockett <a.lockett@f5.com>
cijothomas pushed a commit to cijothomas/otel-arrow that referenced this pull request Mar 11, 2026
## Summary

Renames the gRPC-based OTLP exporter module and URN to distinguish it
from the newly-added HTTP-based exporter (open-telemetry#2070).

**URN change:** `urn:otel:otlp:exporter` → `urn:otel:otlp_grpc:exporter`
**Module rename:** `otlp_exporter.rs` → `otlp_grpc_exporter.rs`

Fixes open-telemetry#2107

## Changes

### Rust Source (3 files)
- **Renamed** `otlp_exporter.rs` → `otlp_grpc_exporter.rs` and updated
URN constant value
- **Updated** `lib.rs` module declaration: `pub mod otlp_exporter` →
`pub mod otlp_grpc_exporter`
- **Updated** `urn.rs` test case URN reference

### Config Files (8 files)
All in `rust/otap-dataflow/configs/` — replaced `plugin_urn` from
`urn:otel:otlp:exporter` → `urn:otel:otlp_grpc:exporter`

### Perf Test Templates (9 files)
All in
`tools/pipeline_perf_test/test_suites/integration/templates/configs/` —
same URN replacement

### Documentation (3 files)
- `crates/quiver/ARCHITECTURE.md` — updated node names + URN in config
examples
- `docs/self_tracing_architecture.md` — updated node names in config
example
- `docs/telemetry/metrics-guide.md` — updated metric set name

## What Was NOT Changed (by design)
- **Test function names** (e.g. `otlp_exporter_connects_with_mtls`) —
describe behavior, not module path
- **Test file names** (`otlp_exporter_tls.rs`,
`otlp_exporter_proxy_tls.rs`) — no module imports depend on them
- **Telemetry crate** (`otlp_exporter_provider`,
`configure_grpc_otlp_exporter`) — separate OTel SDK, not the pipeline
exporter
- **Constant/struct names** (`OTLP_EXPORTER_URN`, `OTLPExporter`) — kept
per issue scope

## Verification
- ✅ `cargo build --workspace` — passed
- ✅ `cargo test --workspace` — all tests passed, zero failures
- ✅ `grep -r "urn:otel:otlp:exporter"` — zero matches remain

Co-authored-by: Drew Relmas <drewrelmas@gmail.com>
Co-authored-by: albertlockett <a.lockett@f5.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

rust Pull requests that update Rust code

Projects

Status: Done

Development

Successfully merging this pull request may close these issues.

4 participants