@devjoonn devjoonn commented Dec 5, 2025

The OTLP HTTP Trace Exporter had an issue where it always returned .success regardless of the actual HTTP request result. This prevented callers from detecting failures even when network errors or 4xx/5xx responses occurred.

To fix this, the exporter now waits for the HTTP request to finish and returns the correct result based on the actual response. Additionally, timeout handling has been added to prevent indefinite blocking, ensuring compliance with the OpenTelemetry spec requirement that "Export() MUST NOT block indefinitely, there MUST be a reasonable upper limit after which the call must time out with an error result (Failure)".

The timeout implementation follows the same pattern as OtlpHttpLogExporter for consistency, using min(explicitTimeout ?? infinity, config.timeout) for both request.timeoutInterval and semaphore.wait(). The default timeout is 10 seconds, with optional override via the explicitTimeout parameter.
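As a rough illustration of the timeout rule described above, here is a minimal sketch. `effectiveTimeout` is a hypothetical helper name used only for this example, not the exporter's actual API; the 10-second default is the value stated in this description.

```swift
import Foundation

// Hypothetical helper (illustrative name, not part of the exporter's API).
// Implements min(explicitTimeout ?? infinity, config.timeout).
func effectiveTimeout(explicitTimeout: TimeInterval?, configTimeout: TimeInterval) -> TimeInterval {
    // A missing explicit timeout is treated as "no extra limit", so the config value wins.
    return min(explicitTimeout ?? TimeInterval.infinity, configTimeout)
}

let defaultTimeout: TimeInterval = 10  // default stated in the PR description

// No explicit timeout: the configured default applies.
let noOverride = effectiveTimeout(explicitTimeout: nil, configTimeout: defaultTimeout)
// Shorter explicit timeout: the explicit value applies.
let shorter = effectiveTimeout(explicitTimeout: 3, configTimeout: defaultTimeout)
// Longer explicit timeout: still capped by the configured value.
let longer = effectiveTimeout(explicitTimeout: 30, configTimeout: defaultTimeout)

print(noOverride, shorter, longer)
```

Because the rule is a `min`, an explicit timeout can shorten the wait but never extend it past the configured value. The same computed value would then feed both `request.timeoutInterval` and the semaphore wait.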

With this change, the exporter properly returns .failure for network issues or non-2xx responses, and failed spans are safely re-added for retry.
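To make the intended behavior concrete, here is a minimal sketch of the result mapping and the re-add step. `ExportResult`, `exportResult(statusCode:error:)`, and the string-based `pending` buffer are stand-ins invented for this example; the real exporter's types and queueing order may differ.

```swift
import Foundation

// Stand-in for the exporter's result code (assumption for this sketch).
enum ExportResult { case success, failure }

// Maps the outcome of an HTTP request to an export result:
// any transport error or non-2xx status counts as a failure.
func exportResult(statusCode: Int?, error: Error?) -> ExportResult {
    if error != nil { return .failure }
    guard let code = statusCode, (200..<300).contains(code) else { return .failure }
    return .success
}

// Hypothetical pending buffer; strings stand in for span data.
var pending = ["span-a", "span-b"]

// Take the current batch out of the buffer before sending.
let batch = pending
pending.removeAll()

// Simulate a 503 response from the collector.
let result = exportResult(statusCode: 503, error: nil)
if result == .failure {
    // Re-add the failed batch so a later export can retry it.
    // Appending (rather than prepending) is an assumption of this sketch.
    pending.append(contentsOf: batch)
}

print(result, pending)
```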

linux-foundation-easycla bot commented Dec 5, 2025

CLA Signed

The committers listed above are authorized under a signed CLA.

Author

devjoonn commented Dec 15, 2025

@bryce-b @nachoBonafonte @vvydier @ArielDemarco
CLA has been signed.
Thanks again for taking a look when you have time. 🙂

Comment on lines 73 to 82
    var request = createRequest(body: body, endpoint: endpoint)
    if let headers = envVarHeaders {
        headers.forEach { key, value in
            request.addValue(value, forHTTPHeaderField: key)
        }
    } else if let headers = config.headers {
        headers.forEach { key, value in
            request.addValue(value, forHTTPHeaderField: key)
        }
    }
Member

Why was this included?

Author

I’ve confirmed that the headers are already being added during the createRequest stage.
I’ll make a commit with the fix. Thank you for the review!

Member

@ArielDemarco ArielDemarco left a comment

Hey @devjoonn, thanks for the contribution!

> The OTLP HTTP Trace Exporter had an issue where it always returned .success regardless of the actual HTTP request result. This prevented callers from detecting failures even when network errors or 4xx/5xx responses occurred.

I don’t think it's necessarily an error to not propagate the HTTP request result synchronously via the return value.
According to the official spec, what export is required to signal is that the export task (whether synchronous or asynchronous) has completed, not necessarily the outcome of the underlying network request. Quoting the spec directly:

> Depending on the implementation the result of the export may be returned to the Processor not in the return value of the call to Export() but in a language specific way for signaling completion of an asynchronous task

Additionally, whether an export ultimately succeeds or fails (and how retries, exponential backoff, or concurrent requests are handled) is explicitly the responsibility of the exporter, not the consumer (i.e., the processor), as described in the same section of the spec.

All that being said, I'm not opposed to this change. In fact, waiting for the async task to complete can be useful to better reflect task completion semantics. However, per the spec, it's important to ensure that this does not block indefinitely. In particular, I think there should be a timeout when waiting on the semaphore to avoid stale exporters. Quoting again the spec:

> Export() MUST NOT block indefinitely, there MUST be a reasonable upper limit after which the call must time out with an error result (Failure)
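To illustrate the kind of bounded wait being suggested, here is a minimal sketch using `DispatchSemaphore.wait(timeout:)`. `ExportResult`, `ResultBox`, and `waitForResult` are hypothetical names made up for this example, and the delayed dispatch calls stand in for the real HTTP request.

```swift
import Foundation

// Stand-in for the exporter's result code (assumption for this sketch).
enum ExportResult { case success, failure }

// Small lock-protected box so the completion handler can publish its result safely.
final class ResultBox: @unchecked Sendable {
    private let lock = NSLock()
    private var value: ExportResult = .failure
    func set(_ newValue: ExportResult) { lock.lock(); value = newValue; lock.unlock() }
    func get() -> ExportResult { lock.lock(); defer { lock.unlock() }; return value }
}

// Starts asynchronous work and blocks for at most `timeout` seconds.
// Returns .failure if the work does not complete in time.
func waitForResult(timeout: TimeInterval,
                   work: (@escaping @Sendable (ExportResult) -> Void) -> Void) -> ExportResult {
    let semaphore = DispatchSemaphore(value: 0)
    let box = ResultBox()

    work { outcome in
        box.set(outcome)
        semaphore.signal()
    }

    // Bounded wait: a timed-out wait is reported as a failure instead of blocking forever.
    if semaphore.wait(timeout: .now() + timeout) == .timedOut {
        return .failure
    }
    return box.get()
}

// Work that finishes well inside the limit.
let fast = waitForResult(timeout: 1.0) { done in
    DispatchQueue.global().asyncAfter(deadline: .now() + 0.05) { done(.success) }
}

// Work that takes longer than the limit.
let slow = waitForResult(timeout: 0.1) { done in
    DispatchQueue.global().asyncAfter(deadline: .now() + 1.0) { done(.success) }
}

print(fast, slow)
```

Note that timing out the wait does not cancel the underlying work in this sketch. In the exporter, the request's own `timeoutInterval` would be what bounds the network call itself.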

Finally, I'm seeing several test failures, which I assume are related to the behavioral change introduced here.
