Description
Component(s)
exporter/otlp_grpc, exporter/otlp_http, receiver/otlp
Is your feature request related to a problem? Please describe.
This request deals with both OTLP over gRPC and OTLP over HTTP.
Loosely relates to #9823.
This is a request for a standard mechanism that lets an OTel Collector (or a set of Collectors) instruct an OTLP-based Exporter to stop shipping data.
As an example, imagine that you support a service where users from the public internet can ship data to your OTLP endpoint / receiver. A user begins shipping data to the OTLP receiver and later decides to stop using the service (for any number of reasons), but leaves their OTLP exporter running.
The OTLP receiver has a few mechanisms that allow it to filter (processors), reject (400, 401, 403, 404, etc.), and even delay (429 with long delays) the incoming traffic, but there is no way to tell the shipper on the other end to shut down. It is a waste of both sides' resources to continue to send, receive, and process defunct data.
Describe the solution you'd like
My proposal is that OTLP Exporters should either honor 404 as a permanent or fatal error, or perhaps some other status code, like:
| HTTP Status Code | Common Name | Thoughts |
|---|---|---|
| 301 | Moved Permanently | Supposed to have a followable redirect, so not great. |
| 402 | Payment Required | Can be misleading because this should apply for free reasons too. |
| 405 | Method Not Allowed | Indicates a very real and insurmountable problem. |
| 406 | Not Acceptable | Could be theoretically temporary. |
| 410 | Gone | Probably the best one because clients are expected to not try again. |
| 421 | Misdirected Request | Can be retried from a "different connection". |
Frankly, I don't have a strong opinion about what HTTP status code is used (I do think 410 seems to fit well), but I do think it would help all hosted OTel services to have a way to say "please stop sending any data" without trying to interact with users.
I do separately wonder if auth failures (especially 401?) should kill a pipeline.
Therefore, I think that standard OTLP receivers should be able to respond with an appropriate status code that tells the associated OTLP exporter to stop shipping data to them entirely.
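To make the exporter side of the proposal concrete, here is a minimal sketch of how an exporter might classify HTTP responses if 410 were treated as a "stop entirely" signal. The `Action` type, its values, and the `classify` function are hypothetical names for illustration and not part of any existing Collector API. The retryable set (429, 502, 503, 504) reflects my reading of the OTLP/HTTP specification; the choice of 410 for `Halt` is only the suggestion from the table above.

```go
package main

import (
	"fmt"
	"net/http"
)

// Action is a hypothetical classification of what an exporter does
// after receiving an HTTP response from an OTLP receiver.
type Action int

const (
	Continue Action = iota // request accepted; keep exporting
	Retry                  // transient failure; retry this batch with backoff
	Drop                   // permanent failure for this batch only
	Halt                   // proposed: stop exporting to this endpoint entirely
)

// classify maps a status code to an Action. Treating 410 Gone as Halt
// is the behavior proposed in this issue, not current exporter behavior.
func classify(status int) Action {
	switch {
	case status >= 200 && status < 300:
		return Continue
	case status == http.StatusGone:
		return Halt
	case status == http.StatusTooManyRequests,
		status == http.StatusBadGateway,
		status == http.StatusServiceUnavailable,
		status == http.StatusGatewayTimeout:
		return Retry
	default:
		// Everything else is assumed non-retryable in this sketch.
		return Drop
	}
}

func main() {
	for _, s := range []int{200, 429, 400, 410} {
		fmt.Printf("status %d -> action %d\n", s, classify(s))
	}
}
```

Keeping `Halt` distinct from `Drop` is the point of the request: today a permanent error only discards the current batch, and the next batch is sent regardless.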
Describe alternatives you've considered
The only realistic software alternative is to require some secondary API call, which means that neither side can use a standard OTel process. Beyond that, you have to ask users to turn off agents manually.
Additional context
There are some drawbacks:
- Services that run the OTLP exporter could be configured to simply restart, but this is not really any different from retrying with the next batch
- Existing services won't honor the response (but as long as it is a non-retryable 4xx, it at least won't be retried)
Every receiver that handles status codes should offer this functionality. This is not so much an OTLP feature as something that is easy to implement in OTLP because it has standardized support for status codes.