receiver: Add partial success support for receivers #14445
Signed-off-by: Sanchit2662 <sanchit2662@gmail.com>
Hi @jade-guiton-dd,
Hey @Sanchit2662, thanks for working on this. I think we still need to get some feedback on the design before moving forward with the implementation. Can you leave your
jade-guiton-dd
left a comment
Blocking until a single design has been agreed upon by all relevant parties
Hi @Sanchit2662, thanks for your PR. I would be interested in knowing whether you have used a generative AI tool such as ChatGPT to create this PR. Could you clarify whether you have done so and, if so, would you mind providing details on your involvement in and review of the tool output?
Hi @dashpole @jade-guiton-dd,
Hi @mx-psi, thanks for asking. Yes, I did use a generative AI tool as a supporting aid to help think through the implementation approach and explore possible design options. The actual code, API shape, and integration with the existing OpenTelemetry Collector components were implemented and reviewed by me. All suggestions were manually evaluated, adapted to the project’s conventions, and validated through local testing and review. I’m happy to clarify or walk through any part of the implementation if that would be helpful.
mx-psi
left a comment
Thanks for the answer. I encourage you to read the Generative AI contribution policy which is available here: https://github.com/open-telemetry/community/blob/main/policies/genai.md before moving forward with this PR.
As @jade-guiton-dd said, this needs discussion before moving forward to get to an agreement, since there are multiple PRs trying to solve the same issue in different ways.
If you want to participate in such discussion, check the Generative AI contribution policy carefully and please do not copy/paste LLM output directly: review it, summarize it, discard anything that is unnecessary or superfluous, and correct anything that is wrong.
Summary
This PR adds support for partial success reporting in receivers by allowing them to explicitly report how many items in a batch failed, instead of treating any error as a full batch failure.
A new consumererror.PartialError is introduced to carry failed item counts, and receiverhelper is updated to correctly compute accepted, failed, and refused metrics when partial failures occur.
This fixes the current all-or-nothing behavior in receiver self-observability and enables accurate metrics for receivers such as Prometheus, where translation or parsing failures may affect only part of a batch.
Closes: #14440
Impact
Incorrect observability metrics for receivers that partially succeed
Makes it difficult to distinguish:
Forces receivers to invent custom metrics instead of using standard collector self-observability
Fix
1. New API: consumererror.PartialError
A new error type is introduced to represent partial failures by carrying a failed item count:
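A minimal sketch of what an error type carrying a failed item count could look like, using only the Go standard library. The names partialError, newPartialError, Failed, and errParse are hypothetical stand-ins and are not taken from the PR's diff; the actual consumererror.PartialError may differ in shape.

```go
package main

import (
	"errors"
	"fmt"
)

// partialError is a hypothetical stand-in for the PR's
// consumererror.PartialError. It wraps an underlying cause and records
// how many items in the batch failed.
type partialError struct {
	err    error
	failed int
}

// newPartialError wraps err together with the number of failed items.
func newPartialError(err error, failed int) error {
	return &partialError{err: err, failed: failed}
}

func (e *partialError) Error() string {
	return fmt.Sprintf("%d items failed: %v", e.failed, e.err)
}

// Unwrap keeps the cause reachable through errors.Is and errors.As.
func (e *partialError) Unwrap() error { return e.err }

// Failed returns the recorded failed item count.
func (e *partialError) Failed() int { return e.failed }

// errParse is an illustrative cause, not a real Collector error.
var errParse = errors.New("parse error")

func main() {
	err := newPartialError(errParse, 3)
	fmt.Println(err)                      // 3 items failed: parse error
	fmt.Println(errors.Is(err, errParse)) // true
}
```

Implementing Unwrap is the design choice that matters here: it lets the count travel with the error without hiding the original cause from callers that already inspect it.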
It composes cleanly with existing errors, including downstream rejections:
The failed count can be extracted safely from the error chain:
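The composition and extraction described above can be sketched as follows, assuming Go 1.20 or later for errors.Join. The names partialError, compose, failedCount, and errDownstream are illustrative only and are not the PR's API.

```go
package main

import (
	"errors"
	"fmt"
)

// partialError is a hypothetical error carrying a failed item count.
type partialError struct {
	err    error
	failed int
}

func (e *partialError) Error() string {
	return fmt.Sprintf("%d items failed: %v", e.failed, e.err)
}

func (e *partialError) Unwrap() error { return e.err }

// errDownstream stands in for a rejection returned by the next consumer.
var errDownstream = errors.New("downstream rejected batch")

// compose joins a partial failure with a downstream rejection and adds
// one more layer of context, as a receiver might do.
func compose(partial error) error {
	return fmt.Errorf("scrape failed: %w", errors.Join(partial, errDownstream))
}

// failedCount walks the error chain and reports the failed item count.
// The boolean is false when no partialError is present.
func failedCount(err error) (int, bool) {
	var pe *partialError
	if errors.As(err, &pe) {
		return pe.failed, true
	}
	return 0, false
}

func main() {
	partial := &partialError{err: errors.New("translation failed"), failed: 2}
	combined := compose(partial)

	n, ok := failedCount(combined)
	fmt.Println(n, ok)                              // 2 true
	fmt.Println(errors.Is(combined, errDownstream)) // true

	_, ok = failedCount(errDownstream)
	fmt.Println(ok) // false
}
```

Because errors.As traverses both wrapped and joined errors, the count stays recoverable however many layers of context are added, and an error without a count is reported as such rather than as zero failures.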
2. Updated receiverhelper Behavior
receiverhelper.endOp() is updated to detect PartialError and correctly compute metrics:
Before (all-or-nothing):
After (partial success supported):
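As a rough illustration of the before/after accounting, the sketch below splits a batch into accepted and failed counts. It is not the receiverhelper code: tally, counts, partialError, and errBad are hypothetical, and the distinction between failed and refused items for downstream errors is deliberately left out.

```go
package main

import (
	"errors"
	"fmt"
)

// partialError is a hypothetical error carrying a failed item count.
type partialError struct {
	err    error
	failed int
}

func (e *partialError) Error() string {
	return fmt.Sprintf("%d items failed: %v", e.failed, e.err)
}

func (e *partialError) Unwrap() error { return e.err }

// counts is a hypothetical summary of one receive operation.
type counts struct {
	accepted int
	failed   int
}

// errBad is an illustrative cause used in the examples.
var errBad = errors.New("bad sample")

// tally splits total items into accepted and failed.
//   - nil error: every item is accepted.
//   - partialError in the chain: only the reported items fail.
//   - any other error: the whole batch fails (all-or-nothing).
func tally(total int, err error) counts {
	if err == nil {
		return counts{accepted: total}
	}
	var pe *partialError
	if errors.As(err, &pe) {
		// Clamp so a bad count can never produce negative metrics.
		failed := pe.failed
		if failed < 0 {
			failed = 0
		}
		if failed > total {
			failed = total
		}
		return counts{accepted: total - failed, failed: failed}
	}
	return counts{failed: total}
}

func main() {
	fmt.Println(tally(10, nil))                                 // {10 0}
	fmt.Println(tally(10, errBad))                              // {0 10}
	fmt.Println(tally(10, &partialError{err: errBad, failed: 3})) // {7 3}
}
```

The fallback branch is what keeps receivers that never return a partial error on their existing all-or-nothing behavior.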
This preserves existing behavior for receivers that do not use PartialError.
Result
receiver_accepted_* now reflects the number of successfully processed items
receiver_failed_* reports internal receiver failures accurately
receiver_refused_* correctly tracks downstream rejections
This enables receivers (notably the Prometheus receiver) to report accurate self-observability metrics for partial success scenarios using standard Collector mechanisms.