Skip to content

fix: handle redelivered messages in map stream#3418

Closed
vaibhavtiwari33 wants to merge 20 commits into
mainfrom
map-stream-loss
Closed

fix: handle redelivered messages in map stream#3418
vaibhavtiwari33 wants to merge 20 commits into
mainfrom
map-stream-loss

Conversation

@vaibhavtiwari33

@vaibhavtiwari33 vaibhavtiwari33 commented May 8, 2026

Copy link
Copy Markdown
Contributor

What this PR does / why we need it

Changes:

  • Handle messages that have been redelivered, with same offset, in map-stream
  • Return early after sending error on channel, in stream method, when mapper is closed/duplicate ID exists in senders map.

Related issues

Fixes: #3423

Testing

  • Updated the unit tests
  • Tested using the validation platform

Special notes for reviewers

Anything notable for review (risk, rollout, follow-ups).

Signed-off-by: Vaibhav Tiwari <vaibhav.tiwari33@gmail.com>
Signed-off-by: Vaibhav Tiwari <vaibhav.tiwari33@gmail.com>
@vaibhavtiwari33 vaibhavtiwari33 marked this pull request as ready for review May 8, 2026 13:45
@codecov

codecov Bot commented May 8, 2026

Copy link
Copy Markdown

Codecov Report

❌ Patch coverage is 99.11504% with 1 line in your changes missing coverage. Please review.
✅ Project coverage is 82.75%. Comparing base (82e480a) to head (e6f08c6).
⚠️ Report is 4 commits behind head on main.

Files with missing lines Patch % Lines
rust/numaflow-core/src/mapper/map/stream.rs 99.11% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #3418      +/-   ##
==========================================
+ Coverage   82.65%   82.75%   +0.10%     
==========================================
  Files         307      307              
  Lines       76082    76165      +83     
==========================================
+ Hits        62882    63027     +145     
+ Misses      12645    12580      -65     
- Partials      555      558       +3     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@vaibhavtiwari33 vaibhavtiwari33 self-assigned this May 8, 2026
@vaibhavtiwari33 vaibhavtiwari33 added bug Something isn't working area/udf User defined functions dataplane Rust Implementation of Data Plane labels May 8, 2026
@vaibhavtiwari33 vaibhavtiwari33 marked this pull request as draft May 9, 2026 01:03
…n cancellation token in execute loop

Signed-off-by: Vaibhav Tiwari <vaibhav.tiwari33@gmail.com>
Signed-off-by: Vaibhav Tiwari <vaibhav.tiwari33@gmail.com>
Signed-off-by: Vaibhav Tiwari <vaibhav.tiwari33@gmail.com>
Signed-off-by: Vaibhav Tiwari <vaibhav.tiwari33@gmail.com>
Signed-off-by: Vaibhav Tiwari <vaibhav.tiwari33@gmail.com>
Signed-off-by: Vaibhav Tiwari <vaibhav.tiwari33@gmail.com>
Signed-off-by: Vaibhav Tiwari <vaibhav.tiwari33@gmail.com>
Signed-off-by: Vaibhav Tiwari <vaibhav.tiwari33@gmail.com>
…message

Signed-off-by: Vaibhav Tiwari <vaibhav.tiwari33@gmail.com>
Signed-off-by: Vaibhav Tiwari <vaibhav.tiwari33@gmail.com>
Signed-off-by: Vaibhav Tiwari <vaibhav.tiwari33@gmail.com>
Signed-off-by: Vaibhav Tiwari <vaibhav.tiwari33@gmail.com>
Signed-off-by: Vaibhav Tiwari <vaibhav.tiwari33@gmail.com>
Signed-off-by: Vaibhav Tiwari <vaibhav.tiwari33@gmail.com>
@vaibhavtiwari33 vaibhavtiwari33 changed the title fix: send explicit eot from stream receiver task to stream execute task fix: handle redelivered messages in map stream May 12, 2026
@vaibhavtiwari33 vaibhavtiwari33 marked this pull request as ready for review May 12, 2026 03:21
Comment on lines +188 to +190
_ = self.shared_ctx.hard_shutdown_token.cancelled() => {
let e = Error::Mapper("Map Stream cancelled, current message will be nacked".into());
warn!(parent_id = ?self.msg_handle.message.id, ?e, "failed to map message");

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If downstream is handling the cancellation, we don't need a check here.

Comment on lines +355 to +362
match senders_guard.map.entry(key.clone()) {
Entry::Occupied(_) => Some(format!(
"duplicate in-flight request id: {key}; refusing to overwrite existing sender"
)),
Entry::Vacant(slot) => {
slot.insert(tx.clone());
None
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we need this check in unary as well?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ideally yes, but the case for unary and batch is interesting because we're saved by the fact that they use oneshot sender to get their responses back from the receiver task.

So, even if this sender overwriting behaviour happens, one of the unary/batch tasks will still receive the full response.

Correct fix would be to handle deduplication in the reader/source

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you want to handle dedup at the source/reader? because once you have that you will have to remove this logic from the map stream.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we need to remove this defensive check in the map stream irrespective of whether we have the handling present in the reader/source.

I can create a separate PR for reader/source

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@vaibhavtiwari33 vaibhavtiwari33 requested a review from yhl25 May 12, 2026 15:52
Signed-off-by: Vaibhav Tiwari <vaibhav.tiwari33@gmail.com>
Signed-off-by: Vaibhav Tiwari <vaibhav.tiwari33@gmail.com>
Signed-off-by: Vigith Maurice <vigith@gmail.com>
@vaibhavtiwari33

Copy link
Copy Markdown
Contributor Author

redelivered messages are being handled in reader/source as part of this change: github.com//pull/3429

Will close this PR.

@vigith vigith deleted the map-stream-loss branch May 13, 2026 21:39
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area/udf User defined functions bug Something isn't working dataplane Rust Implementation of Data Plane

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Redelivered message in map stream can lead to partial output

3 participants