Skip to content

Fix KeyWatcher#updates(timeout:) to never drop a received entry#176

Open
Xayc73 wants to merge 1 commit intonats-io:mainfrom
Xayc73:fix/kv-popped-with-timeout-fix
Open

Fix KeyWatcher#updates(timeout:) to never drop a received entry#176
Xayc73 wants to merge 1 commit intonats-io:mainfrom
Xayc73:fix/kv-popped-with-timeout-fix

Conversation

@Xayc73
Copy link

@Xayc73 Xayc73 commented Jan 15, 2026

Summary

NATS::KeyWatcher#updates(timeout:) can raise NATS::Timeout after @_updates.pop(timeout:) has already returned an entry, because it wraps the pop call with a post-check timeout helper (MonotonicTime.with_nats_timeout). Under high load / scheduling delays / tiny timeouts, this leads to a “logical drop”: the caller receives a timeout exception and typically ignores that iteration, even though an update was actually dequeued.

This MR changes KeyWatcher#updates so that timeouts are only raised when no value was returned. If an entry was received, it is always returned to the caller.

Expected result

  • No lost updates: if updates(timeout:) returns an entry from the internal queue, it is never discarded via a late timeout exception.
  • Preserved behavior:
    • When there are no more pending updates (the watcher pushes the nil marker), updates returns nil.
    • When waiting for an update exceeds the timeout and no value was produced, updates raises NATS::Timeout.

Root cause analysis

Current implementation:

  • KeyWatcher#updates(timeout:) does:

    • call @_updates.pop(timeout:)
    • then MonotonicTime.with_nats_timeout(timeout) checks elapsed time after the block returns
    • if elapsed time > timeout, it raises NATS::Timeout

This is correct for “deadline-style” APIs, but not for “return-or-timeout” queue reads, because the timeout exception can be raised even after a value has been successfully read.

What changed

  • KeyWatcher#updates no longer wraps @_updates.pop with MonotonicTime.with_nats_timeout.
  • Instead it measures elapsed time around the pop and raises only if:
    • the result is nil, and
    • the call duration exceeded the requested timeout

This ensures that a returned entry always wins over a late timeout.

Why this approach

  • Scope-limited: the fix is only applied to KeyWatcher#updates, not to MonotonicTime.with_nats_timeout, because that helper is used in other places (e.g. request/flush/waits) where its semantics may be relied upon.
  • Behavior-preserving: the existing watch contract around nil marker vs. real timeout remains intact.
  • Prevents data loss in user code: callers commonly treat timeout exceptions as “no update” and retry, which makes late timeouts especially harmful when they happen after a successful dequeue.

Files changed

  • lib/nats/io/kv.rb: adjust KeyWatcher#updates(timeout:) timeout semantics.
  • spec/kv_spec.rb: add regression spec for “entry returned but elapsed time > timeout”.

@Xayc73 Xayc73 changed the title **Fix KeyWatcher#updates(timeout:) to never drop a received entry** Fix KeyWatcher#updates(timeout:) to never drop a received entry Jan 15, 2026
@Xayc73
Copy link
Author

Xayc73 commented Feb 16, 2026

@vankiru Hi! Could you take a look when you have a moment?

I’d like to know the current state of the gem and whether there’s anything we can do to move this MR forward. Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant