Clean up in-flight dedup entries when the request owner is dropped#6577
Draft
ndr-ds wants to merge 1 commit into
Draft
Clean up in-flight dedup entries when the request owner is dropped#6577ndr-ds wants to merge 1 commit into
ndr-ds wants to merge 1 commit into
Conversation
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Motivation
The client's request scheduler deduplicates concurrent requests for the same
data (
RequestsScheduler::deduplicated_request): the first caller becomes theowner and executes the request, while later callers subscribe and wait on a
broadcast channel for the result.
InFlightTrackerholds the map of in-flightentries; only
complete_and_broadcastever removed an entry.This is not cancel-safe. If the owning task is dropped before it completes —
which happens routinely:
synchronize_up_torunsdownload_certificates_fromagainst every validator under
communicate_with_quorum, and once a quorumresponds the remaining (losing) futures are dropped mid-flight — the entry is
never removed. Its broadcast sender stays alive, so every subscriber blocks
forever on
receiver.recv().await(there is no timeout on that wait). Asubscriber living in a different task tree (e.g. the spawned download pipeline)
then starves the whole sync: the chain never advances, the client sits at 0%
CPU with no pending timers, and only a process restart recovers it.
The race is timing-dependent (the subscription join window is
max_request_ttl, 200 ms by default), so it is latent under today's requesttiming. It is reliably reproducible under workloads that line the cert-download
tasks up in lockstep — see Test Plan.
Proposal
Make the in-flight tracker cancel-safe with an ownership guard:
insert_newnow returns anInFlightGuardthat owns the entry. The ownerholds it across the cancellation-prone
try_staggered_parallel().await.InFlightGuard::complete_and_broadcastbroadcasts the resultand removes the entry (unchanged success path).
cancelled), its
Dropremoves the entry. Dropping the entry drops thebroadcast sender, so subscribers wake with
RecvError::Closedand fallthrough to executing the request themselves — a path the code already had but
could never reach while the entry leaked.
generation(from anAtomicU64) means a guard only ever removesits own entry, never a newer owner's entry that reused the same key after
this one went stale and was replaced. Both
complete_and_broadcastandDropare generation-checked, making them idempotent and race-safe.
To let
Dropclean up synchronously, the entries map moves fromtokio::sync::RwLocktostd::sync::Mutex; it is never held across an.await(the per-entry
alternative_peerslist is cloned out from under the lock beforeawaiting). This matches the existing
std::sync::Mutexusage and discipline inlinera-core.Client-side only; no protocol, storage, or validator change.
Test Plan
New unit regression test
test_owner_drop_wakes_subscribers: an ownerregisters an in-flight request, a second caller subscribes, the owner guard is
dropped without completing — the subscriber observes
RecvError::Closed(rather than hanging) and the entry is gone.
cargo test -p linera-core --lib requests_scheduler→ 28 passed;cargo clippy --all-targetsclean.End-to-end, real workload. Reproduced the hang with a fresh-wallet,
empty-DB cold sync of a blob-heavy testnet_conway market chain, and measured a
client built with vs. without this fix, interleaved under identical network
conditions (12 cold syncs each, per-run hang watchdog):
Every time the unfixed build hung, the very next fixed run on the same
conditions completed in ~12 s. Stacks/logs from the hung runs confirm the
root cause above (subscriber parked on
recv().await, leaked entry, nopending timer).
Release Plan
testnetbranch (a companionbackport PR targets
testnet_conway). Client-side only, so it ships with theusual client/SDK release — no validator hotfix.
Links
DownloadBlobscallerchange (Use streaming DownloadBlobs RPC for batch blob downloads #6157 / [testnet] Use streaming DownloadBlobs RPC for batch blob downloads #6476), which reorders request timing enough to make this
latent race fire on ~42% of cold syncs.