feat(ups): implement queue subs #4486
MasterPtato wants to merge 1 commit into 03-19-feat_cache_add_in_flight_deduping from
🚅 Deployed to the rivet-pr-4486 environment in rivet-frontend
Warning: This pull request is not mergeable via GitHub because a downstack PR is open. Once all requirements are satisfied, merge this PR as a stack on Graphite.
This stack of pull requests is managed by Graphite.
ec32bdf to ae0f886
6570bf2 to b46226c
PR Review:
Code Review summary for feat(ups): implement queue subs

The implementation is well-structured and the test coverage is solid. Two medium issues to address before merge:

(1) retain_sync on scc::HashMap inside the memory driver GC task runs synchronously inside an async closure and can block the executor.

(2) The SELECT and INSERT+NOTIFY in publish_to_queues are not wrapped in a transaction, creating a race where a message can be orphaned for up to 1 hour if the subscriber dies between the two operations.

Minor items: one remaining anyhow! macro call in publish() should use .context() instead; the queue parameter is missing from the tracing span in queue_subscribe (fields(%subject) should also include queue); and the Cargo.lock version downgrade from 2.1.7 to 2.1.6-rc.1 appears to be a stacked-PR artifact.

On test coverage: test_queue_subscribe_load_balance only checks total message count, not per-subscriber distribution -- this is the right choice to avoid flakiness but worth noting in a comment. Also missing a test for Postgres reconnection behavior (does not drain pending ups_queue_messages on reconnect).
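To illustrate why asserting on the total count is the stable choice for a load-balance test, here is a minimal in-process sketch. It is not the ups API: a shared std::sync::mpsc receiver behind a Mutex stands in for a queue group, and run_queue_group is a hypothetical helper invented for this example. Each message is taken by exactly one worker, so the sum is deterministic while the per-worker split depends on thread scheduling.

```rust
use std::sync::mpsc::channel;
use std::sync::{Arc, Mutex};
use std::thread;

/// Hypothetical stand-in for a queue group: `workers` competing consumers
/// share one receiver, so each message is delivered to exactly one of them.
/// Returns how many messages each worker received.
fn run_queue_group(workers: usize, messages: usize) -> Vec<usize> {
    let (tx, rx) = channel::<usize>();
    let rx = Arc::new(Mutex::new(rx));

    let handles: Vec<_> = (0..workers)
        .map(|_| {
            let rx = Arc::clone(&rx);
            thread::spawn(move || {
                let mut received = 0usize;
                loop {
                    // The lock guard is dropped at the end of this statement.
                    let msg = rx.lock().unwrap().recv();
                    match msg {
                        Ok(_) => received += 1,
                        // Sender dropped and queue drained: stop consuming.
                        Err(_) => break,
                    }
                }
                received
            })
        })
        .collect();

    for i in 0..messages {
        tx.send(i).unwrap();
    }
    drop(tx); // lets the workers exit once the queue is empty

    handles.into_iter().map(|h| h.join().unwrap()).collect()
}

fn main() {
    let counts = run_queue_group(3, 100);
    // Stable assertion: every message was delivered exactly once overall.
    assert_eq!(counts.iter().sum::<usize>(), 100);
    // The split across workers varies from run to run, so it is only printed.
    println!("per-subscriber counts: {:?}", counts);
}
```

Asserting that each worker received roughly a third of the messages would fail intermittently here, which is the flakiness the review is warning about.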
PR Review: fix(ups): implement queue subs

Good implementation of queue subscription semantics across all three drivers (memory, NATS, Postgres). The core design is sound, particularly the use of

Issues

Memory driver: silent message loss on dead channels (medium)

In the

```rust
// Instead of silently dropping on error, try each subscriber until one succeeds:
use rand::seq::SliceRandom; // provides `shuffle` on slices

let mut rng = rand::thread_rng();
let mut indices: Vec<usize> = (0..subs.len()).collect();
indices.shuffle(&mut rng);
for i in indices {
    if subs[i].send(payload.to_vec()).is_ok() {
        break;
    }
}
```

The GC eventually prunes dead channels, but in the window between GC runs, messages to dead subscribers are silently dropped.

Subscriber count metric undercounts queue subscribers (minor)

Unclaimed messages not redelivered (design note)

After

GC task errors (minor)

Verify that errors in the background GC task (deleting messages older than

Positive observations
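Returning to the memory driver issue raised above, the suggested fallback loop can be exercised in isolation. The following is a sketch under stated assumptions, not the driver's code: std::sync::mpsc senders stand in for whatever channel type the driver uses, a small xorshift generator replaces the rand crate so the example has no dependencies, and deliver_to_one, shuffled_indices, and XorShift64 are hypothetical names introduced here.

```rust
use std::sync::mpsc::{channel, Sender};

/// Tiny xorshift64 generator so the sketch needs no external crates.
/// (Assumption: a real driver would more likely use the `rand` crate.)
struct XorShift64(u64);

impl XorShift64 {
    fn new(seed: u64) -> Self {
        // xorshift must not start from a zero state.
        XorShift64(if seed == 0 { 0x9E37_79B9_7F4A_7C15 } else { seed })
    }

    fn next(&mut self) -> u64 {
        let mut x = self.0;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.0 = x;
        x
    }
}

/// Fisher-Yates shuffle of the indices 0..len.
fn shuffled_indices(len: usize, rng: &mut XorShift64) -> Vec<usize> {
    let mut indices: Vec<usize> = (0..len).collect();
    for i in (1..len).rev() {
        let j = (rng.next() % (i as u64 + 1)) as usize;
        indices.swap(i, j);
    }
    indices
}

/// Try subscribers in random order until one accepts the payload.
/// Returns the index of the subscriber that received it, or None if
/// every channel is dead.
fn deliver_to_one(
    subs: &[Sender<Vec<u8>>],
    payload: &[u8],
    rng: &mut XorShift64,
) -> Option<usize> {
    for i in shuffled_indices(subs.len(), rng) {
        // `send` fails only when the receiving half has been dropped.
        if subs[i].send(payload.to_vec()).is_ok() {
            return Some(i);
        }
    }
    None
}

fn main() {
    let (tx_dead, rx_dead) = channel::<Vec<u8>>();
    let (tx_live, rx_live) = channel::<Vec<u8>>();
    drop(rx_dead); // simulate a subscriber that went away before GC ran

    let subs = vec![tx_dead, tx_live];
    let mut rng = XorShift64::new(42);

    // Whichever order the shuffle picks, the live subscriber gets the message.
    assert_eq!(deliver_to_one(&subs, b"hello", &mut rng), Some(1));
    assert_eq!(rx_live.recv().unwrap(), b"hello".to_vec());
}
```

Returning None when no subscriber accepts the payload gives the caller a place to log or count the drop instead of losing the message silently.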

Description
Please include a summary of the changes and the related issue. Please also include relevant motivation and context.
Type of change
How Has This Been Tested?
Please describe the tests that you ran to verify your changes.
Checklist: