feat(cache): add in flight deduping #4459
MasterPtato wants to merge 1 commit into 03-18-fix_cache_clean_up_lib
Conversation
🚅 Deployed to the rivet-pr-4459 environment in rivet-frontend
Warning: This pull request is not mergeable via GitHub because a downstack PR is open. Once all requirements are satisfied, merge this PR as a stack on Graphite. This stack of pull requests is managed by Graphite.
Code Review: feat(cache): add in flight deduping

The core idea of in-flight request deduplication is sound and addresses a real stampede problem. Here are my findings:

Critical Issue 1: Lease not released on getter error. If the getter returns an Err, try_join short-circuits via ?, so the lease release loop never executes. This permanently stalls subsequent requests for those keys (5-second wait per attempt) until the process restarts. A guard/defer pattern would ensure cleanup regardless of the error path.

Critical Issue 2: Broadcast not sent on empty resolution or getter error. broadcast_tx.send is only called when entries_values is non-empty. If the getter resolves no values or errors, waiters are stuck until IN_FLIGHT_TIMEOUT (5 seconds). The broadcast should be sent unconditionally after the getter completes. Combined with issue 1, a getter error causes both a leaked lease and a 5-second stall for all waiters.

Moderate Issue 3: HashMap iteration order creates implicit coupling. In req_config.rs, keys and cache_keys are unzipped from ctx.entries() with non-deterministic HashMap iteration order, then keys is zipped with cached_values from the driver. This works because both were derived from the same iterator in the same pass, but it is fragile. A Vec<(Key, RawCacheKey)> pairing would make the relationship explicit and safe. The same issue applies in the waiting-keys path (succeeded_keys / succeeded_cache_keys).

Moderate Issue 4: Rate limit tests silently removed. integration.rs contained test_rate_limit_basic and test_rate_limit_ip_isolation. These do not appear in any of the new test files (fetch.rs, in_flight.rs, ttl.rs). If rate limiting is still a feature of this crate, these tests should be preserved.

Minor Issue 5: timeout_falls_back_to_getter test adds 5 seconds to the test suite. The test necessarily waits for IN_FLIGHT_TIMEOUT. Making the timeout configurable via a cfg(test) override or a parameter on CacheInner would allow faster test runs.
Minor Issue 6: Inline await changes latency behavior on cache misses. The cache write was previously done in a background task; now it is awaited inline. This is likely intentional (to ensure the broadcast happens after the write), but it adds write latency to every cache-miss response. A short comment documenting the trade-off would help future readers.

Positive Observations:
- Switching GetterCtx from Vec to HashMap removes O(n) deduplication and makes key lookups O(1).
- scc::HashMap is an appropriate choice for concurrent in-flight tracking.
- Test coverage for the deduplication logic is solid: single waiter, multiple waiters, independent keys, mixed cached/in-flight, and timeout fallback are all covered.
- Removing the anyhow glob import aligns with the project style guidelines.
- The Driver::get signature change to a slice of RawCacheKey is a correct ergonomics improvement.
Good overall approach. Deduplicating concurrent cache misses with a broadcast-based mechanism is the right design. The refactor of GetterCtx from Vec to HashMap is a clean improvement. A few issues need attention before merge.

Critical: Lease not released on getter error. In req_config.rs, if the lease-holder getter fails, tokio::try_join! short-circuits and returns early. The broadcast_tx.send and in_flight.remove_async calls are never reached. Effect: waiters hang for the full IN_FLIGHT_TIMEOUT (5 s) before falling back. The in_flight entries are never removed and leak for the lifetime of CacheInner. Every future request for those keys will find a stale entry, subscribe to a dead broadcast channel, and always time out. Fix: use a RAII guard so cleanup always runs on drop, even on error.

Bug: Broadcast skipped when getter resolves no values. The broadcast send lives inside if !entries_values.is_empty(), but lease removal happens unconditionally after it. If the getter resolves nothing (entity not found), entries_values is empty, the broadcast is never sent, but leases are removed. Waiters hang the full 5 s before timing out: a correct result, but a severe latency penalty for a common not-found path. Fix: move broadcast_tx.send outside the if so it fires unconditionally after the write attempt.

Ordering: remove from in_flight before broadcasting. Current order: (1) write to cache, (2) broadcast, (3) remove lease from in_flight. Between steps 2 and 3, a new request can find the stale entry, subscribe to the already-consumed channel, and wait 5 s. Swap the order: remove from in_flight first, then broadcast. New requests after removal will do a fresh cache read (already populated) and return immediately.

Test coverage gaps: the new in_flight.rs tests are excellent, and two additional cases would be worth adding.
Minor: broadcast::channel with capacity 16 is created even when leased_keys is empty, and only one message is ever sent; capacity 1 is sufficient.

Overall: the core mechanism is sound, but the error-path cleanup is a real correctness/leak bug and the broadcast-skip-on-empty is a meaningful latency regression on not-found paths. Both should be fixed before merging.
