rpc: replace _outstanding unordered_map with log_indexed_container#3450
rpc: replace _outstanding unordered_map with log_indexed_container#3450emdroid wants to merge 1 commit into
_outstanding unordered_map with log_indexed_container#3450Conversation
There was a problem hiding this comment.
Pull request overview
Note
Copilot was unable to run its full agentic suite in this review.
This PR replaces the outstanding-reply handler storage from a hash map to a deque-backed, slot-indexed table to reduce large contiguous allocations during rehash and streamline handler lookup/removal.
Changes:
- Introduces
reply_table(deque-based) to store in-flight reply handlers by message id. - Updates client cancel/timeout/reply paths to use
get()/insert()/remove()instead ofunordered_mapoperations.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 6 comments.
| File | Description |
|---|---|
| src/rpc/rpc.cc | Switches outstanding handler operations to the new reply_table API. |
| include/seastar/rpc/rpc.hh | Implements reply_table using std::deque and replaces the _outstanding map. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
372f9f8 to
d600495
Compare
| // ever exceeds a page regardless of in-flight count. | ||
| // Note: no_wait verbs consume a message id without registering a slot; | ||
| // those ids produce an out-of-range lookup on reply, handled gracefully. | ||
| struct reply_table { |
There was a problem hiding this comment.
We used the same idea recently here scylladb/scylladb#29794. Maybe extract a template data structure for reuse? The log_indexed_container already has unit tests and also handles the case when insert index is smaller than base_index.
There was a problem hiding this comment.
Right, that would make sense, only we'd need to move (copy first) the log_indexed_container to Seastar. Btw. there is also the idea of "chunked map" #2133 - which this might eventually evolve into.
There was a problem hiding this comment.
Is this PR urgent? There are some comments in scylladb/scylladb#29794 which I need to address, and IIUC I must do it before we merge this PR. I'm not sure when I'll do it, though, as I have some higher-priority work (mostly reviews) and I'm going on vacation on Thursday.
There was a problem hiding this comment.
It's P3 and not marked as CI stability, so not really urgent.
|
|
||
| private: | ||
| std::deque<std::unique_ptr<reply_handler_base>> _slots; | ||
| id_type _base_id = 1; // id corresponding to _slots[0] |
There was a problem hiding this comment.
id_type _base_id = 1; This seems strange, why not zero?
There was a problem hiding this comment.
_message_id starts at 1 in client, so the first id inserted is 1, meaning slot 0 would always be empty if _base_id = 0. Starting at 1 avoids that wasted slot. Which is what Claude inferred when working at the fix - but starting at 0 doesn't make a big difference I guess, or at least comment could be added with the explanation. But especially if we'd create a shared implementation, starting at 0 would be better indeed
| // ids are monotonically increasing and always >= _base_id; | ||
| // a negative index means the caller is violating this invariant. | ||
| SEASTAR_ASSERT(idx >= 0); | ||
| while (static_cast<id_type>(_slots.size()) <= idx) { |
There was a problem hiding this comment.
batch-insert at the end is likely more efficient _slots.insert(_slots.end(), count, nullptr)
| } | ||
|
|
||
| // Pop null entries from the front, advancing _base_id. | ||
| void trim() noexcept { |
There was a problem hiding this comment.
This can be made private? There can't be null slots at the front after remove-s
| cancel->cancel_wait = [this, id] { | ||
| _outstanding[id]->cancel(); | ||
| _outstanding.erase(id); | ||
| if (auto* h = _outstanding.get(id)) { |
There was a problem hiding this comment.
why relax the assumptions? previously it was guaranteed that id exists
| _stats.timeout++; | ||
| _outstanding[id]->timeout(); | ||
| _outstanding.erase(id); | ||
| if (auto* h = _outstanding.get(id)) { |
There was a problem hiding this comment.
why relax the assumptions? previously it was guaranteed that id exists
d600495 to
23ea09d
Compare
_outstanding unordered_map with slot-indexed deque_outstanding unordered_map with log_indexed_container
|
New version:
|
std::unordered_map<id_type, unique_ptr<reply_handler_base>> rehashes
with a single large contiguous allocation when the number of in-flight
requests crosses a bucket threshold. At high concurrency (e.g. 1000
concurrent ALL-consistency reads) this produces >128 KiB allocations,
triggering Seastar's large-alloc warning.
Introduce a new generic utility, log_indexed_container<T, IndexType>
(include/seastar/util/log_indexed_container.hh), which maps monotonically
increasing integer indices to values without hashing or large contiguous
allocations:
- Backed by boost::container::deque, which releases front blocks on
pop_front (unlike std::deque in libstdc++), keeping memory bounded.
- Stores values in direct index-offset slots, so lookup is O(1).
- Trims empty slots from the front after each removal, keeping the
deque short relative to the concurrency window.
- Supports any nullable type (raw pointer, unique_ptr) natively, and
wraps non-nullable types in std::optional internally.
- API: emplace(idx, value), find(idx), take(idx), clear_at(idx),
clear(), size(), empty(), base_index().
Replace _outstanding with a log_indexed_container specialised on
reply_handler_base unique_ptr. RPC message ids are already monotonically
increasing per-connection, making them a natural key. Outstanding reply
handlers are removed via take(). The timer and cancellation callbacks
preserve the original invariant that the slot always exists when they
fire: the receive path disarms both atomically (no yield point) before
returning to the reactor, so a missing slot indicates a bug.
Also add tests/unit/log_indexed_container_test.cc with 13 unit tests
covering emplace, find, take, clear_at, trim_front, base_index
advancement, and a custom index type.
Fixes: SCYLLADB-1541
23ea09d to
0ffa7da
Compare
There was a problem hiding this comment.
IMO the changes should be separated. 1st commit should introduce the new container, the second replace the unordered_map.
std::unordered_map<id_type, unique_ptr<reply_handler_base>>rehashes with a single large contiguous allocation when the number of in-flight requests crosses a bucket threshold. At high concurrency (e.g. 1000 concurrent ALL-consistency reads) this produces >128 KiB allocations, triggering Seastar's large-alloc warning.Introduce a new generic utility,
log_indexed_container<T, IndexType>(include/seastar/util/log_indexed_container.hh), which maps monotonically increasing integer indices to values without hashing or large contiguous allocations:boost::container::deque, which releases front blocks onpop_front(unlikestd::dequein libstdc++), keeping memory bounded.unique_ptr) natively, and wraps non-nullable types instd::optionalinternally.emplace(idx, value),find(idx),take(idx),clear_at(idx),clear(),size(),empty(),base_index().Replace
_outstandingwith alog_indexed_containerspecialised onreply_handler_baseunique_ptr. RPC message ids are already monotonically increasing per-connection, making them a natural key. Outstanding reply handlers are removed viatake(). The timer and cancellation callbacks preserve the original invariant that the slot always exists when they fire: the receive path disarms both atomically (no yield point) before returning to the reactor, so a missing slot indicates a bug.Also add
tests/unit/log_indexed_container_test.ccwith 13 unit tests coveringemplace,find,take,clear_at,trim_front,base_indexadvancement, and a custom index type.Fixes: SCYLLADB-1541