Commit 22464ce
Replace Khepri topic routing projection with trie + ordered_set (v4)
Resolves #15588.
The previous Khepri topic routing projection (v3) stored topic bindings
as sets:set(#binding{}) inside trie leaf nodes. This design had a
major performance drawback:
On the insertion/deletion path (in the single Khepri Ra process),
every binding change required a read-modify-write of the entire
sets:set(), making it O(N) in the number of bindings at that leaf.
With many MQTT clients connecting concurrently (each subscribing to
the same topic filter), this made the Ra process a bottleneck.
Another, less severe performance issue was that the entire binding
was copied, including the binding arguments that carry the MQTT 5.0
subscription options, for example:
```
[{<<"x-mqtt-subscription-opts">>,table,
  [{<<"id">>,unsignedint,1},
   {<<"no-local">>,bool,false},
   {<<"qos">>,unsignedbyte,0},
   {<<"retain-as-published">>,bool,false},
   {<<"retain-handling">>,unsignedbyte,0}]}]
```
Replace the single ETS projection table with two purpose-built tables:
1. Trie edges table (ETS set, read_concurrency=true):
   - Row: `{{XSrc, ParentNodeId, Word}, ChildNodeId, ChildCount}`
   - XSrc = {VHost, ExchangeName} (compact 2-tuple of binaries)
   - NodeId = root | reference()
   - ChildCount tracks outgoing edges for garbage collection
2. Leaf bindings table (ETS ordered_set, read_concurrency=true):
   - Key: {NodeId, BindingKey, Dest}
   - Stored as 1-tuples: {{NodeId, BindingKey, Dest}}
   - No value column; all data is in the key to minimize copying
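The two-table layout above can be sketched roughly as follows. This is a minimal illustration, not the code from this commit: the module name, table names, `ensure_path/5`, `bump/2` and the `{resource, ...}` destination tuple are all hypothetical, and the sketch assumes that ChildCount on an edge row counts the outgoing edges of that row's ChildNodeId. It needs OTP 26+ for ets:lookup_element/4.

```erlang
-module(topic_tables_sketch).
-export([new/0, add_binding/5, demo/0]).

%% Hypothetical table names; the real tables are owned by the Khepri projection.
new() ->
    Edges = ets:new(trie_edges_sketch, [set, public, {read_concurrency, true}]),
    Leaves = ets:new(leaf_bindings_sketch,
                     [ordered_set, public, {read_concurrency, true}]),
    {Edges, Leaves}.

%% Create any missing trie edges for Words, then store the binding as a
%% key-only 1-tuple in the ordered_set.
add_binding({Edges, Leaves}, XSrc, Words, BindingKey, Dest) ->
    NodeId = ensure_path(Edges, XSrc, root, undefined, Words),
    true = ets:insert(Leaves, {{NodeId, BindingKey, Dest}}),
    ok.

%% ParentEdge is the key of the row whose ChildNodeId is Parent
%% (undefined for the root, which has no incoming edge).
ensure_path(_Edges, _XSrc, NodeId, _ParentEdge, []) ->
    NodeId;
ensure_path(Edges, XSrc, Parent, ParentEdge, [Word | Rest]) ->
    EdgeKey = {XSrc, Parent, Word},
    Child = case ets:lookup_element(Edges, EdgeKey, 2, undefined) of
                undefined ->
                    New = make_ref(),
                    true = ets:insert(Edges, {EdgeKey, New, 0}),
                    bump(Edges, ParentEdge),
                    New;
                Existing ->
                    Existing
            end,
    ensure_path(Edges, XSrc, Child, EdgeKey, Rest).

%% Assumption: the parent's ChildCount lives in element 3 of its incoming edge row.
bump(_Edges, undefined) ->
    ok;
bump(Edges, ParentEdge) ->
    _ = ets:update_counter(Edges, ParentEdge, {3, 1}),
    ok.

demo() ->
    Tabs = {Edges, Leaves} = new(),
    XSrc = {<<"/">>, <<"amq.topic">>},
    Q1 = {resource, <<"/">>, queue, <<"q1">>},
    ok = add_binding(Tabs, XSrc, [<<"a">>, <<"b">>], <<"a.b">>, Q1),
    ok = add_binding(Tabs, XSrc, [<<"a">>, <<"c">>], <<"a.c">>, Q1),
    %% {number of edges, number of bindings, ChildCount of the "a" node}
    {ets:info(Edges, size), ets:info(Leaves, size),
     ets:lookup_element(Edges, {XSrc, root, <<"a">>}, 3)}.
```

In this sketch, `demo/0` creates three edges (a, a.b, a.c) and two key-only binding rows, and the "a" node ends up with a ChildCount of 2.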
The trie structure preserves O(depth * 3) routing complexity regardless
of the overall number of bindings or wildcard filters. At each trie level, we
probe at most 3 edges (literal word, <<"*">>, <<"#">>), each via
ets:lookup_element/4 which copies only the ChildNodeId (a reference).
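A hedged sketch of such a trie walk is shown here. It is not the routing code from this commit: the module and the helpers `walk/4`, `skip_any/4` and `via/5` are hypothetical names, and the wildcard handling assumes the usual AMQP topic semantics ("*" matches exactly one word, "#" matches zero or more). The walk returns candidate node IDs; a caller would then look up bindings for each of them in the leaf table.

```erlang
-module(topic_match_sketch).
-export([match/3, demo/0]).

%% Returns the IDs of all trie nodes whose path matches RoutingWords.
match(Edges, XSrc, RoutingWords) ->
    lists:usort(walk(Edges, XSrc, root, RoutingWords)).

walk(Edges, XSrc, Node, []) ->
    %% Routing key exhausted: Node matches, and so does a "#" child
    %% (matching zero words).
    [Node | via(Edges, XSrc, Node, <<"#">>,
                fun(H) -> walk(Edges, XSrc, H, []) end)];
walk(Edges, XSrc, Node, [Word | Rest] = Words) ->
    %% At most three probes per level: literal word, "*", "#".
    via(Edges, XSrc, Node, Word,
        fun(C) -> walk(Edges, XSrc, C, Rest) end) ++
    via(Edges, XSrc, Node, <<"*">>,
        fun(C) -> walk(Edges, XSrc, C, Rest) end) ++
    via(Edges, XSrc, Node, <<"#">>,
        fun(H) -> skip_any(Edges, XSrc, H, Words) end).

%% "#" may consume zero or more words: try every suffix of the routing key.
skip_any(Edges, XSrc, Node, []) ->
    walk(Edges, XSrc, Node, []);
skip_any(Edges, XSrc, Node, [_ | Rest] = Words) ->
    walk(Edges, XSrc, Node, Words) ++ skip_any(Edges, XSrc, Node, Rest).

%% Probe one edge; only the ChildNodeId (element 2) is copied out of ETS,
%% and a miss returns the default instead of raising badarg.
via(Edges, XSrc, Node, Word, Cont) ->
    case ets:lookup_element(Edges, {XSrc, Node, Word}, 2, undefined) of
        undefined -> [];
        Child -> Cont(Child)
    end.

demo() ->
    Edges = ets:new(trie_edges_sketch, [set, {read_concurrency, true}]),
    X = {<<"/">>, <<"amq.topic">>},
    [A, AB, AStar, Hash] = [make_ref() || _ <- lists:seq(1, 4)],
    %% Trie for the filters "a.b", "a.*" and "#".
    true = ets:insert(Edges, [{{X, root, <<"a">>}, A, 2},
                              {{X, A, <<"b">>}, AB, 0},
                              {{X, A, <<"*">>}, AStar, 0},
                              {{X, root, <<"#">>}, Hash, 0}]),
    {match(Edges, X, [<<"a">>, <<"b">>]) =:= lists:usort([AB, AStar, Hash]),
     match(Edges, X, [<<"x">>, <<"y">>]) =:= [Hash]}.
```

Because "#" can consume any number of words, the sketch retries every suffix of the routing key below a "#" edge, so the three-probes-per-level bound applies to each individual step rather than to the whole walk.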
The ordered_set for bindings provides:
- O(log N) insert and delete per binding (no read-modify-write)
- The binding key (needed for MQTT subscription identifiers and topic
  aliases) is part of the key, so it is returned directly during
  destination collection without additional lookups
Collecting destinations at a matched trie leaf uses a hybrid strategy:
- Fanout 0-2 (the common case: unicast, device + stream): up to 3
  ets:next/2 probes. Each ets:next/2 call costs O(log N) because the
  CATree (used with read_concurrency) allocates a fresh tree traversal
  stack on each call.
- Fanout > 2: ets:select/2 with a partially bound key does an O(log N)
  seek followed by an O(F) range scan. The match spec compilation
  overhead amortises over the larger result set.
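One way to picture the hybrid strategy is the sketch below. It is illustrative only: the module, `collect/2`, `probe/5` and `select_all/2` are hypothetical names, destinations are plain atoms or integers for brevity, and the start key `{NodeId, 0, 0}` relies on the assumption that binding keys are binaries (numbers sort before binaries in Erlang term order, so the start key precedes every real key of that node).

```erlang
-module(leaf_collect_sketch).
-export([collect/2, demo/0]).

%% Returns [{Dest, BindingKey}] for all bindings stored under NodeId.
collect(Leaves, NodeId) ->
    probe(Leaves, NodeId, {NodeId, 0, 0}, 3, []).

%% Up to three ets:next/2 probes. On an ordered_set, ets:next/2 returns the
%% next key in term order even if the given key is not in the table.
probe(Leaves, NodeId, Prev, Left, Acc) ->
    case ets:next(Leaves, Prev) of
        {NodeId, BKey, Dest} = Key when Left > 1 ->
            probe(Leaves, NodeId, Key, Left - 1, [{Dest, BKey} | Acc]);
        {NodeId, _, _} ->
            %% Third hit: fanout > 2, switch to a single range scan.
            select_all(Leaves, NodeId);
        _OtherNodeOrEnd ->
            lists:reverse(Acc)
    end.

%% Partially bound key: NodeId is fixed, BindingKey and Dest are captured.
select_all(Leaves, NodeId) ->
    ets:select(Leaves, [{{{NodeId, '$1', '$2'}}, [], [{{'$2', '$1'}}]}]).

demo() ->
    Leaves = ets:new(leaf_bindings_sketch,
                     [ordered_set, public, {read_concurrency, true}]),
    Small = make_ref(),
    Large = make_ref(),
    true = ets:insert(Leaves, [{{Small, <<"a.b">>, q1}},
                               {{Small, <<"a.b">>, q2}}]),
    true = ets:insert(Leaves, [{{Large, <<"x.#">>, N}} || N <- lists:seq(1, 5)]),
    {collect(Leaves, Small),
     length(collect(Leaves, Large)),
     collect(Leaves, make_ref())}.
```

In `demo/0`, the node with two bindings is served by the ets:next/2 probes, the node with five bindings falls through to ets:select/2, and an unknown node yields an empty list after a single probe.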
ets:lookup_element/4 (OTP 26+) returns a default value on miss
instead of throwing badarg, and copies only the requested element
on hit. This avoids both exception overhead (misses are common during
trie traversal of <<"*">> and <<"#">> branches) and unnecessary data
copying (we only need the ChildNodeId, not the full row).
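The difference between the two arities can be seen in a small, self-contained example. The table name and the row contents are made up for illustration; only the ets calls themselves are real, and ets:lookup_element/4 requires OTP 26 or later.

```erlang
-module(lookup_element_sketch).
-export([demo/0]).

demo() ->
    T = ets:new(edges_sketch, [set, {read_concurrency, true}]),
    Child = make_ref(),
    Key = {{<<"/">>, <<"amq.topic">>}, root, <<"a">>},
    MissingKey = setelement(3, Key, <<"*">>),
    true = ets:insert(T, {Key, Child, 0}),
    %% Hit: only element 2 (the ChildNodeId) is copied out of the table.
    Hit = ets:lookup_element(T, Key, 2, undefined),
    %% Miss with /4: the default is returned, no exception is raised.
    Miss = ets:lookup_element(T, MissingKey, 2, undefined),
    %% Miss with /3: badarg is raised and has to be caught.
    Old = try
              ets:lookup_element(T, MissingKey, 2)
          catch
              error:badarg -> badarg
          end,
    {Hit =:= Child, Miss, Old}.
```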
Trie node IDs are ephemeral (the tables are rebuilt when the Khepri
projection is re-registered). make_ref() is fast, globally unique
within a node, and has good hash distribution for the ETS set table.
When a binding is deleted, the trie path from root to leaf is collected
in a single downward walk (trie_follow_down_get_path). Empty nodes are
then pruned bottom-up: a node is empty when its ChildCount is 0 and it
has no bindings in the ordered_set table.
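The delete-and-prune procedure described above might look roughly like this. It is a sketch under assumptions, not the commit's trie_follow_down_get_path: all function names here are hypothetical, ChildCount is assumed to sit in element 3 of the edge row that leads to a node, and binding keys are assumed to be binaries so that `{NodeId, 0, 0}` sorts before every real key of that node.

```erlang
-module(trie_prune_sketch).
-export([delete_binding/5, demo/0]).

delete_binding({Edges, Leaves}, XSrc, Words, BindingKey, Dest) ->
    case follow_down(Edges, XSrc, root, Words, []) of
        {ok, Leaf, Path} ->
            true = ets:delete(Leaves, {Leaf, BindingKey, Dest}),
            prune(Edges, Leaves, Path);
        not_found ->
            ok
    end.

%% Single downward walk; Path is collected leaf-first as [{EdgeKey, ChildNodeId}].
follow_down(_Edges, _XSrc, Node, [], Path) ->
    {ok, Node, Path};
follow_down(Edges, XSrc, Node, [Word | Rest], Path) ->
    EdgeKey = {XSrc, Node, Word},
    case ets:lookup_element(Edges, EdgeKey, 2, undefined) of
        undefined -> not_found;
        Child -> follow_down(Edges, XSrc, Child, Rest, [{EdgeKey, Child} | Path])
    end.

%% Bottom-up pruning: stop at the first node that is still in use.
prune(_Edges, _Leaves, []) ->
    ok;
prune(Edges, Leaves, [{EdgeKey, Child} | Up]) ->
    ChildCount = ets:lookup_element(Edges, EdgeKey, 3),
    case ChildCount =:= 0 andalso not has_bindings(Leaves, Child) of
        true ->
            true = ets:delete(Edges, EdgeKey),
            dec_parent(Edges, Up),
            prune(Edges, Leaves, Up);
        false ->
            ok
    end.

%% The root has no incoming edge row, so there is nothing to decrement.
dec_parent(_Edges, []) ->
    ok;
dec_parent(Edges, [{ParentEdge, _} | _]) ->
    _ = ets:update_counter(Edges, ParentEdge, {3, -1}),
    ok.

has_bindings(Leaves, NodeId) ->
    case ets:next(Leaves, {NodeId, 0, 0}) of
        {NodeId, _, _} -> true;
        _ -> false
    end.

demo() ->
    Edges = ets:new(trie_edges_sketch, [set, public]),
    Leaves = ets:new(leaf_bindings_sketch, [ordered_set, public]),
    X = {<<"/">>, <<"amq.topic">>},
    [A, AB, AC] = [make_ref(), make_ref(), make_ref()],
    true = ets:insert(Edges, [{{X, root, <<"a">>}, A, 2},
                              {{X, A, <<"b">>}, AB, 0},
                              {{X, A, <<"c">>}, AC, 0}]),
    true = ets:insert(Leaves, [{{AB, <<"a.b">>, q1}}, {{AC, <<"a.c">>, q2}}]),
    Tabs = {Edges, Leaves},
    ok = delete_binding(Tabs, X, [<<"a">>, <<"b">>], <<"a.b">>, q1),
    AfterFirst = {ets:info(Edges, size),
                  ets:lookup_element(Edges, {X, root, <<"a">>}, 3)},
    ok = delete_binding(Tabs, X, [<<"a">>, <<"c">>], <<"a.c">>, q2),
    {AfterFirst, ets:info(Edges, size), ets:info(Leaves, size)}.
```

In `demo/0`, removing the "a.b" binding prunes only the a.b edge, because the "a" node still has one child; removing "a.c" afterwards empties both tables.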
Benchmarks below were run with 500K routing operations per scenario
on the same machine, back-to-back between main (v3) and this commit.
Significant insert/delete improvements (v4 vs v3):
Churn insert (8K bindings, 4 filters/client): ~1,120 vs ~810 ops/s (+38%)
v3 did a read-modify-write of sets:set() per binding; v4 does
a single ets:insert into the ordered_set plus trie edge updates.
MQTT device insert (20K bindings): ~650 vs ~420 ops/s (+55%)
Same mechanism as churn insert. Particularly impactful when many
clients share the same wildcard filter (e.g. "broadcast.#"),
since v3's sets:set() grew with each client while v4 inserts
are O(log N) regardless.
Same-key fanout insert (10K): ~415 vs ~290 ops/s (+43%)
The worst case for v3: all 10K bindings share the same key,
so each insert copies and rebuilds the growing sets:set().
Routing improvements (v4 vs v3):
MQTT unicast (10K devices, 20K bindings): ~460K vs ~250K ops/s (+80%)
Each route matches 1 queue among 10K unique exact keys plus
10K queues sharing "broadcast.#". v3 stored bindings in the same
ETS row as the trie edge, so every trie lookup copied the entire
sets:set(). v4 separates trie edges (small rows, set table) from
bindings (ordered_set), so the trie walk copies only references.
Large fanout (10K queues, same key): ~3,100 vs ~1,170 ops/s (+165%)
v3 copied a 10K-element sets:set() out of ETS in a single
ets:lookup, then called sets:to_list/1. v4 uses ets:select/2
with a partially bound key, which does an O(log N) seek and
then an efficient O(F) range scan without intermediate set
conversion.
MQTT broadcast (10K fanout): ~0.6 vs ~0.9 ms/route (lower is better; +50% throughput)
Same mechanism as above.
Scenarios with no significant change (within benchmark noise):
Exact match, wildcard *, wildcard #, mixed wildcards, and many
wildcard filters showed no clear difference. Both v3 and v4 use
a trie walk, so routing speed is comparable when the fanout is
small and the bottleneck is trie traversal rather than destination
collection.
1 parent 2bd6a02, commit 22464ce
6 files changed, +1112 -495 lines changed
File tree:
- deps/rabbit
- include
- src
- test