Commit 21bc549
[rust/rqd] Add OOM prevention logic with kill frame selection (#2064)
Implement a new OOM (Out-Of-Memory) prevention module that proactively
kills frames when memory usage exceeds configured thresholds, preventing
the system-wide OOM killer from terminating the RQD process.
When memory usage exceeds `memory_oom_margin_percentage` (default 96%),
the system calculates a target memory level (5% below the threshold) and
selects frames to kill to reach that target safely.
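The trigger and target calculation described above can be sketched as follows. This is a minimal illustration, not the code from this commit: the function names are hypothetical, and it assumes "5% below the threshold" means five percentage points of total memory (96% becomes 91%), which the message does not state explicitly.

```rust
/// Hypothetical helper: true when used memory is strictly above the
/// configured margin (e.g. `memory_oom_margin_percentage` = 96).
fn exceeds_oom_margin(used_bytes: u64, total_bytes: u64, margin_percentage: u32) -> bool {
    // Widen to u128 so large byte counts cannot overflow when multiplied.
    (used_bytes as u128) * 100 > (total_bytes as u128) * (margin_percentage as u128)
}

/// Hypothetical helper: memory level to get back down to.
/// ASSUMPTION: the target sits 5 percentage points below the threshold.
fn target_memory_bytes(total_bytes: u64, margin_percentage: u32) -> u64 {
    let target_percentage = margin_percentage.saturating_sub(5) as u128;
    ((total_bytes as u128) * target_percentage / 100) as u64
}

/// Hypothetical helper: how much memory must be freed to reach the target.
fn memory_to_free(used_bytes: u64, total_bytes: u64, margin_percentage: u32) -> u64 {
    used_bytes.saturating_sub(target_memory_bytes(total_bytes, margin_percentage))
}
```

With a total of 1000 units and a 96% margin, usage of 970 triggers the logic, the target is 910, and 60 units need to be freed.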
Frame Selection Algorithm
-------------------------
Frames are sorted using a multi-criteria scoring system that balances
three factors:
1. **Memory Impact (weight: 10x)**
- Measures absolute memory savings: consumed_memory / total_memory_consumed
- Prioritizes frames that will free the most system memory
- Normalized to [0,1] range by dividing by the maximum memory impact
2. **Overboard Rate (weight: 7x)**
- Measures relative excess: (consumed - limit) / limit
- Targets frames most aggressively exceeding their soft limits
- A frame using 10GB with a 1GB limit (900% over) has a higher overboard
rate than one using 60GB with a 50GB limit (20% over)
- Normalized to [0,1] range by dividing by the maximum overboard rate
3. **Duration Rate (weight: 12x - highest)**
- Prefers killing more recent frames: (max_duration - frame_duration) / max_duration
- Minimizes wasted compute by preserving long-running frames
- Inverted scale: newer frames score higher
- Normalized to [0,1] range
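The three raw metrics listed above could be computed along these lines. The `FrameStats` struct, its field names, and the function names are hypothetical, chosen only for illustration. Clamping the overboard rate to zero for frames at or under their limit is also an assumption, since the message does not say how such frames are handled.

```rust
/// Hypothetical view of a running frame; not the struct used by RQD.
#[derive(Debug, Clone)]
struct FrameStats {
    consumed_memory: u64,
    soft_limit: u64,
    duration_secs: u64,
}

/// consumed_memory / total_memory_consumed
fn memory_impact(frame: &FrameStats, total_memory_consumed: u64) -> f64 {
    if total_memory_consumed == 0 {
        return 0.0;
    }
    frame.consumed_memory as f64 / total_memory_consumed as f64
}

/// (consumed - limit) / limit
/// ASSUMPTION: frames at or under their soft limit (or with no limit) score 0.
fn overboard_rate(frame: &FrameStats) -> f64 {
    if frame.soft_limit == 0 || frame.consumed_memory <= frame.soft_limit {
        return 0.0;
    }
    (frame.consumed_memory - frame.soft_limit) as f64 / frame.soft_limit as f64
}

/// (max_duration - frame_duration) / max_duration; newer frames score higher.
fn duration_rate(frame: &FrameStats, max_duration_secs: u64) -> f64 {
    if max_duration_secs == 0 {
        return 0.0;
    }
    max_duration_secs.saturating_sub(frame.duration_secs) as f64 / max_duration_secs as f64
}
```

For the example in the list, a frame at 10 units with a limit of 1 gets an overboard rate of 9.0 (900% over), while a frame at 60 units with a limit of 50 gets 0.2 (20% over).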
Each frame receives a composite score calculated as:
score = (memory_impact × 10) + (overboard_rate × 7) + (duration_rate × 12)
All metrics are normalized before weighting to ensure each factor
contributes meaningfully regardless of absolute values. This prevents a
frame with very high memory usage from completely dominating the score,
allowing the algorithm to balance all three criteria.
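A minimal sketch of the normalize-then-weight step follows. The type and function names are hypothetical. It assumes memory impact and overboard rate are divided by their respective maxima across all frames, and that the duration rate is used as-is because its formula already yields a value in [0,1]; the commit may normalize it differently.

```rust
const MEMORY_IMPACT_WEIGHT: f64 = 10.0;
const OVERBOARD_RATE_WEIGHT: f64 = 7.0;
const DURATION_RATE_WEIGHT: f64 = 12.0;

/// Hypothetical container for one frame's raw (un-normalized) metrics.
#[derive(Debug, Clone, Copy)]
struct RawMetrics {
    memory_impact: f64,
    overboard_rate: f64,
    duration_rate: f64,
}

/// Divide every value by the maximum so the results fall in [0,1].
/// If the maximum is zero, every normalized value is zero.
fn normalize_by_max(values: &[f64]) -> Vec<f64> {
    let max = values.iter().copied().fold(0.0_f64, f64::max);
    values
        .iter()
        .map(|v| if max > 0.0 { v / max } else { 0.0 })
        .collect()
}

/// Composite score per frame, in the same order as the input slice.
fn composite_scores(metrics: &[RawMetrics]) -> Vec<f64> {
    let impacts: Vec<f64> = metrics.iter().map(|m| m.memory_impact).collect();
    let overboards: Vec<f64> = metrics.iter().map(|m| m.overboard_rate).collect();
    let impacts = normalize_by_max(&impacts);
    let overboards = normalize_by_max(&overboards);

    metrics
        .iter()
        .enumerate()
        .map(|(i, m)| {
            impacts[i] * MEMORY_IMPACT_WEIGHT
                + overboards[i] * OVERBOARD_RATE_WEIGHT
                // ASSUMPTION: duration_rate is already in [0,1].
                + m.duration_rate * DURATION_RATE_WEIGHT
        })
        .collect()
}
```

Under these assumptions the highest possible score is 29 (10 + 7 + 12), reached by a frame that is simultaneously the largest consumer, the worst limit offender, and the newest.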
The algorithm sorts frames by descending score and kills them
iteratively until enough memory is freed to reach the target level. This
conservative approach avoids unnecessary termination while effectively
preventing OOM.
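The sort-and-stop loop can be sketched as below. `ScoredFrame` and `select_frames_to_kill` are hypothetical names, and the sketch only selects frames; actually terminating them is outside its scope.

```rust
/// Hypothetical pairing of a frame with its composite score.
#[derive(Debug, Clone, PartialEq)]
struct ScoredFrame {
    id: u32,
    consumed_memory: u64,
    score: f64,
}

/// Pick frames in descending score order until the memory they would
/// release covers `memory_to_free`. Returns the frames selected for killing.
fn select_frames_to_kill(mut frames: Vec<ScoredFrame>, memory_to_free: u64) -> Vec<ScoredFrame> {
    // Highest score first; total_cmp gives a total order even for NaN.
    frames.sort_by(|a, b| b.score.total_cmp(&a.score));

    let mut freed: u64 = 0;
    let mut selected = Vec::new();
    for frame in frames {
        // Stop as soon as enough memory has been accounted for.
        if freed >= memory_to_free {
            break;
        }
        freed = freed.saturating_add(frame.consumed_memory);
        selected.push(frame);
    }
    selected
}
```

Checking the freed total before each selection means no frame is chosen once the target is already met, which matches the stop-when-sufficient behavior described above.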
The weighting scheme reflects production priorities:
- Duration (12x): Preserve investment in long-running frames
- Memory Impact (10x): Maximize immediate memory relief
- Overboard Rate (7x): Discourage limit violations
Implementation includes comprehensive test coverage validating:
- Threshold triggering behavior
- Target memory calculation
- Single and multi-frame selection
- Each scoring criterion independently
- Normalized scoring with edge cases
- Stop-when-sufficient-memory-freed logic
1 parent 0bf7e2c commit 21bc549
File tree
5 files changed, +829 -44 lines changed
- rust/crates/rqd/src
  - config
  - frame
  - system