Commit 4a7eb59
[ENG-3089] Relax bbolt cache durability to fix goroutine pileup (#4221)
## Summary
- Skip `fdatasync` on every bbolt commit in the sdpcache layer by
setting `NoSync: true` and `NoFreelistSync: true`
- Removes `fdatasync` from bbolt's single-writer commit path, the
bottleneck that cascaded into 36K stuck goroutines under load (bbolt
serializes write transactions, so each commit held the write lock for
the duration of its fdatasync while ~930 pool workers queued behind it)
- Safe because sdpcache is a pure cache -- crash durability provides no
value
## Linear Ticket
- **Ticket**:
[ENG-3089](https://linear.app/overmind/issue/ENG-3089/relax-bbolt-cache-durability-to-fix-goroutine-pileup-under-load)
— Relax bbolt cache durability to fix goroutine pileup under load
- **Purpose**: Remove the fdatasync bottleneck that causes cascading
goroutine pileups when sources are under heavy query load
- **Priority**: Urgent
## Changes
Single file change in `go/sdpcache/bolt_cache.go`:
1. Added a package-level `cacheOpenOptions` variable with `NoSync: true`,
`NoFreelistSync: true`, and the existing `Timeout: 5 * time.Second`
2. Replaced all 7 inline `&bbolt.Options{Timeout: 5 * time.Second}`
arguments with `cacheOpenOptions`
All sdpcache tests pass. No API or behavioral change -- only the fsync
guarantee is relaxed, which is irrelevant for a cache.
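
A minimal sketch of what the shared options variable might look like, based only on the description above. The import path `go.etcd.io/bbolt`, the `openCacheDB` helper, and the `0o600` file mode are assumptions for illustration; the actual code in `go/sdpcache/bolt_cache.go` may differ.

```go
package sdpcache

import (
	"time"

	"go.etcd.io/bbolt" // assumed import path for bbolt
)

// cacheOpenOptions is shared by every bbolt.Open call in the cache layer.
var cacheOpenOptions = &bbolt.Options{
	Timeout:        5 * time.Second, // unchanged: max wait for the file lock
	NoSync:         true,            // skip fdatasync on every commit
	NoFreelistSync: true,            // do not write the freelist to disk on commit
}

// openCacheDB is a hypothetical helper showing how a call site would pass
// the shared options instead of an inline &bbolt.Options{...} literal.
func openCacheDB(path string) (*bbolt.DB, error) {
	return bbolt.Open(path, 0o600, cacheOpenOptions)
}
```

Centralizing the options in one variable keeps all open and reopen paths on the same durability settings, so a future change only has to be made in one place.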
Made with [Cursor](https://cursor.com)
<!-- CURSOR_SUMMARY -->
---
> [!NOTE]
> **Medium Risk**
> Disables bbolt fsync and freelist syncing for the cache DB, which can
improve throughput but increases the chance of cache corruption/loss
after crashes or power failures.
>
> **Overview**
> Introduces a shared `cacheOpenOptions` for all `bbolt.Open` calls in
`sdpcache` that sets `NoSync` and `NoFreelistSync` (keeping the existing
5s `Timeout`) to avoid per-commit `fdatasync` overhead.
>
> All cache DB open/reopen paths (startup, deletion recovery, and
compaction temp DB creation/reopen) are updated to use these relaxed
durability settings.
>
> <sup>Written by [Cursor
Bugbot](https://cursor.com/dashboard?tab=bugbot) for commit
60297db5a8b4e78b25a685c30ad8242f84ed17b1. This will update automatically
on new commits. Configure
[here](https://cursor.com/dashboard?tab=bugbot).</sup>
<!-- /CURSOR_SUMMARY -->
GitOrigin-RevId: 4de3a30393489e86e14f68b16c8bbd81b8f1f3da
1 parent 1594265 commit 4a7eb59
1 file changed
Lines changed: 17 additions & 9 deletions
| Hunk | Original lines | New lines | Added | Removed |
|---|---|---|---|---|
| 1 | 31-36 | 31-46 | 10 | 0 |
| 2 | 234-242 | 244-250 | 1 | 3 |
| 3 | 457-463 | 465-471 | 1 | 1 |
| 4 | 1341-1353 | 1349-1361 | 2 | 2 |
| 5 | 1364-1370 | 1372-1378 | 1 | 1 |
| 6 | 1395-1406 | 1403-1414 | 2 | 2 |