cache directory entries to avoid expensive filepath.Glob in index-related ops (#18939)
## Summary
- Added `MatchVersionedFile` to search pre-scanned directory entries
instead of per-file `filepath.Glob` calls
- Updated `Domain.OpenList` to accept `ScanDirsResult` struct instead of
individual arrays
- Pre-scans directory entries once upfront to avoid repeated filesystem
calls when opening dirty files
- `CaplinSnapshots.OpenList` now uses `snaptype.IdxFiles` instead of
`os.ReadDir` for efficiency
- Added test for `MatchVersionedFile` handling seg/idx with different
base names (blobsidecars.seg → blocksidecars.idx)
- `SnapshotRepo.openDirtyFiles` now uses `MatchVersionedFile` with
pre-scanned entries
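
The pre-scan idea in the list above can be sketched roughly as follows. This is a minimal illustration, not the actual `MatchVersionedFile` implementation: `scanDir`, `matchInEntries`, and the file names are hypothetical, and the real code may match versions differently. The directory is read once with `os.ReadDir`, and each lookup is then a `filepath.Match` over in-memory names rather than a new `filepath.Glob` call against the filesystem.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// scanDir lists a directory once and returns the plain file names.
// Hypothetical helper standing in for the pre-scan step.
func scanDir(dir string) ([]string, error) {
	entries, err := os.ReadDir(dir)
	if err != nil {
		return nil, err
	}
	names := make([]string, 0, len(entries))
	for _, e := range entries {
		if !e.IsDir() {
			names = append(names, e.Name())
		}
	}
	return names, nil
}

// matchInEntries returns the pre-scanned names that match pattern.
// Hypothetical helper: it touches no files, only the in-memory slice.
func matchInEntries(names []string, pattern string) ([]string, error) {
	var matched []string
	for _, name := range names {
		ok, err := filepath.Match(pattern, name)
		if err != nil {
			return nil, err
		}
		if ok {
			matched = append(matched, name)
		}
	}
	return matched, nil
}

func main() {
	dir, err := os.MkdirTemp("", "prescan")
	if err != nil {
		panic(err)
	}
	defer os.RemoveAll(dir)

	// Illustrative file names only; real snapshot names may differ.
	for _, name := range []string{
		"v1.0-accounts.0-64.kv",
		"v1.1-accounts.0-64.kv",
		"v1.0-storage.0-64.kv",
	} {
		if err := os.WriteFile(filepath.Join(dir, name), nil, 0o644); err != nil {
			panic(err)
		}
	}

	// One directory read, reused for every pattern below.
	names, err := scanDir(dir)
	if err != nil {
		panic(err)
	}
	for _, pattern := range []string{"v*-accounts.0-64.kv", "v*-storage.0-64.kv"} {
		matched, err := matchInEntries(names, pattern)
		if err != nil {
			panic(err)
		}
		fmt.Println(pattern, "->", matched)
	}
}
```

With N files to open, the glob-per-file approach reads the directory N times, while this approach reads it once and does the remaining work in memory.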
## Some numbers
Gnosis has a large number of caplin/block files, and offline commands and
Erigon startup had become slow because of repeated `filepath.Glob` calls.
Timings for an `integration stage_exec` run, before -> after:
```
real 1m38.648s -> 0m27.946s
user 2m39.027s -> 1m26.677s
sys 0m14.967s -> 0m2.525s
```
## In Future PRs
- [ ] `fileItemsWithMissedAccessors` in `db/state/dirty_files.go:777` -
uses `dir.FileExist` per accessor, could use pre-scanned entries instead
- [ ] If `FindFilesWithVersionsByPattern` can be fully replaced, move
supported version check into `MatchVersionedFile`
- [ ] Let `openFolder`/`openList` in InvertedIndex/Domain/History accept
`ScanDirsResult` directly
- [ ] Build missed accessors can use `MatchVersionedFile`
- [ ] Rename `BuildMissingIndices` to `BuildMissedAccessors` in
caplin/RoSnapshots for consistency with rest of codebase
- [ ] Snaptype operations: `Index.HasFile`, `SnapType.FileExist`,
`ParseFromFile` in `db/snaptype/type.go`
- [ ] Block types: body/tx path resolution in
`db/snaptype2/block_types.go`