perf: Reduce allocations in SCM hashing, glob preprocessing, and cache lookups (#11916)
## Summary
Reduce heap allocations and syscalls across the hot paths of `turbo
run`. These are mechanical changes — no behavioral differences, no new
APIs.
### Benchmarks (`--dry` runs, `--skip-infer`, 10 runs each, 5 warmup)
| Repo | Packages | Tasks | Before | After | Delta |
|------|----------|-------|--------|-------|-------|
| Large | ~1000 | 1690 | 2.179s ± 0.073s | 2.130s ± 0.049s | **1.02x faster** |
| Medium | ~120 | ~200 | 1.235s ± 0.098s | 1.216s ± 0.079s | **1.02x faster** |
| Small | ~5 | ~5 | 834.1ms ± 28.4ms | 816.8ms ± 22.8ms | **1.02x faster** |
### Changes
**SCM parsing** (`ls_tree.rs`, `status.rs`): `entry.hash.to_vec()` +
`String::from_utf8()` allocated twice per entry (intermediate `Vec<u8>`
then `String`). Now uses `str::from_utf8()` + `.to_owned()` for a single
allocation. BufReader buffer increased from 8KB to 64KB to reduce
`read()` syscalls on large git output.
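
A minimal sketch of the SCM parsing pattern, not the code from `ls_tree.rs` or `status.rs`. The helper names `hash_to_string` and `read_entries` and the NUL-separated input are assumptions for illustration. It validates bytes in place with `str::from_utf8()`, copies once with `.to_owned()`, and wraps the reader in a 64KB `BufReader`.

```rust
use std::io::{BufRead, BufReader, Read};

/// Hypothetical helper: turn a borrowed hash slice into an owned String.
/// `str::from_utf8` validates in place; `.to_owned()` is the only allocation.
fn hash_to_string(hash: &[u8]) -> Result<String, std::str::Utf8Error> {
    Ok(std::str::from_utf8(hash)?.to_owned())
}

/// Hypothetical helper: read NUL-terminated entries through a 64KB buffer
/// (the default BufReader capacity is 8KB), so large input needs fewer reads.
fn read_entries<R: Read>(reader: R) -> std::io::Result<Vec<String>> {
    let mut reader = BufReader::with_capacity(64 * 1024, reader);
    let mut entries = Vec::new();
    let mut buf = Vec::new(); // scratch buffer reused for every entry
    loop {
        buf.clear();
        if reader.read_until(b'\0', &mut buf)? == 0 {
            break; // end of input
        }
        if buf.last() == Some(&b'\0') {
            buf.pop(); // drop the terminator
        }
        let entry = std::str::from_utf8(&buf)
            .map_err(|e| std::io::Error::new(std::io::ErrorKind::InvalidData, e))?;
        entries.push(entry.to_owned());
    }
    Ok(entries)
}
```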
**File hashing** (`hash_object.rs`): `hashes.reserve(to_hash.len())`
pre-allocates the HashMap to avoid rehashing during insertion.
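
As a rough illustration (the function name and the placeholder "hash" value are assumptions, not the contents of `hash_object.rs`), reserving before the insert loop looks like this:

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for the hashing loop: the value stored here is a
/// placeholder, not a real git object hash.
fn hash_files(to_hash: &[&str]) -> HashMap<String, String> {
    let mut hashes = HashMap::new();
    // One up-front reservation instead of repeated grow-and-rehash steps.
    hashes.reserve(to_hash.len());
    for path in to_hash {
        hashes.insert((*path).to_owned(), format!("{:08x}", path.len()));
    }
    hashes
}
```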
**Package deps** (`package_deps.rs`): Input globs were cloned to
`String` just to iterate them. Now works with `&str` references and
reuses a single `String` buffer for the `"{package_path}/{glob}"` join
instead of allocating per-iteration via `.join("/")`. Capacity hints
added to `inclusions`, `exclusions`, `to_hash`, and `hashes`
Vecs/HashMaps. When include globs overlap with the git index (the
`$TURBO_DEFAULT$` + explicit inputs case), files already known from the
index are skipped instead of being re-hashed.
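
The two ideas in this paragraph can be sketched as follows. This is a simplified illustration under assumptions: `visit_joined` and `files_needing_hash` are hypothetical names, and the real code in `package_deps.rs` works with turborepo's own path types rather than plain strings.

```rust
use std::collections::HashMap;

/// Hypothetical helper: build "{package_path}/{glob}" for each glob in one
/// reused String, handing the result to `visit` by reference.
fn visit_joined<'a, F: FnMut(&str)>(
    package_path: &str,
    globs: impl IntoIterator<Item = &'a str>,
    mut visit: F,
) {
    let mut buf = String::new();
    for glob in globs {
        buf.clear(); // keeps the capacity, so later iterations rarely allocate
        buf.push_str(package_path);
        buf.push('/');
        buf.push_str(glob);
        visit(&buf);
    }
}

/// Hypothetical helper: keep only the matched files whose hash is not
/// already known from the git index (modelled here as path -> hash).
fn files_needing_hash<'a>(
    matched: &[&'a str],
    index: &HashMap<String, String>,
) -> Vec<&'a str> {
    matched
        .iter()
        .copied()
        .filter(|path| !index.contains_key(*path))
        .collect()
}
```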
**Task hashing** (`lib.rs`): Dependency hash deduplication now collects
`&str` references under the mutex lock instead of cloning each `String`
into the `HashSet`. Owned strings are only allocated after dedup,
halving allocations. Capacity hint added.
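
A sketch of the dedup-then-own pattern, assuming a simplified tracker shaped as `Mutex<HashMap<String, String>>` (task id to hash). The function name and data layout are illustrative and may differ from `lib.rs`.

```rust
use std::collections::{HashMap, HashSet};
use std::sync::Mutex;

/// Hypothetical helper: collect the unique hashes of `deps`, sorted.
fn dependency_hashes(
    task_hashes: &Mutex<HashMap<String, String>>,
    deps: &[&str],
) -> Vec<String> {
    let guard = task_hashes.lock().unwrap();
    // Dedup on borrowed &str while the lock is held: no String clones yet.
    let mut unique: HashSet<&str> = HashSet::with_capacity(deps.len());
    for dep in deps {
        if let Some(hash) = guard.get(*dep) {
            unique.insert(hash.as_str());
        }
    }
    // Allocate owned Strings only for the survivors of the dedup.
    let mut out: Vec<String> = unique.into_iter().map(str::to_owned).collect();
    drop(guard);
    out.sort();
    out
}
```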
**Glob preprocessing** (`globwalk/lib.rs`): Capacity hints on
include/exclude path Vecs. Exclude path processing avoids an unnecessary
`.to_string()` by using `Cow::into_owned()`, which is free when the Cow
is already owned.
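
For reference, a small sketch of how the standard library's `Cow` behaves here: `into_owned()` moves the existing `String` out of a `Cow::Owned` and copies only for `Cow::Borrowed`, whereas `.to_string()` always copies. The `collapse_slashes` normalizer is a hypothetical stand-in for the real glob preprocessing step.

```rust
use std::borrow::Cow;

/// Hypothetical normalizer: collapses runs of '/' and allocates only when
/// the input actually needs changing.
fn collapse_slashes(path: &str) -> Cow<'_, str> {
    if !path.contains("//") {
        return Cow::Borrowed(path);
    }
    let mut out = String::with_capacity(path.len());
    let mut prev_slash = false;
    for c in path.chars() {
        if c == '/' && prev_slash {
            continue; // skip the repeated separator
        }
        prev_slash = c == '/';
        out.push(c);
    }
    Cow::Owned(out)
}

/// Moves the String out when the Cow is Owned; copies when it is Borrowed.
fn owned_exclude(path: &str) -> String {
    collapse_slashes(path).into_owned()
}
```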
**Cache lookups** (`fs.rs`): `FSCache::exists()` was doing 3 `format!()`
+ `join_component()` allocations per call (called once per task). Now
reuses a single `String` buffer, truncating and re-appending suffixes
for `.tar`, `.tar.zst`, and `-meta.json`.
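
The truncate-and-append idea might look roughly like this. It is a sketch under assumptions: `probe` is a hypothetical name, paths are plain strings joined with `/`, and the existence check is injected as a closure so the example does not touch the filesystem.

```rust
/// Hypothetical helper: check the three artifact paths for `hash` using one
/// String buffer. Returns results for `.tar`, `.tar.zst`, `-meta.json`.
fn probe(cache_dir: &str, hash: &str, mut exists: impl FnMut(&str) -> bool) -> [bool; 3] {
    // Size the buffer for the longest suffix so it never has to grow.
    let mut buf =
        String::with_capacity(cache_dir.len() + 1 + hash.len() + "-meta.json".len());
    buf.push_str(cache_dir);
    buf.push('/');
    buf.push_str(hash);
    let base_len = buf.len();

    let mut found = [false; 3];
    for (i, suffix) in [".tar", ".tar.zst", "-meta.json"].iter().enumerate() {
        buf.truncate(base_len); // drop the previous suffix, keep the prefix
        buf.push_str(suffix);
        found[i] = exists(&buf);
    }
    found
}
```

Against a real directory, the closure could be `|p| std::path::Path::new(p).exists()`.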