Commit eba684f

perf: Replace libgit2 git status with gix-index for faster file hashing (#11950)
## Summary

Replaces `RepoGitIndex`'s libgit2-based `git ls-tree` + `git status` with a new code path that reads the `.git/index` file directly via `gix-index`. This eliminates the most expensive git operation in `turbo run` by combining two separate libgit2 calls into a single index read + parallel stat comparison.

## Results

**Profile data** (`RepoGitIndex::new`):

| Repo | libgit2 (before) | gix-index (after) | Improvement |
|---|---|---|---|
| Large (~500 packages, ~1700 tasks) | 397.8ms | 296.9ms | **-25%** |

**Wall-clock benchmarks** (hyperfine, `--dry --skip-infer`, 10+ warmup, 10+ runs):

| Repo | Speedup |
|---|---|
| Large (~500 packages) | **1.08-1.11x** |
| Medium (~120 packages) | **1.20-1.35x** |
| Small (~3 packages) | 1.00x |

Measured with `--profile` on three private repos of different sizes. All profiles taken on the same machine, same base commit, clean working trees.

The medium repo shows the biggest wall-clock improvement because git operations are a larger fraction of total run time. The large repo has a smaller relative improvement because other operations (engine build, lockfile parsing, globwalk) dominate.

## Why

`git_status_repo_root` (via libgit2's `repo.statuses()`) was the single most expensive operation in `turbo run`, consuming 30-70% of total profiled duration depending on repo size. It stat-checks every tracked file AND walks the entire working tree for untracked files in a single-threaded C call.
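The link between the git share of run time and the end-to-end speedup can be sketched with Amdahl's law. This is an illustrative calculation, not a measurement from the commit: `overall_speedup` is a hypothetical helper, and the inputs used in the note after the block are assumptions chosen for illustration, not figures for any specific repo.

```rust
/// Amdahl's-law estimate of end-to-end speedup (hypothetical helper).
///
/// `fraction`:  share of total run time spent in the optimized step (0.0..=1.0).
/// `reduction`: relative time saved within that step (0.25 means 25% less time).
/// Both are illustrative inputs, not values taken from the benchmarks.
pub fn overall_speedup(fraction: f64, reduction: f64) -> f64 {
    // New total time = (1 - fraction) + fraction * (1 - reduction)
    //                = 1 - fraction * reduction
    1.0 / (1.0 - fraction * reduction)
}
```

Under these assumptions, a 25% reduction in a step that takes 30% of the run gives roughly 1.08x overall, while the same reduction in a step that takes 70% of the run gives roughly 1.21x. The larger the git share, the larger the wall-clock gain.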
## What Changed

**New gix-index code path** (`repo_index.rs`):

- Reads `.git/index` via `gix-index` (mmap'd, ~2-5ms) to get every tracked file's blob OID and cached stat data
- Stats each tracked file in parallel via rayon, comparing filesystem stat against index stat using `gix_index::entry::Stat::matches()`
- Racy-git entries (mtime >= index timestamp) are deferred to per-package `hash_objects` instead of content-hashing inline, which avoids reading every file from disk on fresh checkouts
- Uses nanosecond timestamp precision (`use_nsec: true`) to reduce false racy entries on modern filesystems (APFS, ext4)
- Detects untracked files via the `ignore` crate's parallel walker (respects `.gitignore`)
- Falls back to the existing libgit2 path if gix-index fails

**Dependency changes:**

- Added `gix-index` as an optional dependency behind a `gix` feature flag (~27 new crates, all pure Rust)

**Optimizations applied:**

- Removed redundant sort of `ls_tree_hashes` (git index is already sorted, rayon preserves order)
- Deferred OID hex conversion: raw `ObjectId` carried through the parallel loop, hex string allocated only for clean entries
- Binary search on sorted vecs instead of `HashSet` for untracked file detection

**Test coverage:**

- 31 regression tests covering equivalence, edge cases (gitignore, symlinks, prefix boundaries, racy-git), and contract guarantees (sorted invariants, OID compatibility, determinism)
- Shared test utilities module (`test_utils.rs`)
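The classification and untracked-file steps listed above can be sketched with the standard library alone. This is a simplified illustration, not the code in `repo_index.rs`: `CachedStat`, `EntryState`, `classify`, and `untracked` are hypothetical names, the stat comparison is plain equality rather than `gix_index::entry::Stat::matches()`, and the parallel loop is omitted.

```rust
/// Hypothetical, simplified stand-in for the stat data cached in an index entry.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct CachedStat {
    pub mtime_secs: u64,
    pub mtime_nsecs: u32,
    pub size: u64,
}

/// Outcome of comparing one tracked file against its index entry.
#[derive(Debug, PartialEq, Eq)]
pub enum EntryState {
    /// Stat matches and predates the index write: the cached blob OID can be reused.
    Clean,
    /// Stat differs: the file content has to be hashed again.
    Modified,
    /// Stat matches, but mtime >= index timestamp, so the match cannot be trusted.
    /// In the commit's description such entries are deferred to a later hashing step.
    Racy,
}

/// Classify one entry. `index_written` is the (secs, nsecs) timestamp of the index.
pub fn classify(
    index_stat: CachedStat,
    fs_stat: CachedStat,
    index_written: (u64, u32),
) -> EntryState {
    if index_stat != fs_stat {
        return EntryState::Modified;
    }
    // Tuples compare lexicographically: seconds first, then nanoseconds.
    if (index_stat.mtime_secs, index_stat.mtime_nsecs) >= index_written {
        EntryState::Racy
    } else {
        EntryState::Clean
    }
}

/// Return walked paths that are absent from the sorted list of tracked paths.
/// `tracked_sorted` must be sorted bytewise, which is also how `str` orders.
pub fn untracked<'a>(tracked_sorted: &[&str], walked: &[&'a str]) -> Vec<&'a str> {
    walked
        .iter()
        .copied()
        .filter(|path| tracked_sorted.binary_search(path).is_err())
        .collect()
}
```

Binary search over an already sorted vector avoids building a `HashSet` of every tracked path, at the cost of an O(log n) lookup per walked file.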
1 parent 51cb58b commit eba684f

9 files changed: +1662 -200 lines changed

Cargo.lock

Lines changed: 496 additions & 96 deletions
Some generated files are not rendered by default.

Cargo.toml

Lines changed: 1 addition & 0 deletions
@@ -124,6 +124,7 @@ dunce = "1.0.3"
 either = "1.9.0"
 futures = "0.3.31"
 git2 = { version = "0.20.4", default-features = false }
+gix-index = { version = "0.47.0", default-features = false }
 hex = "0.4.3"
 httpmock = { version = "0.8.0", default-features = false }
 indicatif = "0.18.3"

crates/turborepo-auth/src/auth/mod.rs

Lines changed: 2 additions & 2 deletions
@@ -119,12 +119,12 @@ mod tests {
     // Mock the turborepo_dirs functions for testing
     fn create_mock_vercel_config_dir() -> AbsoluteSystemPathBuf {
         let tmp_dir = tempdir().expect("Failed to create temp dir");
-        AbsoluteSystemPathBuf::try_from(tmp_dir.into_path()).expect("Failed to create path")
+        AbsoluteSystemPathBuf::try_from(tmp_dir.keep()).expect("Failed to create path")
     }
 
     fn create_mock_turbo_config_dir() -> AbsoluteSystemPathBuf {
         let tmp_dir = tempdir().expect("Failed to create temp dir");
-        AbsoluteSystemPathBuf::try_from(tmp_dir.into_path()).expect("Failed to create path")
+        AbsoluteSystemPathBuf::try_from(tmp_dir.keep()).expect("Failed to create path")
     }
 
     fn setup_auth_file(

crates/turborepo-lib/Cargo.toml

Lines changed: 1 addition & 1 deletion
@@ -127,7 +127,7 @@ turborepo-profile-md = { workspace = true }
 turborepo-repository = { path = "../turborepo-repository" }
 turborepo-run-cache = { path = "../turborepo-run-cache" }
 turborepo-run-summary = { workspace = true }
-turborepo-scm = { workspace = true, features = ["git2"] }
+turborepo-scm = { workspace = true, features = ["git2", "gix"] }
 turborepo-scope = { path = "../turborepo-scope" }
 turborepo-shim = { workspace = true }
 turborepo-signals = { workspace = true }

crates/turborepo-scm/Cargo.toml

Lines changed: 3 additions & 0 deletions
@@ -12,6 +12,7 @@ workspace = true
 [dependencies]
 bstr = "1.4.0"
 git2 = { workspace = true, default-features = false, optional = true }
+gix-index = { workspace = true, default-features = false, optional = true }
 globwalk = { path = "../turborepo-globwalk" }
 hex = { workspace = true }
 ignore = "0.4.20"
@@ -31,8 +32,10 @@ which = { workspace = true }
 
 [dev-dependencies]
 git2 = { workspace = true, default-features = false }
+gix-index = { workspace = true, default-features = false }
 tempfile = { workspace = true }
 test-case = { workspace = true }
 
 [features]
 git2 = ["dep:git2"]
+gix = ["dep:gix-index"]
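
A feature flag like `gix` is typically paired with `#[cfg(feature = "...")]` gates in the crate's source. The sketch below shows one way a feature-gated fast path with a fallback could be arranged. It is an assumption about the general pattern, not the code in this commit: `IndexSource`, `try_gix_index`, `libgit2_index`, and `build_index` are hypothetical names, and the function bodies are placeholders.

```rust
/// Which backend produced the index (hypothetical type for illustration).
#[derive(Debug, PartialEq, Eq)]
pub enum IndexSource {
    Gix,
    Libgit2,
}

/// Fast path, compiled only when the `gix` feature is enabled.
/// Placeholder body: a real implementation would read `.git/index` here.
#[cfg(feature = "gix")]
fn try_gix_index() -> Result<IndexSource, String> {
    Ok(IndexSource::Gix)
}

/// Stub used when the crate is built without the `gix` feature.
#[cfg(not(feature = "gix"))]
fn try_gix_index() -> Result<IndexSource, String> {
    Err("built without the `gix` feature".to_string())
}

/// Existing slow path (placeholder body).
fn libgit2_index() -> IndexSource {
    IndexSource::Libgit2
}

/// Try the fast path first and fall back to libgit2 on any error.
pub fn build_index() -> IndexSource {
    try_gix_index().unwrap_or_else(|_err| libgit2_index())
}
```

With this arrangement, a build without the feature always takes the libgit2 path, and a build with it falls back only when the fast path returns an error.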
