perf: Replace libgit2 git status with gix-index for faster file hashing #11950

Merged
anthonyshew merged 5 commits into main from faster-and-faster
Feb 21, 2026

@anthonyshew commented Feb 21, 2026

Summary

Replaces RepoGitIndex's libgit2-based git ls-tree + git status with a new code path that reads the .git/index file directly via gix-index. This eliminates the most expensive git operation in turbo run by combining two separate libgit2 calls into a single index read + parallel stat comparison.

Results

Profile data (RepoGitIndex::new):

| Repo | libgit2 (before) | gix-index (after) | Improvement |
| --- | --- | --- | --- |
| Large (~1000 packages, ~1700 tasks) | 397.8ms | 296.9ms | -25% |

Wall-clock benchmarks (hyperfine, --dry --skip-infer, 10+ warmup, 10+ runs):

| Repo | Speedup |
| --- | --- |
| Large (~1000 packages) | 1.08-1.11x |
| Medium (~120 packages) | 1.20-1.35x |
| Small (~3 packages) | 1.00x |

Measured with --profile on three private repos of different sizes. All profiles taken on the same machine, same base commit, clean working trees.

The medium repo shows the biggest wall-clock improvement because git operations are a larger fraction of total run time. The large repo has a smaller relative improvement because other operations (engine build, lockfile parsing, globwalk) dominate.
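The relationship between the share of time spent in git operations and the overall speedup can be illustrated with Amdahl's law. The sketch below is not taken from the PR: `overall_speedup` is a hypothetical helper, and the fractions passed to it are illustrative values, not measurements. Only the 397.8ms and 296.9ms figures come from the profile row above, and applying that ratio to other repos is an assumption.

```rust
/// Overall speedup when a fraction `f` (0.0..=1.0) of the total run time
/// is made `s` times faster and the rest is unchanged (Amdahl's law).
fn overall_speedup(f: f64, s: f64) -> f64 {
    1.0 / ((1.0 - f) + f / s)
}

fn main() {
    // Speedup of the index step alone, from the profile row: 397.8ms -> 296.9ms.
    let s = 397.8 / 296.9;

    // Illustrative (not measured) fractions of wall-clock time spent in that step.
    for f in [0.1, 0.3, 0.7] {
        println!("fraction {f:.1}: overall speedup {:.2}x", overall_speedup(f, s));
    }
}
```

The larger the fraction, the closer the overall speedup gets to the speedup of the step itself, which is consistent with the medium repo benefiting most.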

Why

git_status_repo_root (via libgit2's repo.statuses()) was the single most expensive operation in turbo run, consuming 30-70% of total profiled duration depending on repo size. It stat-checks every tracked file AND walks the entire working tree for untracked files in a single-threaded C call.

What Changed

New gix-index code path (repo_index.rs):

  • Reads .git/index via gix-index (mmap'd, ~2-5ms) to get every tracked file's blob OID and cached stat data
  • Stats each tracked file in parallel via rayon, comparing filesystem stat against index stat using gix_index::entry::Stat::matches()
  • Racy-git entries (mtime >= index timestamp) are deferred to per-package hash_objects instead of being content-hashed inline, which avoids reading every file from disk on fresh checkouts
  • Uses nanosecond timestamp precision (use_nsec: true) to reduce false racy entries on modern filesystems (APFS, ext4)
  • Detects untracked files via the ignore crate's parallel walker (respects .gitignore)
  • Falls back to the existing libgit2 path if gix-index fails
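The stat comparison and racy-git handling described above can be sketched with the standard library alone. This is a simplified illustration, not the code in repo_index.rs: `EntryState`, `CachedStat`, `classify`, and `classify_all` are hypothetical names, only mtime and size are compared (the real `gix_index::entry::Stat::matches()` checks more fields), and scoped threads stand in for rayon.

```rust
use std::fs;
use std::path::{Path, PathBuf};
use std::thread;
use std::time::SystemTime;

/// Outcome of comparing one tracked file against its cached stat data.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum EntryState {
    Clean,    // stat matches and the entry is older than the index
    Modified, // size or mtime differs from the cached values
    Racy,     // stat matches, but mtime >= index timestamp, so it cannot be trusted
    Missing,  // file no longer exists on disk
}

/// Hypothetical stand-in for the stat data cached in an index entry.
#[derive(Debug, Clone, Copy)]
struct CachedStat {
    mtime: SystemTime,
    size: u64,
}

fn classify(path: &Path, cached: &CachedStat, index_written: SystemTime) -> EntryState {
    let meta = match fs::symlink_metadata(path) {
        Ok(meta) => meta,
        Err(_) => return EntryState::Missing,
    };
    // An unreadable mtime is treated as a mismatch so the file gets re-hashed.
    match meta.modified() {
        Ok(mtime) if mtime == cached.mtime && meta.len() == cached.size => {}
        _ => return EntryState::Modified,
    }
    // A file written at or after the index timestamp could have changed again
    // without its stat data changing, so a stat match proves nothing.
    if cached.mtime >= index_written {
        EntryState::Racy
    } else {
        EntryState::Clean
    }
}

/// Classifies all entries on scoped worker threads, preserving input order.
fn classify_all(entries: &[(PathBuf, CachedStat)], index_written: SystemTime) -> Vec<EntryState> {
    let workers = thread::available_parallelism().map(|n| n.get()).unwrap_or(1);
    let chunk_len = ((entries.len() + workers - 1) / workers).max(1);
    thread::scope(|scope| {
        let handles: Vec<_> = entries
            .chunks(chunk_len)
            .map(|chunk| {
                scope.spawn(move || {
                    chunk
                        .iter()
                        .map(|(path, cached)| classify(path, cached, index_written))
                        .collect::<Vec<_>>()
                })
            })
            .collect();
        // Joining in spawn order keeps results aligned with `entries`.
        handles
            .into_iter()
            .flat_map(|handle| handle.join().expect("stat worker panicked"))
            .collect()
    })
}

fn main() -> std::io::Result<()> {
    let path = std::env::temp_dir().join("gix_index_sketch_demo.txt");
    fs::write(&path, b"hello")?;
    let meta = fs::metadata(&path)?;
    let cached = CachedStat { mtime: meta.modified()?, size: meta.len() };
    // Pretend the index was written at the same instant as the file.
    let states = classify_all(&[(path.clone(), cached)], cached.mtime);
    println!("{states:?}");
    fs::remove_file(&path)?;
    Ok(())
}
```

In this sketch only `Clean` entries could reuse the blob OID stored in the index; `Racy` and `Modified` entries would need their content hashed later, which matches the deferral described in the list above.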

Dependency changes:

  • Added gix-index as an optional dependency behind a gix feature flag (~27 new crates, all pure Rust)

Optimizations applied:

  • Removed redundant sort of ls_tree_hashes (git index is already sorted, rayon preserves order)
  • Deferred OID hex conversion: the raw ObjectId is carried through the parallel loop, and the hex string is allocated only for clean entries
  • Binary search on sorted vecs instead of HashSet for untracked file detection
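The binary-search approach to untracked file detection can be sketched as follows. The function name `untracked` and the sample paths are hypothetical. The sketch assumes the tracked paths arrive sorted byte-wise, which is how git orders index entries and also how Rust compares `str` values.

```rust
/// Returns the walked paths that do not appear in `tracked`.
/// `tracked` must be sorted ascending (byte-wise); no hash set is built.
fn untracked<'a>(tracked: &[&str], walked: &[&'a str]) -> Vec<&'a str> {
    walked
        .iter()
        .copied()
        .filter(|path| tracked.binary_search(path).is_err())
        .collect()
}

fn main() {
    // Hypothetical data: tracked paths as they would come out of the index,
    // and paths reported by a working-tree walk.
    let tracked = ["a.txt", "pkg/b.rs", "pkg/c.rs"];
    let walked = ["a.txt", "new.txt", "pkg/c.rs", "pkg/d.rs"];
    println!("{:?}", untracked(&tracked, &walked));
}
```

Each lookup costs O(log n) comparisons against an already-sorted vector, so the tracked list needs neither a second sort nor a hash set allocation.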

Test coverage:

  • 31 regression tests covering equivalence, edge cases (gitignore, symlinks, prefix boundaries, racy-git), and contract guarantees (sorted invariants, OID compatibility, determinism)
  • Shared test utilities module (test_utils.rs)

@anthonyshew anthonyshew requested a review from a team as a code owner February 21, 2026 05:09
@anthonyshew anthonyshew requested review from tknickman and removed request for a team February 21, 2026 05:09
@anthonyshew anthonyshew changed the title perf: Replace libgit2 git status with gix-index for faster file hashing perf: Replace libgit2 git status with gix-index for faster file hashing Feb 21, 2026
github-actions bot commented Feb 21, 2026

Coverage Report

| Metric | Coverage |
| --- | --- |
| Lines | 75.08% |
| Functions | 46.78% |
| Branches | 0.00% |

@anthonyshew anthonyshew merged commit eba684f into main Feb 21, 2026
102 of 103 checks passed
@anthonyshew anthonyshew deleted the faster-and-faster branch February 21, 2026 18:33
github-actions bot added a commit that referenced this pull request Feb 21, 2026
## Release v2.8.11-canary.18

Versioned docs: https://v2-8-11-canary-18.turborepo.dev

### Changes

- release(turborepo): 2.8.11-canary.17 (#11949) (`51cb58b`)
- perf: Replace `libgit2` git status with `gix-index` for faster file
hashing (#11950) (`eba684f`)

---------

Co-authored-by: Turbobot <turbobot@vercel.com>