Skip to content

[pull] master from microsoft:master#102

Merged
pull[bot] merged 15 commits into
cgallred:masterfrom
microsoft:master
May 22, 2026
Merged

[pull] master from microsoft:master#102
pull[bot] merged 15 commits into
cgallred:masterfrom
microsoft:master

Conversation

@pull
Copy link
Copy Markdown

@pull pull Bot commented May 22, 2026

See Commits and Changes for more details.


Created by pull[bot] (v2.0.0-alpha.4)

Can you help keep this open source service alive? 💖 Please sponsor : )

erickulcyk and others added 15 commits May 11, 2026 14:13
Previously, FastFetch detected case-only renames (e.g. TestFolder -> testfolder)
but suppressed them with a warning, leaving the working directory with stale
casing. This change makes FastFetch propagate case renames to disk, matching
the behavior of regular git checkout.

Changes:
- DiffTreeResult: Add SourcePath property to carry old-cased path for
  directory case renames
- DiffHelper: Detect case-only renames and flow them through the pipeline
  instead of suppressing. Track case-renamed directories to properly filter
  child operations in FlushStagedQueues
- CheckoutStage: Perform two-step directory rename (old -> temp -> new) to
  work around Directory.Move() failing on case-only renames on Windows
- Update unit tests to verify SourcePath, delete counts, and directory
  operation filtering for case renames
- Update functional test to verify casing with ignoreCase: false

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Context:
EnqueueOperationsFromDiffTreeLine handles the case where adding a new
DiffTreeResult to stagedDirectoryOperations collides with an already-
staged entry under the case-insensitive DiffTreeByNameComparer. When
the collision happens, the code needs to find the existing staged
entry (by case-insensitive path equality) so it can capture the old
casing for a directory case rename.

Today the Modify/Add branch is the only collision branch that performs
that lookup, and the lookup is written inline as a foreach scan over
the staged set. A symmetric Delete-after-Add ordering also needs the
same lookup, so the inline pattern is about to be duplicated.

Justification:
Extracting the lookup before introducing the second caller keeps the
upcoming bugfix focused on the behavior change. The helper has a
single responsibility — find a staged DiffTreeResult by case-
insensitive TargetPath — which is meaningful on its own and easy to
test in isolation if needed.

Mutating an item already in the HashSet remains safe because
DiffTreeByNameComparer keys on TargetPath only; SourcePath (the field
the callers will mutate next) is not part of the hash or equality.

Implementation:
Adds private FindStagedDirectoryOperation(string targetPath) which
returns the first staged DiffTreeResult whose TargetPath matches under
GVFSPlatform.Instance.Constants.PathComparison, or null if none. The
existing inline foreach in the Modify/Add collision branch is replaced
with a call to this helper. No observable behavior change.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Context:
Branch eric-pr added handling for case-only directory renames in
FastFetch on Windows: when diff-tree reports a Delete and an Add for
the same case-insensitive path, DiffHelper collapses the pair into a
single Add carrying the old-cased path on SourcePath, and CheckoutStage
performs a two-step Directory.Move to apply the casing change on disk.

That logic lives in the Modify/Add collision branch of
EnqueueOperationsFromDiffTreeLine: when an Add arrives and finds a
Delete already staged for the same path, it captures the Delete's
TargetPath onto SourcePath. The symmetric Delete branch was a no-op
with a comment claiming "diff-tree outputs Deletes before Adds".

That assumption is wrong. git diff-tree compares tree entries by byte
order of the path name, so for a case-only rename the Add or Delete
can appear first depending on which casing sorts lower. For example
"GVFLT_*" -> "GVFlt_*" emits the Delete (uppercase 'L', 0x4C) before
the Add (lowercase 'l', 0x6C), but the reverse rename "GVFlt_*" ->
"GVFLT_*" emits the Add first. In the latter case the Delete branch
fires, the staged Add is left without a SourcePath, and CheckoutStage
falls back to plain Directory.CreateDirectory — the original-cased
directory is never renamed and remains on disk.

Justification:
Two approaches were considered: (a) buffer the entire diff-tree
output and sort it before parsing, forcing a stable ordering invariant;
(b) make the staging code order-independent so it handles either
arrival order correctly.

Option (a) regresses memory: FastFetch streams diff-tree line-by-line
today and is run against multi-million-file repos, so buffering the
whole output is meaningful. Option (a) also relies on a git emit-order
invariant that git does not document — making the C# layer robust is
the right place to defend against future diff-tree changes.

Option (b) is the smaller, more localized fix and is implemented here.

Implementation:
EnqueueOperationsFromDiffTreeLine's Delete-collision branch now mirrors
the Modify/Add branch: if a HashSet collision under
DiffTreeByNameComparer indicates an Add was already staged for the
same case-insensitive path, find that Add via
FindStagedDirectoryOperation and — when the existing TargetPath
differs from the incoming TargetPath under ordinal comparison —
annotate the staged Add's SourcePath with the old (incoming Delete's)
casing, and register the old-cased path in caseRenamedDirectoryDeletes
so FlushStagedQueues can filter children of the renamed directory.

The staged entry is mutated in place. This is safe because
DiffTreeByNameComparer hashes and compares only on TargetPath; the
SourcePath field is a runtime annotation that the comparer does not
see, so mutation does not corrupt the HashSet.

A new fixture caseChangeAddFirst.txt exercises the reverse-direction
rename ("GVFlt_*" -> "GVFLT_*") with the Add lines emitted before the
Delete lines, and the matching test ParsesCaseChangesWhenAddPrecedesDelete
asserts the staged directory operation carries the correct SourcePath
and that child operations are filtered exactly as in the Delete-first
case. Without this fix, the test fails because SourcePath is null.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Context:
CheckoutStage performs case-only directory renames via a two-step
Directory.Move: source -> temp (a sibling of the target named with
a random GUID suffix), then temp -> target. The two-step is required
on Windows because Directory.Move throws IOException when the source
and destination differ only in case.

If the first move succeeds but the second fails — for example because
an antivirus or indexer has opened a handle on the moved tree, the
target path has unexpected ACLs, or transient I/O fails the second
call — the existing catch handler logs the error, sets HasFailures,
and leaves the directory stranded at the temp path. A subsequent
FastFetch run sees no source at the original casing and falls through
to Directory.CreateDirectory on the target path. The end state is an
empty target directory next to a "*_caseRename_<guid>" directory
holding the user's actual content, with no automatic recovery.

Justification:
The cleanest recovery is to undo the half-applied rename: if the
second move throws, attempt to move the tree back from the temp path
to the original source casing. A subsequent run then sees the same
state as before the failed attempt and can retry deterministically.

The restoration attempt itself can fail (same underlying I/O issue
that broke the second move) so it is best-effort and swallows
exceptions, leaving the outer catch handler to log the original
failure. Compared to alternatives — staging the rename under .git/
instead of as a sibling, or running a cleanup pass for stranded temp
directories on startup — this is the smallest change that addresses
the data-loss path without introducing new state.

Implementation:
The two Directory.Move calls are wrapped so that a failure of the
second Move triggers a best-effort Directory.Move(tempPath,
trimmedSourcePath). Guarded by Directory.Exists checks so we never
clobber a partially-restored source. After the recovery attempt, the
original exception is re-thrown so the outer catch logs the actual
failure (not a synthetic restore-success). The trimmed source and
target paths are now hoisted to locals to keep the inner block
readable.

No behavior change on the success path or for non-rename Adds.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Context:
EnqueueFileAddOperation iterates stagedFileDeletes to decide whether
a staged delete with a matching path differs from the incoming add
only in case. The loop captures the matched delete path into a local
named existingDeletePath, but nothing after the loop reads that local
— only exactMatch is consumed.

Justification:
Dead state is misleading: a future reader expects an unused capture
to be the seed for a planned use (logging the old casing, returning
it from a helper) and may build new code around it. Removing the
local makes the loop's actual contract — set exactMatch, then break —
self-evident.

Implementation:
Drops the existingDeletePath declaration and the one assignment that
populated it. No behavior change; tests continue to pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Context:
DiffHelper accumulates diff results into internal staged sets
(stagedDirectoryOperations, stagedFileDeletes, filesAdded, and the
new caseRenamedDirectoryDeletes) and into four public output
collections: DirectoryOperations, FileDeleteOperations,
FileAddOperations, and RequiredBlobs.

FlushStagedQueues calls RequiredBlobs.CompleteAdding() at the end of
each diff, which permanently closes the BlockingCollection. A second
call to PerformDiff or ParseDiffFile on the same instance would also
flow values into the now-frozen internal staged sets, which were
never cleared between runs. caseRenamedDirectoryDeletes in particular
would carry forward old-cased paths from the first diff and
mis-filter children in the second.

Today every caller in the tree constructs a fresh DiffHelper per
diff, so the bug is latent. But the class never declared this
constraint, the field initializers and the BlockingCollection
together silently assume single-use, and a future caller could
re-enter the methods expecting a usable result.

Justification:
Two approaches were considered: (a) reset internal staged sets at
the start of each call and reinstantiate the output collections, or
(b) detect reuse and throw at the entry points.

Option (a) would invalidate any external code holding references to
the output collections (they are exposed as public properties and
consumed concurrently by other prefetch stages), so it would silently
change semantics for anyone observing the queues across the boundary.
Option (b) preserves the existing single-use intent, surfaces misuse
loudly and early instead of via an obscure BlockingCollection
exception deep in parsing, and requires no behavior change for
current callers.

Implementation:
Adds a diffPerformed bool, set at the start of EnsureSingleUse — the
new private helper called at the top of PerformDiff(string, string)
and ParseDiffFile. A second invocation throws InvalidOperationException
with a message that names the class and suggests constructing a new
instance. The one-arg PerformDiff overload delegates to the two-arg
overload and so is covered transitively.

A new unit test DiffHelperThrowsOnReuse calls ParseDiffFile twice on
the same instance and asserts the second call throws.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Context:
The internal HashSet introduced for case-rename support was named
caseRenamedDirectoryDeletes, but it does not store Delete operations
or paths of directories slated for deletion. It stores the old-cased
TargetPath of each directory whose Delete entry was collapsed into a
case-rename Add — paths that FlushStagedQueues uses to recognize and
filter out child operations that would otherwise look like they
belong to a deleted directory.

The mismatch between the name and the contents made the two read
sites — FlushStagedQueues's parent-path check and the Modify/Add and
Delete collision branches — harder to follow on first reading. The
field is also referenced in the comment chain of two new commits, so
readers chasing context bounce between "deletes" in the name and
"replaced by a rename" in the comments.

Justification:
Renaming clarifies what the set represents at every call site. The
new name "directoriesReplacedByCaseRename" reads as a description of
its contents and aligns with the comment that already explains it.
The XML-comment block above the field is also expanded so the
contract — *old-cased paths, used to suppress child operations* —
is documented in one place instead of being implied across three
unrelated branches.

Implementation:
Mechanical rename of all three references (the field declaration,
the UnionWith call in FlushStagedQueues, the Add calls in the
Modify/Add and Delete collision branches). No behavior change.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Context:
DiffTreeResult.SourcePath was introduced as a parser annotation:
DiffHelper sets it when it collapses a Delete+Add pair under the
case-insensitive comparer into a single case-rename Add, and
CheckoutStage reads it to perform the rename on disk. The setter
was declared public, which exposed a runtime-only field as part of
the type's external contract.

Public mutability invites future code outside DiffHelper to set the
field on operations the parser would never have annotated — for
example on a Delete or a Modify entry, or on a file operation — and
CheckoutStage's branches would then behave in ways that were never
validated. The XML doc previously said only "Used on case-
insensitive file systems to carry the old-cased path for directory
renames," which did not communicate the intended single-producer
contract.

Justification:
Restricting the setter to the assembly keeps the producer
(DiffHelper) able to write it while ruling out external producers.
The getter stays public because consumers in other assemblies
(CheckoutStage in FastFetch, the existing DiffHelperTests
assertions) need to read it. The single-producer rule is also
documented in an expanded XML comment that names the parser and
describes when the field is null versus non-null.

Implementation:
Changes "public string SourcePath { get; set; }" to
"public string SourcePath { get; internal set; }" and rewrites the
XML doc to describe the contract: who sets it, who reads it, when
it is null. No build-time consumer outside the assembly was writing
to it (only DiffHelper writes; CheckoutStage and the unit tests
read), so this is a non-breaking visibility tightening.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…fHelper

Context:
Case-rename detection needs both case-insensitive containment (do we
have *some* entry matching this path under PathComparer?) and
case-sensitive comparison (does the stored casing exactly match the
incoming path?). HashSet provides the first in O(1) but the second
required a foreach scan of the entire set because HashSet has no API
to retrieve the stored value from a case-insensitive lookup.

Three diff-tree code paths inherited that scan:

  - EnqueueFileAddOperation, scanning stagedFileDeletes when an add
    arrives whose path matches a staged delete case-insensitively
  - EnqueueFileDeleteOperation, scanning filesAdded for the same
    reason on the inverse direction
  - The Modify/Add and Delete collision branches in
    EnqueueOperationsFromDiffTreeLine, scanning stagedDirectoryOperations
    via FindStagedDirectoryOperation to recover the staged
    DiffTreeResult and its TargetPath casing

These scans are gated by a case-insensitive Contains check, so the
inner loop only runs when there is actually a collision — but on each
collision it walks every staged entry. The staged sets can reach
millions of paths on first checkouts of large repositories (the
Windows monorepo, Office, etc.), so a worst-case parse with many
collisions degrades from O(N) to O(N^2).

Justification:
Dictionary<string, T> keyed by GVFSPlatform.Instance.Constants.PathComparer
solves both needs at once: the comparer drives O(1) case-insensitive
key matching, and TryGetValue returns the originally stored value so
callers can compare its casing ordinally without iterating. The
storage shape lines up exactly with what the existing logic needs.

DiffTreeByNameComparer becomes unused after stagedDirectoryOperations
is rekeyed by TargetPath string, so it is removed in this commit
along with FindStagedDirectoryOperation (now a single TryGetValue
call). No public surface area changes.

Implementation:
  - filesAdded, stagedDirectoryOperations, and stagedFileDeletes are
    declared as Dictionary instances. The two string-valued
    dictionaries map a path under the case-insensitive comparer to
    its original casing; the directory-valued dictionary maps the
    case-insensitive path to its staged DiffTreeResult.
  - HashSet.Add returning false becomes Dictionary.TryAdd returning
    false at every call site. The Remove+Add idiom used to "replace"
    a staged entry becomes a single indexer assignment.
  - Two foreach scans collapse into TryGetValue + ordinal compare;
    the collision branches in EnqueueOperationsFromDiffTreeLine
    consult the dictionary directly via the indexer because the
    immediately preceding TryAdd has already proven the key is
    present.
  - FlushStagedQueues iterates .Values for both stagedDirectoryOperations
    and stagedFileDeletes.
  - The dead helper FindStagedDirectoryOperation and the
    DiffTreeByNameComparer inner class are deleted.

All tests continue to pass; no behavior change is intended beyond
the performance characteristic.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Context:
The original case-rename commit on this branch removed three
RelatedEvent(Warning, "CaseConflict", ...) calls from the diff-tree
parsing path. Those traces had two jobs: signaling that a case-only
rename had been observed, and signaling a true duplicate diff entry
for the same path. After removal, both outcomes were silent — case
renames now happened invisibly, and any genuine diff anomaly that
collapsed into a "case rename" was undetectable in logs.

Silent success is mildly annoying for an operator; silent anomaly is
a debuggability hole, because the same code path now handles two
distinct conditions and the original logs no longer distinguish them.

Justification:
The two conditions deserve different log levels: a case rename is
expected on case-insensitive filesystems and should be Informational
so the operator can count them per fetch without alarm; a true
duplicate (incoming and staged paths match ordinally) is rare and
suggests a malformed diff stream, so it should be Warning so it
surfaces in normal triage.

The original "CaseConflict" event name conflated both. Replacing it
with two distinct events ("CaseRename" and "DuplicateDiffEntry")
gives each its own grep target without breaking the surviving
ls-tree path's "CaseConflict" event (which keeps its original
meaning — two sibling entries in the same tree differing only in
case).

Implementation:
Two small static helpers — TraceCaseRename and TraceDuplicate —
build EventMetadata with the kind ("File" or "Directory"), the
operation, and the old/new paths or the conflicting path. Each
collision branch in EnqueueOperationsFromDiffTreeLine and the
case-rename branches in EnqueueFileAddOperation /
EnqueueFileDeleteOperation now call one of the helpers depending
on whether the comparison is ordinally equal or not.

No behavior change beyond observability: the staged data structures,
the FlushStagedQueues output, and the existing unit tests are
unaffected.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…yRename

Context:
HandleAllDirectoryOperations dequeues directory operations from the
diff and dispatches them by Operation kind. The Modify/Add arm grew
to a ~50-line inline body covering three subcases: a case-only
rename (two-step Directory.Move through a temp path, plus rollback
if the second move fails), a fallback for missing source (a parent
rename already moved this child), and the plain Add path.

That body sits inside a try/catch that also handles the plain Add,
and the catch block walks the same treeOp.SourcePath branches to
shape its EventMetadata. As the rename logic grew (rollback on
failure, separate trimmed-path locals, nested try/catch), the
overall switch arm became hard to scan: a reader has to mentally
peel four levels of nesting to find the simple "is this an Add or
a rename?" decision.

Justification:
The rename has a clear contract — given a DiffTreeResult with
SourcePath set, apply it on disk — that fits naturally into a
private method. Extracting it leaves the switch arm at one level of
nesting and keeps the catch block at the call site so its
EventMetadata shaping does not duplicate inside the helper.

The behavior of the existing branches is preserved exactly,
including the rollback-then-rethrow path so the outer catch still
sees the original exception (not a synthesized restore failure).

Implementation:
ApplyCaseOnlyDirectoryRename takes the DiffTreeResult and the
already-computed absoluteTargetPath. It returns early after a
Directory.CreateDirectory fallback when the source is missing, and
otherwise performs the trimmed-path two-step Move with the inner
rollback try/catch. The caller's switch arm now reads as:

    if (treeOp.SourcePath != null)
    {
        this.ApplyCaseOnlyDirectoryRename(treeOp, absoluteTargetPath);
    }
    else
    {
        Directory.CreateDirectory(absoluteTargetPath);
    }

No tests were added; all 796 existing tests continue to pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Context:
Case-rename detection lives in three places in DiffHelper:
EnqueueOperationsFromDiffTreeLine's directory branches,
EnqueueFileAddOperation, and EnqueueFileDeleteOperation. The
existing fixtures (caseChange.txt, caseChangeAddFirst.txt) exercise
only the directory branches — their test files all sit inside a
case-renamed directory, so the file-level case-rename paths are
indirectly suppressed by the parent-directory filtering in
FlushStagedQueues.

That leaves the standalone "rename foo.txt to FOO.txt with no
directory casing changes" path completely uncovered. Both
EnqueueFileAddOperation's case-rename branch and
EnqueueFileDeleteOperation's case-rename branch ship without a
test that exercises them end-to-end.

Justification:
A two-line fixture isolates the file-level path: one Delete entry
for the old casing, one Add entry for the new casing, no directory
operations. The resulting test pins the contract that DiffHelper
both stages the delete (old casing, so the on-disk file is removed)
and stages the add (new casing, so the replacement file is
written), and that neither is filtered out as a duplicate.

Implementation:
New fixture Data/fileCaseChange.txt with a Delete for "foo.txt"
followed by an Add for "FOO.txt" (the typical diff-tree emit order
for that direction since "F" sorts before "f" byte-wise). New unit
test ParsesFileOnlyCaseRename asserts RequiredBlobs has one entry,
FileAddOperations and FileDeleteOperations each have one entry, and
the casings flow through correctly: the delete carries "foo.txt"
and the add carries "FOO.txt". The test is gated to case-insensitive
filesystems because on a case-sensitive filesystem the two paths
are distinct files and no case-rename detection runs (the existing
ParsesCaseChangesAsRenames test covers the case-sensitive
behavior). The csproj is updated so the fixture is copied to the
test output directory.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Context:
The existing case-rename fixtures change only the top-level
directory's casing. Their children sit inside the renamed parent but
retain the parent's old casing in the diff. Nothing in the existing
test suite exercises a diff where both an outer directory and an
inner directory change casing in the same operation.

That scenario is reachable in production whenever a contributor
renames an entire path's casing (e.g., "Docs/API/" to "docs/api/")
and is the most interesting interaction between the case-rename
detection in EnqueueOperationsFromDiffTreeLine and the parent-path
filter in FlushStagedQueues: the inner rename's parent path is in
directoriesReplacedByCaseRename, so its Add must be suppressed
because the outer rename's Directory.Move already carries the child
on disk. If that suppression breaks, FastFetch would attempt to
CreateDirectory the inner path after the outer rename has moved
the entire subtree — at best a no-op, at worst a layered race.

Justification:
A six-line fixture (three Deletes, three Adds) is the smallest input
that captures the nested case where both renames need to be detected
*and* the inner one needs to be filtered. Asserting all of
FileAddOperations, FileDeleteOperations, TotalFileDeletes,
TotalDirectoryOperations, the enqueued DirectoryOperations count,
and the SourcePath/TargetPath on the surviving op pins the contract
end-to-end.

Implementation:
New Data/nestedCaseChange.txt with a "Outer/Inner/test" source tree
and an "outer/inner/test" target tree. New unit test
ParsesNestedCaseChanges asserts:

  - One blob is required (the file under the renamed subtree)
  - FileAddOperations carries the new-cased path
  - The file delete is filtered out by parent-path matching
  - TotalDirectoryOperations is 2 (both case-renames are staged)
  - DirectoryOperations.Count is 1 (only the outermost survives
    the parent-path filter)
  - The surviving directory op carries the old casing on SourcePath

The test is gated to case-insensitive filesystems because on a
case-sensitive filesystem the two paths are independent and no
case-rename detection runs. The csproj is updated so the fixture is
copied to the test output directory.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Address review of case-rename support: ordering, rollback, observability, performance, tests
Handle case-only file and directory renames in FastFetch on Windows
@pull pull Bot locked and limited conversation to collaborators May 22, 2026
@pull pull Bot added the ⤵️ pull label May 22, 2026
@pull pull Bot merged commit 3d08ce4 into cgallred:master May 22, 2026
1 check passed
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants