PATH WALK II: Add --path-walk option to 'git pack-objects' #1819

Open
wants to merge 13 commits into
base: master
4 changes: 4 additions & 0 deletions Documentation/config/feature.adoc
@@ -20,6 +20,10 @@ walking fewer objects.
+
* `pack.allowPackReuse=multi` may improve the time it takes to create a pack by
reusing objects from multiple packs instead of just one.
+
* `pack.usePathWalk` may speed up packfile creation and make the packfiles
significantly smaller in the presence of certain filename collisions with Git's
default name-hash.

feature.manyFiles::
Enable config options that optimize for repos with many files in the
8 changes: 8 additions & 0 deletions Documentation/config/pack.adoc
@@ -155,6 +155,14 @@ pack.useSparse::
commits contain certain types of direct renames. Default is
`true`.

pack.usePathWalk::
When true, Git will default to using the '--path-walk' option in
'git pack-objects' when the '--revs' option is present. This
algorithm groups objects by path to maximize the ability to
compute delta chains across historical versions of the same
object. This may disable other options, such as using bitmaps to
enumerate objects.

pack.preferBitmapTips::
When selecting which commits will receive bitmaps, prefer a
commit at the tip of any reference that is a suffix of any value
25 changes: 18 additions & 7 deletions Documentation/git-pack-objects.adoc
@@ -10,13 +10,13 @@ SYNOPSIS
--------
On the Git mailing list, Taylor Blau wrote (reply to this):

On Mon, Mar 10, 2025 at 01:50:45AM +0000, Derrick Stolee via GitGitGadget wrote:
> ---
>  Documentation/git-pack-objects.adoc | 14 +++++++-------
>  builtin/pack-objects.c              | 10 ++++++++--
>  t/t0450/adoc-help-mismatches        |  1 -
>  3 files changed, 15 insertions(+), 10 deletions(-)

Thanks for cleaning these up.

Thanks,
Taylor

[verse]
'git pack-objects' [-q | --progress | --all-progress] [--all-progress-implied]
- [--no-reuse-delta] [--delta-base-offset] [--non-empty]
- [--local] [--incremental] [--window=<n>] [--depth=<n>]
- [--revs [--unpacked | --all]] [--keep-pack=<pack-name>]
- [--cruft] [--cruft-expiration=<time>]
- [--stdout [--filter=<filter-spec>] | <base-name>]
- [--shallow] [--keep-true-parents] [--[no-]sparse]
- [--name-hash-version=<n>] < <object-list>
+ [--no-reuse-delta] [--delta-base-offset] [--non-empty]
+ [--local] [--incremental] [--window=<n>] [--depth=<n>]
+ [--revs [--unpacked | --all]] [--keep-pack=<pack-name>]
+ [--cruft] [--cruft-expiration=<time>]
+ [--stdout [--filter=<filter-spec>] | <base-name>]
+ [--shallow] [--keep-true-parents] [--[no-]sparse]
+ [--name-hash-version=<n>] [--path-walk] < <object-list>


DESCRIPTION
@@ -375,6 +375,17 @@ many different directories. At the moment, this version is not allowed
when writing reachability bitmap files with `--write-bitmap-index` and it
will be automatically changed to version `1`.

--path-walk::
By default, `git pack-objects` walks objects in an order that is
unrelated to the paths at which trees and blobs appear relative to
a commit's root tree. The `--path-walk` option enables a different
walking algorithm that organizes trees and blobs by path. This has
the potential to improve delta compression, especially in the
presence of filenames that cause collisions in Git's default
name-hash algorithm. Because it changes how objects are walked,
this option is not compatible with `--delta-islands`, `--shallow`,
or `--filter`.


DELTA ISLANDS
-------------
14 changes: 13 additions & 1 deletion Documentation/git-repack.adoc
@@ -11,7 +11,7 @@ SYNOPSIS
[verse]
'git repack' [-a] [-A] [-d] [-f] [-F] [-l] [-n] [-q] [-b] [-m]
[--window=<n>] [--depth=<n>] [--threads=<n>] [--keep-pack=<pack-name>]
- [--write-midx] [--name-hash-version=<n>]
+ [--write-midx] [--name-hash-version=<n>] [--path-walk]

DESCRIPTION
-----------
@@ -255,6 +255,18 @@ linkgit:git-multi-pack-index[1]).
Provide this argument to the underlying `git pack-objects` process.
See linkgit:git-pack-objects[1] for full details.

--path-walk::
This option passes the `--path-walk` option to the underlying
`git pack-objects` process (see linkgit:git-pack-objects[1]).
By default, `git pack-objects` walks objects in an order that is
unrelated to the paths at which trees and blobs appear relative to
a commit's root tree. The `--path-walk` option enables a different
walking algorithm that organizes trees and blobs by path. This has
the potential to improve delta compression, especially in the
presence of filenames that cause collisions in Git's default
name-hash algorithm. Because it changes how objects are walked,
this option is not compatible with `--delta-islands` or `--filter`.

CONFIGURATION
-------------
9 changes: 9 additions & 0 deletions Documentation/technical/api-path-walk.adoc
@@ -56,6 +56,14 @@ better off using the revision walk API instead.
the revision walk so that the walk emits commits marked with the
`UNINTERESTING` flag.

`edge_aggressive`::
For performance reasons, usually only the boundary commits are
explored to find UNINTERESTING objects. However, in the case of
shallow clones it can be helpful to mark all trees and blobs
reachable from UNINTERESTING tip commits as UNINTERESTING. This
matches the behavior of `--objects-edge-aggressive` in the
revision API.

`pl`::
This pattern list pointer allows focusing the path-walk search to
a set of patterns, only emitting paths that match the given
@@ -69,4 +77,5 @@ Examples

See example usages in:
`t/helper/test-path-walk.c`,
`builtin/pack-objects.c`,
`builtin/backfill.c`