Description
Environment
- ggshield version: 1.34.0
- Operating system (Linux, macOS, Windows): Windows
- Operating system version: Windows 11 Pro 23H2
- Python version: 3.12
Describe the bug
When the user is about to commit a merge with many files and long file names, ggshield secret scan pre-commit
crashes with [WinError 206] The filename or extension is too long
. This is because ggshield
is building and executing a git command that can be longer then the maximum path length on Windows, which is approximately 32,767 with "long paths" enabled.
Steps to reproduce:
You can use the following bash script in a Git shell on Windows to reproduce the error:
#!/usr/bin/env bash
cd $(mktemp -d)
git init
function commit_files() {
for i in $(seq -f "%0200g" 200)
do
echo $1 > $i.txt
done
git add .
git commit -m "msg" --no-verify
}
git checkout -b a
commit_files x
git checkout -b b
commit_files foo
git checkout a
commit_files bar
git merge b
git add .
ggshield secret scan pre-commit --verbose --debug
This script creates 200 files with file names that are a bit longer than 200 characters each, creates some changes for each of them across two branches, and finally merges the 200 changes from both branches. The kind of merge within this script is a bit artifical but demonstrates the issue that we observed during real merges.
Actual result:
Traceback (most recent call last):
File "C:\Users\User\pipx\venvs\ggshield\Lib\site-packages\ggshield\cmd\utils\common_decorators.py", line 18, in wrapper
return func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\User\pipx\venvs\ggshield\Lib\site-packages\ggshield\cmd\secret\scan\precommit.py", line 85, in precommit_cmd
commit = Commit.from_merge(ctx_obj.exclusion_regexes)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\User\pipx\venvs\ggshield\Lib\site-packages\ggshield\core\scan\commit.py", line 87, in from_merge
shas_in_merge_branch = dict(
^^^^^
File "C:\Users\User\pipx\venvs\ggshield\Lib\site-packages\ggshield\core\scan\commit_utils.py", line 370, in get_file_sha_in_ref
output = git(["ls-tree", "-z", ref] + files, cwd=cwd)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\User\pipx\venvs\ggshield\Lib\site-packages\ggshield\utils\git_shell.py", line 206, in git
result = subprocess.run(
^^^^^^^^^^^^^^^
File "C:\Users\User\AppData\Local\Programs\Python\Python312\Lib\subprocess.py", line 548, in run
with Popen(*popenargs, **kwargs) as process:
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\User\AppData\Local\Programs\Python\Python312\Lib\subprocess.py", line 1026, in __init__
self._execute_child(args, executable, preexec_fn, close_fds,
File "C:\Users\User\AppData\Local\Programs\Python\Python312\Lib\subprocess.py", line 1538, in _execute_child
hp, ht, pid, tid = _winapi.CreateProcess(executable, args,
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
FileNotFoundError: [WinError 206] The filename or extension is too long
20408:6608 [D] ggshield.__main__:56 scan exit_code=128
The full debug log can be found here: error.log
Expected result:
ggshield
scanning for secrets and not crashing.
Possible solutions:
One could start breaking the single ls-tree
command into multiple ones once the command grows too long (depending on platform and "long path" configuration). This might apply for other subprocess executions, too.