Bug Report
Description
With cache.type = copy, executing dvc add -R twice clears the executable flags from the added files. The first time, isexec is correctly added to the corresponding .dvc file; the second time, the files lose their executable flag and isexec is removed from the .dvc file.
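To make the two symptoms easy to see side by side, here is a minimal sketch of a helper that prints, for each data file in a directory, whether its executable bit is set and whether the neighbouring .dvc file records isexec: true. The function name report_exec is hypothetical (it is not part of DVC), and the check assumes the .dvc file sits next to the data file as `<name>.dvc`, as in the listing further down.

```shell
#!/bin/bash
# Hypothetical helper, not part of DVC: for every regular file in directory $1
# (skipping the *.dvc metadata files), print whether the file is executable
# and whether "<file>.dvc" contains the line "isexec: true".
report_exec() {
    local f bit recorded
    for f in "$1"/*; do
        case $f in *.dvc) continue ;; esac
        [ -f "$f" ] || continue
        bit=no
        recorded=no
        [ -x "$f" ] && bit=yes
        grep -q 'isexec: true' "$f.dvc" 2>/dev/null && recorded=yes
        printf '%s exec=%s isexec=%s\n' "${f##*/}" "$bit" "$recorded"
    done
}
```

Running report_exec dvcfiles after each dvc add -R dvcfiles should show whether the exec bit and the isexec entry disappear together.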
Reproduce
#!/bin/bash
# create a repo with two files
mkdir dvc-reprex
cd dvc-reprex
mkdir dvcfiles
cd dvcfiles
echo 123 > f1
echo 234 > f2
cd ..
# Set the two files to executable
chmod a+x dvcfiles/*
# Verify
ls -lap dvcfiles
# Init git and dvc repo with cache.type copy
git init
dvc init
dvc config cache.type copy
git add .dvc/config
git commit -m "setup dvc"
# add the dvcfiles dir
dvc add -R dvcfiles
git add dvcfiles
git commit -m "committed dvcfiles"
# Check that files are still executable:
ls -lap dvcfiles
# All is fine:
# total 20
# drwxr-xr-x 1 zzz zzz 52 Apr 22 15:11 ./
# drwxr-xr-x 1 zzz zzz 52 Apr 22 15:01 ../
# -rwxr--r-- 1 zzz zzz 4 Apr 22 14:59 f1
# -rw-r--r-- 1 zzz zzz 82 Apr 22 15:07 f1.dvc
# -rwxr--r-- 1 zzz zzz 4 Apr 22 15:00 f2
# -rw-r--r-- 1 zzz zzz 82 Apr 22 15:07 f2.dvc
# -rw-r--r-- 1 zzz zzz 8 Apr 22 15:07 .gitignore
# DVC is clean:
dvc status
# Data and pipelines are up to date.
# dvc add again, which should do nothing IMO:
dvc add -R dvcfiles
# However files are not executable now:
ls -lap dvcfiles
# total 20
# drwxr-xr-x 1 zzz zzz 52 Apr 22 15:11 ./
# drwxr-xr-x 1 zzz zzz 52 Apr 22 15:01 ../
# -rw-r--r-- 1 zzz zzz 4 Apr 22 14:59 f1
# -rw-r--r-- 1 zzz zzz 67 Apr 22 15:11 f1.dvc
# -rw-r--r-- 1 zzz zzz 4 Apr 22 15:00 f2
# -rw-r--r-- 1 zzz zzz 67 Apr 22 15:11 f2.dvc
# -rw-r--r-- 1 zzz zzz 8 Apr 22 15:07 .gitignore
git status
# On branch master
# Changes not staged for commit:
# (use "git add <file>..." to update what will be committed)
# (use "git restore <file>..." to discard changes in working directory)
# modified: dvcfiles/f1.dvc
# modified: dvcfiles/f2.dvc
git diff
# diff --git a/dvcfiles/f1.dvc b/dvcfiles/f1.dvc
# index 169c330..2d76ec6 100644
# --- a/dvcfiles/f1.dvc
# +++ b/dvcfiles/f1.dvc
# @@ -1,5 +1,4 @@
#  outs:
#  - md5: ba1f2511fc30423bdbb183fe33f3dd0f
#    size: 4
# -  isexec: true
#    path: f1
# diff --git a/dvcfiles/f2.dvc b/dvcfiles/f2.dvc
# index 22c3317..7c42519 100644
# --- a/dvcfiles/f2.dvc
# +++ b/dvcfiles/f2.dvc
# @@ -1,5 +1,4 @@
#  outs:
#  - md5: e42bb897d0afcdb1f1c46fb5e0c1ad22
#    size: 4
# -  isexec: true
#    path: f2
dvc version
# DVC version: 2.10.1 (pip)
# ---------------------------------
# Platform: Python 3.9.7 on Linux-5.10.0-11-amd64-x86_64-with-glibc2.31
# Supports:
# webhdfs (fsspec = 2022.3.0),
# http (aiohttp = 3.8.1, aiohttp-retry = 2.4.6),
# https (aiohttp = 3.8.1, aiohttp-retry = 2.4.6)
# Cache types: reflink, hardlink, symlink
# Cache directory: btrfs on /dev/sda1
# Caches: local
# Remotes: None
# Workspace directory: btrfs on /dev/sda1
# Repo: dvc, git
git --version
# git version 2.30.2
Expected
One of two things:
- either don't touch the exec flag
- or, if there's a good reason for it, clear it on the first dvc add
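The first expectation (don't touch the exec flag) could be turned into a regression check along these lines. This is only a sketch: list_exec and check_exec_preserved are hypothetical helper names, not DVC commands. The wrapper takes a directory plus an arbitrary command and reports whether the set of executable files changed while the command ran.

```shell
#!/bin/bash
# Hypothetical helper: print the executable regular files under directory $1,
# ignoring the *.dvc metadata files, in a stable order.
list_exec() {
    find "$1" -type f ! -name '*.dvc' -perm -u+x | sort
}

# Hypothetical helper: run the given command and return non-zero if the set
# of executable files under the directory differs before and after.
# Usage: check_exec_preserved DIR COMMAND [ARGS...]
check_exec_preserved() {
    local dir=$1 before after
    shift
    before=$(list_exec "$dir")
    "$@"
    after=$(list_exec "$dir")
    if [ "$before" != "$after" ]; then
        echo "exec flags changed under $dir" >&2
        return 1
    fi
    return 0
}
```

In the reproduction above, this would be called as check_exec_preserved dvcfiles dvc add -R dvcfiles, once after the first add and once after the second.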
Environment information
Debian 11, both btrfs and ext4
Output of dvc doctor:
$ dvc doctor
DVC version: 2.10.1 (pip)
---------------------------------
Platform: Python 3.9.7 on Linux-5.10.0-11-amd64-x86_64-with-glibc2.31
Supports:
webhdfs (fsspec = 2022.3.0),
http (aiohttp = 3.8.1, aiohttp-retry = 2.4.6),
https (aiohttp = 3.8.1, aiohttp-retry = 2.4.6)
Cache types: reflink, hardlink, symlink
Cache directory: btrfs on /dev/sda1
Caches: local
Remotes: None
Workspace directory: btrfs on /dev/sda1
Repo: dvc, git
Additional Information (if any):
- Tested on btrfs and on ext4
- I tried to see if this was a consequence of using dvc add -R. Note that when adding the entire directory (not using -R), the exec flag is cleared on the first dvc add!