add: Repeated dvc add clears executable flag from files (cache.type = copy) #7619

Open
@meowcat

Bug Report

Description

With cache.type = copy, executing dvc add -R twice clears the executable flags from the added files. The first time, isexec is correctly added to the corresponding .dvc file; the second time, the files lose their executable flag and isexec is removed from the .dvc file.
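To see the mismatch described above at a glance, a small helper along these lines could print, for each .dvc file, whether isexec is recorded and whether the tracked file is currently executable. This is only a sketch: report_exec_state is a hypothetical name, and it assumes the layout that dvc add -R produces in this report (one f.dvc next to each tracked file f).

```shell
#!/bin/bash
# Hypothetical helper, not part of DVC.
# Prints "<file>: isexec=<yes|no> executable=<yes|no>" for every *.dvc file
# in a directory. Assumes each .dvc file tracks exactly one file that has the
# same name minus the ".dvc" suffix.
report_exec_state() {
    local dir=$1 dvcfile target recorded actual
    for dvcfile in "$dir"/*.dvc; do
        [ -e "$dvcfile" ] || continue   # glob matched nothing
        target=${dvcfile%.dvc}
        # Is "isexec: true" recorded in the .dvc file?
        if grep -Eq '^[[:space:]]*isexec:[[:space:]]*true' "$dvcfile"; then
            recorded=yes
        else
            recorded=no
        fi
        # Is the tracked file executable for the current user?
        if [ -x "$target" ]; then
            actual=yes
        else
            actual=no
        fi
        echo "$target: isexec=$recorded executable=$actual"
    done
}
```

Calling report_exec_state dvcfiles after each dvc add in the script below would show when the two values change.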

Reproduce

#!/bin/bash

# create a repo with two files
mkdir dvc-reprex
cd dvc-reprex
mkdir dvcfiles
cd dvcfiles
echo 123 > f1
echo 234 > f2
cd ..
# Set the two files to executable
chmod a+x dvcfiles/*
# Verify
ls -lap dvcfiles

# Init git and dvc repo with cache.type copy
git init
dvc init
dvc config cache.type copy
git add .dvc/config
git commit -m "setup dvc"

# add the dvcfiles dir
dvc add -R dvcfiles
git add dvcfiles
git commit -m "committed dvcfiles"

# Check that files are still executable:
ls -lap dvcfiles
# All is fine:
# total 20
# drwxr-xr-x 1 zzz zzz 52 Apr 22 15:11 ./
# drwxr-xr-x 1 zzz zzz 52 Apr 22 15:01 ../
# -rwxr--r-- 1 zzz zzz  4 Apr 22 14:59 f1
# -rw-r--r-- 1 zzz zzz 82 Apr 22 15:07 f1.dvc
# -rwxr--r-- 1 zzz zzz  4 Apr 22 15:00 f2
# -rw-r--r-- 1 zzz zzz 82 Apr 22 15:07 f2.dvc
# -rw-r--r-- 1 zzz zzz  8 Apr 22 15:07 .gitignore

# DVC is clean:
dvc status
# Data and pipelines are up to date.

# dvc add again, which should do nothing IMO:
dvc add -R dvcfiles

# However files are not executable now:
ls -lap dvcfiles
# total 20
# drwxr-xr-x 1 zzz zzz 52 Apr 22 15:11 ./
# drwxr-xr-x 1 zzz zzz 52 Apr 22 15:01 ../
# -rw-r--r-- 1 zzz zzz  4 Apr 22 14:59 f1
# -rw-r--r-- 1 zzz zzz 67 Apr 22 15:11 f1.dvc
# -rw-r--r-- 1 zzz zzz  4 Apr 22 15:00 f2
# -rw-r--r-- 1 zzz zzz 67 Apr 22 15:11 f2.dvc
# -rw-r--r-- 1 zzz zzz  8 Apr 22 15:07 .gitignore

git status
# On branch master
# Changes not staged for commit:
#   (use "git add <file>..." to update what will be committed)
#   (use "git restore <file>..." to discard changes in working directory)
#         modified:   dvcfiles/f1.dvc
#         modified:   dvcfiles/f2.dvc
git diff
# diff --git a/dvcfiles/f1.dvc b/dvcfiles/f1.dvc
# index 169c330..2d76ec6 100644
# --- a/dvcfiles/f1.dvc
# +++ b/dvcfiles/f1.dvc
# @@ -1,5 +1,4 @@
#  outs:
#  - md5: ba1f2511fc30423bdbb183fe33f3dd0f
#    size: 4
# -  isexec: true
#    path: f1
# diff --git a/dvcfiles/f2.dvc b/dvcfiles/f2.dvc
# index 22c3317..7c42519 100644
# --- a/dvcfiles/f2.dvc
# +++ b/dvcfiles/f2.dvc
# @@ -1,5 +1,4 @@
#  outs:
#  - md5: e42bb897d0afcdb1f1c46fb5e0c1ad22
#    size: 4
# -  isexec: true
#    path: f2


dvc version
# DVC version: 2.10.1 (pip)
# ---------------------------------
# Platform: Python 3.9.7 on Linux-5.10.0-11-amd64-x86_64-with-glibc2.31
# Supports:
#         webhdfs (fsspec = 2022.3.0),
#         http (aiohttp = 3.8.1, aiohttp-retry = 2.4.6),
#         https (aiohttp = 3.8.1, aiohttp-retry = 2.4.6)
# Cache types: reflink, hardlink, symlink
# Cache directory: btrfs on /dev/sda1
# Caches: local
# Remotes: None
# Workspace directory: btrfs on /dev/sda1
# Repo: dvc, git

git --version
# git version 2.30.2

Expected

One of two things:

  • either don't touch the exec flag
  • or, if there's a good reason for it, clear it on the first dvc add
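The first expectation (a repeated dvc add leaves the exec flag alone) could be turned into a quick regression check. The sketch below is an assumption-laden illustration, not DVC code: list_exec_files and check_exec_preserved are hypothetical names, and "executable" here means only that the owner-executable bit is set.

```shell
#!/bin/bash
# Hypothetical helpers, not part of DVC.

# List regular files under a directory whose owner-executable bit is set.
list_exec_files() {
    find "$1" -type f -perm -100 | sort
}

# Run a command and report whether the set of executable files under <dir>
# is the same before and after.
# Usage: check_exec_preserved <dir> <command> [args...]
check_exec_preserved() {
    local dir=$1 before after
    shift
    before=$(list_exec_files "$dir")
    "$@"
    after=$(list_exec_files "$dir")
    if [ "$before" = "$after" ]; then
        echo "exec flags preserved"
    else
        echo "exec flags changed"
        return 1
    fi
}

# Example, run inside the repo from the reproduction script (requires dvc):
#   check_exec_preserved dvcfiles dvc add -R dvcfiles
```

If the behaviour reported here is present, the second dvc add -R in the reproduction script would be expected to make this check fail.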

Environment information

Debian 11, both btrfs and ext4

Output of dvc doctor:

$ dvc doctor
DVC version: 2.10.1 (pip)
---------------------------------
Platform: Python 3.9.7 on Linux-5.10.0-11-amd64-x86_64-with-glibc2.31
Supports:
        webhdfs (fsspec = 2022.3.0),
        http (aiohttp = 3.8.1, aiohttp-retry = 2.4.6),
        https (aiohttp = 3.8.1, aiohttp-retry = 2.4.6)
Cache types: reflink, hardlink, symlink
Cache directory: btrfs on /dev/sda1
Caches: local
Remotes: None
Workspace directory: btrfs on /dev/sda1
Repo: dvc, git

Additional Information (if any):

  • Tested on btrfs and on ext4
  • I tried to see if this was a consequence of using dvc add -R. Note that when adding the entire directory (not using -R), the exec flag is cleared on the first dvc add!

Metadata

Labels: A: data-management, bug, regression
