Bug Report
Issue name
import-url: does not set up and use the shared cache configured for the project.
Description
Importing an Azure file or folder with import-url uses the shared cache configured for the project only on the first import, and only if the --no-download flag is not present.
If the import is done without downloading (with the --no-download flag), or if the files that were initially downloaded and linked are later lost (fully or partially removed), subsequent attempts to download the files with dvc pull do not use the configured shared cache and instead copy the files into the local project folder.
Reproduce
dvc import-url --no-download --version-aware azure://azureTest/fileTest
dvc pull fileTest.dvc
Alternatively:
dvc import-url --version-aware azure://azureTest/fileTest
- Remove the downloaded file.
dvc pull fileTest.dvc
Expected
Files go to the cache location and a symlink is created in the project folder, like this:
4 lrwxrwxrwx 1 user user 86 Oct 1 09:57 tinytestvideo.mp4 -> /mnt/samba/Server/Project/DVC_CACHE/files/md5/95/1ea15426585e424e9f9dfd6e1e76d3
However, this only happens if the --no-download flag is not present AND only the first time the file/folder is imported.
If the --no-download flag is present, or the file is downloaded with dvc pull after the initial import, the file is copied into the local project folder instead of being stored in the shared cache and linked.
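To tell the two outcomes apart after each reproduction step, a small check like the following may help. It is only a sketch: the function name is_linked_into_cache is hypothetical, it does not use any DVC API, and it simply tests whether the workspace path is a symlink that resolves to somewhere inside the cache directory.

```python
import os


def is_linked_into_cache(workspace_path: str, cache_dir: str) -> bool:
    """Return True if workspace_path is a symlink whose target lies inside cache_dir.

    Hypothetical helper for this report; it is not part of DVC.
    """
    if not os.path.islink(workspace_path):
        # A regular file here means the data was copied into the workspace.
        return False
    target = os.path.realpath(workspace_path)
    cache_root = os.path.realpath(cache_dir)
    # commonpath avoids false positives such as /cache2 matching /cache.
    return os.path.commonpath([target, cache_root]) == cache_root
```

With the setup from this report, is_linked_into_cache("fileTest", "/mnt/samba/Server/Project/DVC_CACHE/") would be expected to return True after the first plain import and False in the failing cases described above.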
Environment information
Ubuntu 22.04.4 LTS server
The Git and DVC project is cloned into the user's personal (home) folder, with the cache configured as follows:
[cache]
dir = /mnt/samba/Server/Project/DVC_CACHE/
shared = group
type = "symlink,hardlink"
where dir is a network folder mounted with Samba.
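To double-check that the settings above are what is actually stored in the project config, the section can be read back with a short script. This is a sketch under assumptions: it uses Python's standard configparser rather than DVC's own config loader, CONFIG_TEXT is a copy of the snippet above rather than the real .dvc/config file, and read_cache_settings is a hypothetical helper name.

```python
import configparser

# Copy of the [cache] section quoted in this report (assumed to match .dvc/config).
CONFIG_TEXT = """
[cache]
dir = /mnt/samba/Server/Project/DVC_CACHE/
shared = group
type = "symlink,hardlink"
"""


def read_cache_settings(text: str) -> dict:
    """Parse the [cache] section and return its settings as a plain dict."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    cache = parser["cache"]
    return {
        "dir": cache.get("dir"),
        "shared": cache.get("shared"),
        # The link types are stored as one quoted, comma-separated string.
        "link_types": cache.get("type", fallback="").strip('"').split(","),
    }
```

Calling read_cache_settings(CONFIG_TEXT) lists symlink first among the link types, which matches the "Cache types: symlink" line in the dvc doctor output.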
Output of dvc doctor:
DVC version: 3.53.1 (pip)
Platform: Python 3.11.7 on Linux-6.8.0-40-generic-x86_64-with-glibc2.35
Subprojects:
dvc_data = 3.15.2
dvc_objects = 5.1.0
dvc_render = 1.0.2
dvc_task = 0.4.0
scmrepo = 3.3.7
Supports:
azure (adlfs = 2024.7.0, knack = 0.12.0, azure-identity = 1.17.1),
http (aiohttp = 3.9.3, aiohttp-retry = 2.8.3),
https (aiohttp = 3.9.3, aiohttp-retry = 2.8.3)
Config:
Global: /home/USER REDACTED/.config/dvc
System: /etc/xdg/dvc
Cache types: symlink
Cache directory: cifs on REDACTED
Caches: local
Remotes: azure, azure
Workspace directory: ext4 on /dev/nvme0n1p2
Repo: dvc, git
Repo.site_cache_dir: /var/tmp/dvc/repo/8d7d2888c4cd9c330c3311ea57232c70