Bug Report
Description
I'm using an asset imported with dvc import in a project. In a certain environment, I'm using url.insteadOf to replace the URL of the repo from which the asset is imported. In my particular case, I'm replacing an SSH URL with a path URL. However, the clone of that remote repo fails right here:
Lines 160 to 162 in 6ace5ed
The first call to Git.clone() succeeds, as the URL is properly replaced. However, in the call to fetch_all_exps, the value of url being provided is NOT the replaced one (the replaced one is what is stored in the cloned repo's config as the URL of the remote), so the fetch fails. Another potentially relevant section is scmrepo.git.backend.dulwich.iter_remote_refs().
It's possible this behavior should be handled by the upstream packages (scmrepo or dulwich), but I'm starting the discussion here.
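For anyone less familiar with url.<base>.insteadOf, here is a rough Python sketch of the rewrite git applies: any URL that starts with a configured insteadOf prefix has that prefix replaced by the base, and the longest matching prefix wins. The helper name apply_instead_of and the example SSH URL are made up for illustration; this is not code from DVC, scmrepo, or dulwich.

```python
from typing import Dict


def apply_instead_of(url: str, rules: Dict[str, str]) -> str:
    """Rewrite `url` the way git's url.<base>.insteadOf does.

    `rules` maps each insteadOf prefix to its replacement base.
    The longest matching prefix wins; a URL with no match is unchanged.
    (Hypothetical helper, for illustration only.)
    """
    best = ""
    for prefix in rules:
        if url.startswith(prefix) and len(prefix) > len(best):
            best = prefix
    if not best:
        return url
    return rules[best] + url[len(best):]


# Hypothetical SSH URL standing in for ${SSH_URL} from the report.
ssh_url = "git@example.com:org/project.git"
rules = {ssh_url: "/tmp/remote"}

# The clone step sees the rewritten URL and works without SSH credentials.
clone_url = apply_instead_of(ssh_url, rules)

# The failure described above corresponds to a later fetch being handed
# the original, un-rewritten value instead of `clone_url`.
fetch_url = ssh_url

print(clone_url)
print(fetch_url)
```

If the fetch step went through the same rewrite (or reused the URL that the clone resolved), both values would point at the local path.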
Reproduce
- dvc import an asset from any project using an SSH URL (or just make a dummy.dvc file)
- Clone that same repo to e.g. /tmp/remote
- git config --global url./tmp/remote.insteadOf ${SSH_URL}
- SSH_AUTH_SOCK= dvc pull asset.dvc

(SSH_AUTH_SOCK here is just an example if you're using ssh-agent. The point is to do this in an env without creds for SSH access.)
Expected
The pull should succeed!
Environment information
Output of dvc doctor:
$ dvc doctor
DVC version: 2.58.2 (pip)
-------------------------
Platform: Python 3.8.16 on Linux-6.2.6-76060206-generic-x86_64-with-glibc2.2.5
Subprojects:
    dvc_data = 0.51.0
    dvc_objects = 0.22.0
    dvc_render = 0.3.1
    dvc_task = 0.2.1
    scmrepo = 1.0.2
Supports:
    http (aiohttp = 3.8.4, aiohttp-retry = 2.8.3),
    https (aiohttp = 3.8.4, aiohttp-retry = 2.8.3),
    s3 (s3fs = 2023.3.0, boto3 = 1.24.59),
    ssh (sshfs = 2023.4.1)
Config:
    Global: /home/kernel/.config/dvc
    System: /etc/xdg/dvc
Cache types: <https://error.dvc.org/no-dvc-cache>
Caches: local
Remotes: s3, ssh
Workspace directory: overlay on overlay
Repo: dvc, git
Repo.site_cache_dir: /var/tmp/dvc/repo/a6c21da1aef04b4fdc4a48db8508fea3
Additional Information (if any):
The stack trace is very long, but I think I've pointed out the relevant sections above.
Related, but another way to solve my problem would be if one could use remote://${remote_name} as the repo.url in .dvc files, as then the URL could be overridden by DVC configs at the very beginning. Open to anything that solves this problem, including opening a PR myself with the preferred approach :)