Description
Hi everyone, following YODA principles, I regularly run into the following "issue": To keep datasets modular, I usually add them as subdatasets in an inputs directory while they also exist at the project-directory level. Here is an example: I have a BIDS DataLad dataset (bids) that I add as a subdataset to my fmriprep DataLad dataset:
myproject
.
├── fmriprep
│   ├── code
│   └── inputs
│       └── bids (5992f12) # same DataLad dataset as below
└── bids (5992f12)

Now when I run fMRIPrep, I give it `./fmriprep/inputs/bids` as the input path. But this requires running `datalad get` to actually retrieve the files of the BIDS dataset into that location. To speed this up, I usually configure a local DataLad sibling for `./fmriprep/inputs/bids` like this: `datalad siblings add -s local --url ../../../bids`. Then `datalad get` can retrieve the data from `local`. But then the full BIDS dataset exists in two locations, which takes up additional disk space. Of course, I could `datalad drop` the files again, but (and here comes the idea) maybe there is a way to adjust the path so that the data does not have to be retrieved and copied again, while still staying in line with the YODA principles.
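For reference, here is a rough sketch of the workflow I described, written out as a script. It is only an illustration under some assumptions: the temporary directory and the variable names (`project`, `sibling_url`, `resolved`, `expected`) are made up for this example, the skeleton consists of plain directories rather than real datasets, and passing `-s local` and `.` to `datalad get` is just one way to spell the retrieval step. The DataLad commands only run if `datalad` is installed and the directory really is a dataset.

```shell
#!/bin/sh
# Sketch only: a throwaway directory skeleton stands in for the real datasets.
set -eu

project=$(mktemp -d)   # hypothetical stand-in for myproject
mkdir -p "$project/bids" "$project/fmriprep/code" "$project/fmriprep/inputs/bids"

# The sibling URL is relative to the subdataset, so three levels up from
# fmriprep/inputs/bids is the project root, which holds the top-level bids.
cd "$project/fmriprep/inputs/bids"
sibling_url=../../../bids
resolved=$(cd "$sibling_url" && pwd -P)
expected=$(cd "$project/bids" && pwd -P)
echo "sibling resolves to: $resolved"

# Run the DataLad steps only where datalad is installed and this directory
# is an actual dataset (it is not in the throwaway skeleton above).
if command -v datalad >/dev/null 2>&1 && [ -d .datalad ]; then
    datalad siblings add -s local --url "$sibling_url"   # register the local copy
    datalad get -s local .                               # copy file content from it
    # ... run fMRIPrep with this directory as input here ...
    datalad drop .                                       # free the duplicated space again
fi
```

The get/drop pair is exactly the part I would like to avoid: between those two commands the file content exists twice on disk.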
I am not even sure whether this is something that can or should be handled on the DataLad side, but maybe you know other nice workarounds for this? Thanks!