Description
I have a file in my dataset that is coming from the CDS using my datalad-cds extension. This means it has a cds: type URL associated with it that the special remote of that extension uses.
The same file is available from JUDAC and I would like to add that as an additional source via SSH. So I tried out using uncurl for that:
git annex initremote uncurl type=external externaltype=uncurl encryption=none autoenable=true
git annex addurl --file=01/2023010100_sf.grb ssh://judac.fz-juelich.de/p/data1/slmet/met_data/ecmwf/era5/grib/2023/01/2023010100_sf.grb
The issue now is that when I datalad get the file, this happens:
$ pixi run datalad get -s uncurl grib/2023/01/2023010100_sf.grb
get(error): grib/2023/01/2023010100_sf.grb (file) [external special remote error: unsupported URL 'cds:v1-eyJkYXRhc2V0IjoicmVhbmFseXNpcy1lcmE1LWNvbXBsZXRlIiwic3ViLXNlbGVjdGlvbiI6eyJjbGFzcyI6ImVhIiwiZGF0ZSI6IjIwMjMtMDEtMDEiLCJleHB2ZXIiOiIxIiwiZm9ybWF0IjoiZ3JpYiIsImdyaWQiOiIuMy8uMyIsImxldnR5cGUiOiJzZmMiLCJwYXJhbSI6IjMxLjEyOC8zMy4xMjgvMzQuMTI4LzM1LjEyOC8zOS4xMjgvNDAuMTI4LzQxLjEyOC80Mi4xMjgvNTkuMTI4LzY2LjEyOC82Ny4xMjgvMTI5LjEyOC8xMzQuMTI4LzEzOS4xMjgvMTQxLjEyOC8xNTEuMTI4LzE1OS4xMjgvMTY0LjEyOC8xNjUuMTI4LzE2Ni4xMjgvMTY3LjEyOC8xNjguMTI4LzE3MC4xMjgvMTcyLjEyOC8xODMuMTI4LzE4Ni4xMjgvMTg3LjEyOC8xODguMTI4LzE5OC4xMjgvMjMyLjEyOC8yMzUuMTI4LzIzNi4xMjgvMjM4LjEyOCIsInN0cmVhbSI6Im9wZXIiLCJ0aW1lIjoiMDAiLCJ0eXBlIjoiYW4ifX0%3D'
external special remote error: unsupported URL 'cds:v1-eyJkYXRhc2V0IjoicmVhbmFseXNpcy1lcmE1LWNvbXBsZXRlIiwic3ViLXNlbGVjdGlvbiI6eyJjbGFzcyI6ImVhIiwiZGF0ZSI6IjIwMjMtMDEtMDEiLCJleHB2ZXIiOiIxIiwiZm9ybWF0IjoiZ3JpYiIsImdyaWQiOiIuMy8uMyIsImxldnR5cGUiOiJzZmMiLCJwYXJhbSI6IjMxLjEyOC8zMy4xMjgvMzQuMTI4LzM1LjEyOC8zOS4xMjgvNDAuMTI4LzQxLjEyOC80Mi4xMjgvNTkuMTI4LzY2LjEyOC82Ny4xMjgvMTI5LjEyOC8xMzQuMTI4LzEzOS4xMjgvMTQxLjEyOC8xNTEuMTI4LzE1OS4xMjgvMTY0LjEyOC8xNjUuMTI4LzE2Ni4xMjgvMTY3LjEyOC8xNjguMTI4LzE3MC4xMjgvMTcyLjEyOC8xODMuMTI4LzE4Ni4xMjgvMTg3LjEyOC8xODguMTI4LzE5OC4xMjgvMjMyLjEyOC8yMzUuMTI4LzIzNi4xMjgvMjM4LjEyOCIsInN0cmVhbSI6Im9wZXIiLCJ0aW1lIjoiMDAiLCJ0eXBlIjoiYW4ifX0%3D'
external special remote error: unsupported URL 'cds:v1-eyJkYXRhc2V0IjoicmVhbmFseXNpcy1lcmE1LWNvbXBsZXRlIiwic3ViLXNlbGVjdGlvbiI6eyJjbGFzcyI6ImVhIiwiZGF0ZSI6IjIwMjMtMDEtMDEiLCJleHB2ZXIiOiIxIiwiZm9ybWF0IjoiZ3JpYiIsImdyaWQiOiIuMy8uMyIsImxldnR5cGUiOiJzZmMiLCJwYXJhbSI6IjMxLjEyOC8zMy4xMjgvMzQuMTI4LzM1LjEyOC8zOS4xMjgvNDAuMTI4LzQxLjEyOC80Mi4xMjgvNTkuMTI4LzY2LjEyOC82Ny4xMjgvMTI5LjEyOC8xMzQuMTI4LzEzOS4xMjgvMTQxLjEyOC8xNTEuMTI4LzE1OS4xMjgvMTY0LjEyOC8xNjUuMTI4LzE2Ni4xMjgvMTY3LjEyOC8xNjguMTI4LzE3MC4xMjgvMTcyLjEyOC8xODMuMTI4LzE4Ni4xMjgvMTg3LjEyOC8xODguMTI4LzE5OC4xMjgvMjMyLjEyOC8yMzUuMTI4LzIzNi4xMjgvMjM4LjEyOCIsInN0cmVhbSI6Im9wZXIiLCJ0aW1lIjoiMDAiLCJ0eXBlIjoiYW4ifX0%3D']
action summary:
get (error: 1, notneeded: 1)
It seems that the uncurl special remote neither skips URLs it cannot handle nor moves on to the remaining URLs once it encounters one.
From skimming the code, it seems like the error originates here:
    raise ValueError(f'unsupported URL {url!r}')
And the special remote fails to continue with the remaining URLs here:
datalad-next/datalad_next/annexremotes/uncurl.py
Lines 526 to 541 in 25ac6ef
    for url in urls:
        try:
            handler(url)
            # we succeeded, no need to try again
            return True
        except UrlOperationsResourceUnknown:
            # general system access worked, but at the key location is nothing
            # to be found
            return False
        except UrlOperationsRemoteError as e:
            # return False only if we could be sure that the remote
            # system works properly and just the key is not around
            CapturedException(e)
            self.message(
                f'Failed to {action[0]} key {key!r} {action[1]} {url!r}',
                type='debug')
I think the loop should additionally catch this ValueError and continue with the next URL, until it either finds a working one or has none left to try.
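To illustrate the suggestion, here is a minimal, self-contained sketch of such a loop. It is not the actual datalad-next code: the two exception classes are local stand-ins named after the ones in the excerpt (their subclass relationship is an assumption), and `try_urls`, `fake_handler`, and the `log` parameter are hypothetical names used only for this example.

```python
class UrlOperationsRemoteError(Exception):
    """Local stand-in for the exception of the same name in the excerpt."""


class UrlOperationsResourceUnknown(UrlOperationsRemoteError):
    """Local stand-in; the subclass relationship is an assumption."""


def try_urls(urls, handler, log=print):
    """Try each URL in turn; return True on the first success.

    Hypothetical helper mirroring the loop in the excerpt, with one
    extra `except ValueError` branch that skips unsupported URLs.
    """
    for url in urls:
        try:
            handler(url)
            # we succeeded, no need to try again
            return True
        except ValueError as e:
            # the handler does not support this URL scheme; try the next URL
            log(f'Skipping {url!r}: {e}')
            continue
        except UrlOperationsResourceUnknown:
            # access worked, but nothing is at the key location
            return False
        except UrlOperationsRemoteError as e:
            # remote failure; record it and try the next URL
            log(f'Failed to access {url!r}: {e}')
    # no URL left to try
    return False


def fake_handler(url):
    """Hypothetical handler that only accepts ssh:// URLs."""
    if not url.startswith('ssh://'):
        raise ValueError(f'unsupported URL {url!r}')


if __name__ == '__main__':
    # the cds: URL is skipped, the ssh:// URL is then tried
    print(try_urls(['cds:v1-abc', 'ssh://example.org/file.grb'], fake_handler))
```

With this change, a key that has both a `cds:` URL and an `ssh://` URL would no longer fail on the first one. Whether `ValueError` is the right exception to catch, or whether the unsupported-URL case should get a dedicated exception type, is a design question for the maintainers.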