uncurl special remote fails when a key has a URL from a different special remote that it does not understand #770

@matrss

I have a file in my dataset that is coming from the CDS using my datalad-cds extension. This means it has a cds: type URL associated with it that the special remote of that extension uses.

The same file is available from JUDAC and I would like to add that as an additional source via SSH. So I tried out using uncurl for that:

git annex initremote uncurl type=external externaltype=uncurl encryption=none autoenable=true
git annex addurl --file=01/2023010100_sf.grb ssh://judac.fz-juelich.de/p/data1/slmet/met_data/ecmwf/era5/grib/2023/01/2023010100_sf.grb

The issue now is that when I datalad get the file, this happens:

$ pixi run datalad get -s uncurl grib/2023/01/2023010100_sf.grb
get(error): grib/2023/01/2023010100_sf.grb (file) [external special remote error: unsupported URL 'cds:v1-eyJkYXRhc2V0IjoicmVhbmFseXNpcy1lcmE1LWNvbXBsZXRlIiwic3ViLXNlbGVjdGlvbiI6eyJjbGFzcyI6ImVhIiwiZGF0ZSI6IjIwMjMtMDEtMDEiLCJleHB2ZXIiOiIxIiwiZm9ybWF0IjoiZ3JpYiIsImdyaWQiOiIuMy8uMyIsImxldnR5cGUiOiJzZmMiLCJwYXJhbSI6IjMxLjEyOC8zMy4xMjgvMzQuMTI4LzM1LjEyOC8zOS4xMjgvNDAuMTI4LzQxLjEyOC80Mi4xMjgvNTkuMTI4LzY2LjEyOC82Ny4xMjgvMTI5LjEyOC8xMzQuMTI4LzEzOS4xMjgvMTQxLjEyOC8xNTEuMTI4LzE1OS4xMjgvMTY0LjEyOC8xNjUuMTI4LzE2Ni4xMjgvMTY3LjEyOC8xNjguMTI4LzE3MC4xMjgvMTcyLjEyOC8xODMuMTI4LzE4Ni4xMjgvMTg3LjEyOC8xODguMTI4LzE5OC4xMjgvMjMyLjEyOC8yMzUuMTI4LzIzNi4xMjgvMjM4LjEyOCIsInN0cmVhbSI6Im9wZXIiLCJ0aW1lIjoiMDAiLCJ0eXBlIjoiYW4ifX0%3D'
external special remote error: unsupported URL 'cds:v1-eyJkYXRhc2V0IjoicmVhbmFseXNpcy1lcmE1LWNvbXBsZXRlIiwic3ViLXNlbGVjdGlvbiI6eyJjbGFzcyI6ImVhIiwiZGF0ZSI6IjIwMjMtMDEtMDEiLCJleHB2ZXIiOiIxIiwiZm9ybWF0IjoiZ3JpYiIsImdyaWQiOiIuMy8uMyIsImxldnR5cGUiOiJzZmMiLCJwYXJhbSI6IjMxLjEyOC8zMy4xMjgvMzQuMTI4LzM1LjEyOC8zOS4xMjgvNDAuMTI4LzQxLjEyOC80Mi4xMjgvNTkuMTI4LzY2LjEyOC82Ny4xMjgvMTI5LjEyOC8xMzQuMTI4LzEzOS4xMjgvMTQxLjEyOC8xNTEuMTI4LzE1OS4xMjgvMTY0LjEyOC8xNjUuMTI4LzE2Ni4xMjgvMTY3LjEyOC8xNjguMTI4LzE3MC4xMjgvMTcyLjEyOC8xODMuMTI4LzE4Ni4xMjgvMTg3LjEyOC8xODguMTI4LzE5OC4xMjgvMjMyLjEyOC8yMzUuMTI4LzIzNi4xMjgvMjM4LjEyOCIsInN0cmVhbSI6Im9wZXIiLCJ0aW1lIjoiMDAiLCJ0eXBlIjoiYW4ifX0%3D'
external special remote error: unsupported URL 'cds:v1-eyJkYXRhc2V0IjoicmVhbmFseXNpcy1lcmE1LWNvbXBsZXRlIiwic3ViLXNlbGVjdGlvbiI6eyJjbGFzcyI6ImVhIiwiZGF0ZSI6IjIwMjMtMDEtMDEiLCJleHB2ZXIiOiIxIiwiZm9ybWF0IjoiZ3JpYiIsImdyaWQiOiIuMy8uMyIsImxldnR5cGUiOiJzZmMiLCJwYXJhbSI6IjMxLjEyOC8zMy4xMjgvMzQuMTI4LzM1LjEyOC8zOS4xMjgvNDAuMTI4LzQxLjEyOC80Mi4xMjgvNTkuMTI4LzY2LjEyOC82Ny4xMjgvMTI5LjEyOC8xMzQuMTI4LzEzOS4xMjgvMTQxLjEyOC8xNTEuMTI4LzE1OS4xMjgvMTY0LjEyOC8xNjUuMTI4LzE2Ni4xMjgvMTY3LjEyOC8xNjguMTI4LzE3MC4xMjgvMTcyLjEyOC8xODMuMTI4LzE4Ni4xMjgvMTg3LjEyOC8xODguMTI4LzE5OC4xMjgvMjMyLjEyOC8yMzUuMTI4LzIzNi4xMjgvMjM4LjEyOCIsInN0cmVhbSI6Im9wZXIiLCJ0aW1lIjoiMDAiLCJ0eXBlIjoiYW4ifX0%3D']
action summary:
  get (error: 1, notneeded: 1)

It seems the uncurl special remote neither skips URLs it cannot handle nor moves on to the remaining URLs after it encounters one.

From skimming the code, it seems like the error originates here:

raise ValueError(f'unsupported URL {url!r}')

And the special remote fails to continue with the remaining URLs here:

for url in urls:
    try:
        handler(url)
        # we succeeded, no need to try again
        return True
    except UrlOperationsResourceUnknown:
        # general system access worked, but at the key location is nothing
        # to be found
        return False
    except UrlOperationsRemoteError as e:
        # return False only if we could be sure that the remote
        # system works properly and just the key is not around
        CapturedException(e)
        self.message(
            f'Failed to {action[0]} key {key!r} {action[1]} {url!r}',
            type='debug')

I think it should additionally catch this exception (the ValueError raised for unsupported URLs) and continue with the next URL until it either finds a working one or has none left to try.
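
To illustrate the idea, here is a minimal standalone sketch of such a loop. It is not the actual datalad-next code: the two exception classes are local stand-ins for the ones named above, and try_url_handlers, demo_handler, the log parameter, and the example URLs are all made up for this illustration. It also assumes the ValueError propagates unchanged up to this loop, which I have not verified.

```python
class UrlOperationsRemoteError(Exception):
    """Local stand-in for the exception of the same name (assumption)."""


class UrlOperationsResourceUnknown(UrlOperationsRemoteError):
    """Local stand-in: the remote answered, but nothing exists at the URL."""


def try_url_handlers(urls, handler, log=print):
    """Call handler(url) for each URL; return True on the first success."""
    for url in urls:
        try:
            handler(url)
            # we succeeded, no need to try again
            return True
        except UrlOperationsResourceUnknown:
            # same behavior as the quoted loop: stop and report failure
            return False
        except UrlOperationsRemoteError as e:
            log(f'Failed on {url!r}: {e}')
        except ValueError as e:
            # proposed addition: skip a URL that no handler understands
            # and carry on with the next one
            log(f'Skipping {url!r}: {e}')
    # no URL worked
    return False


def demo_handler(url):
    """Hypothetical handler that only understands ssh:// URLs."""
    if not url.startswith('ssh://'):
        raise ValueError(f'unsupported URL {url!r}')


messages = []
ok = try_url_handlers(
    ['cds:v1-example', 'ssh://example.org/file.grb'],
    demo_handler,
    log=messages.append,
)
```

With these two example URLs the cds: entry is logged and skipped, and the ssh:// entry is then tried. If only unsupported URLs are present, the function falls through and returns False.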
