Description
Describe the bug
It looks like the nightly builds of the Docker images have started failing due to an error when fetching data for the cuVS benchmarks. This appears to happen in all Docker images.
The snippet below is taken from this GHA log, but similar errors can be seen in the others.
> [cuvs-bench-datasets 3/3] RUN /home/rapids/cuvs-bench/get_datasets.sh:
0.214 return self._call_chain(*args)
0.214 ^^^^^^^^^^^^^^^^^^^^^^^
0.214 File "/opt/conda/lib/python3.12/urllib/request.py", line 492, in _call_chain
0.214 result = func(*args)
0.214 ^^^^^^^^^^^
0.214 File "/opt/conda/lib/python3.12/urllib/request.py", line 639, in http_error_default
0.214 raise HTTPError(req.full_url, code, msg, hdrs, fp)
0.214 urllib.error.HTTPError: HTTP Error 403: Forbidden
0.214 downloading http://ann-benchmarks.com/deep-image-96-angular.hdf5 -> /home/rapids/preloaded_datasets/deep-image-96-angular.hdf5...
0.214 Cannot download http://ann-benchmarks.com/deep-image-96-angular.hdf5
Steps/Code to reproduce bug
Run the script cuvs-bench/get_datasets.sh. It appears to fail on the first dataset (see the log above). However, the later ones may also have the same issue.
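To check a dataset URL without running the whole script, something like the following may help. It is a minimal sketch: `probe` is a hypothetical helper (not part of cuvs-bench), the only URL used is the one that appears in the log, and it sends a HEAD request, which a server may treat differently from the GET the script presumably issues.

```python
import urllib.error
import urllib.request

# The only URL confirmed by the log; any other dataset URLs would have to be
# read from get_datasets.sh itself.
DATASET_URL = "http://ann-benchmarks.com/deep-image-96-angular.hdf5"


def probe(url, opener=urllib.request.urlopen, timeout=30):
    """Return the HTTP status code for a HEAD request to url.

    Hypothetical helper for illustration. The opener argument exists only so
    the function can be exercised without network access.
    """
    request = urllib.request.Request(url, method="HEAD")
    try:
        with opener(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # HTTPError carries the status code, e.g. 403 for Forbidden.
        return err.code
```

Calling `probe(DATASET_URL)` from a machine with network access should return the status code the server sends back, which can then be compared with the 403 seen on CI.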
Expected behavior
The benchmark datasets are retrieved.
Environment details (please complete the following information):
- Environment location: Docker (on CI)
- Method of cuDF install: Conda in Docker build (reproducible with any image or just the script above)
- If method of install is [Docker], provide docker pull & docker run commands used
- Please run and attach the output of the cudf/print_env.sh script to gather relevant environment details
Not seeing where cudf/print_env.sh is run on CI. Where should we be looking? Or should we add this to our CI scripts?
In any event, there is a bunch of diagnostic information in the log, though I suspect this is as simple as the URL having changed or us needing some additional authentication to get the data.
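The status code itself gives a hint about which of these two explanations is more likely. The sketch below maps a few standard HTTP status codes to a probable cause. `interpret_status` is a hypothetical name, the wording of each message is an assumption rather than anything the server reports, and the mapping is illustrative rather than exhaustive.

```python
def interpret_status(code):
    """Map an HTTP status code to a likely cause (illustrative only)."""
    if 200 <= code < 300:
        return "ok"
    if code in (301, 302, 303, 307, 308):
        return "redirected: the URL may have moved"
    if code == 401:
        return "unauthorized: credentials are required"
    if code == 403:
        return "forbidden: the server refused this client or request"
    if code == 404:
        return "not found: the URL has probably changed"
    return "other: inspect the response headers and body"
```

For the 403 in the log this returns the "forbidden" message, which is consistent with an access restriction on the server side rather than a missing file, though it does not say what the restriction is.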
Additional context
Not that I can think of