Add download benchmarks using DANDI api #169

stephprince · 2025-10-22T19:42:38Z

Adds benchmarks that measure download times for HDF5 and Zarr NWB files, for comparison against slicing extrapolations. The results are meant to inform the decision of whether to download or stream an NWB file, depending on the amount of data being accessed.
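To make the download-versus-stream comparison concrete, here is a minimal sketch of the arithmetic, assuming a simple linear cost model: each remote slice costs a fixed time, each local slice costs a (smaller) fixed time, and downloading adds a one-time cost. The function names and the timings in the usage note are hypothetical and are not taken from the benchmark results.

```python
import math


def should_download(download_seconds, remote_slice_seconds, local_slice_seconds, n_slices):
    """Return True when downloading first is estimated to be faster than streaming.

    Assumes a linear cost model; real access patterns (caching, chunking,
    variable bandwidth) will deviate from it.
    """
    streaming_total = n_slices * remote_slice_seconds
    download_total = download_seconds + n_slices * local_slice_seconds
    return download_total < streaming_total


def break_even_slices(download_seconds, remote_slice_seconds, local_slice_seconds):
    """Smallest number of slices at which downloading wins, or None if it never does."""
    saved_per_slice = remote_slice_seconds - local_slice_seconds
    if saved_per_slice <= 0:
        # Local slicing is not faster than remote slicing, so the download never pays off.
        return None
    # Downloading wins once n_slices is strictly greater than download_seconds / saved_per_slice.
    return math.floor(download_seconds / saved_per_slice) + 1
```

For example, with a hypothetical 120 s download, 0.5 s per remote slice and 0.25 s per local slice, `break_even_slices(120, 0.5, 0.25)` gives 481: at 480 slices the two totals tie, and from 481 slices on the download is estimated to be faster.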

We had discussed either putting these in a separate benchmarks folder or gating them behind an environment variable. I thought setting the environment variable with the skip decorator was cleaner, since there are other download-related benchmarks and only two full-file download tests. The download benchmarks can then be run manually with:

RUN_DOWNLOAD_BENCHMARKS=true nwb_benchmarks run --bench "time_download.HDF5DownloadDandiAPIBenchmark.time_download_hdf5_dandi_api"
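For readers unfamiliar with the pattern, the following is a minimal, standard-library-only sketch of gating a benchmark on the `RUN_DOWNLOAD_BENCHMARKS` environment variable. It is not the decorator used in this PR (which may well come from the benchmarking framework); the helper names, the example class, and the set of accepted "truthy" values are assumptions for illustration. It relies on the asv convention that a benchmark whose `setup` raises `NotImplementedError` is reported as skipped.

```python
import functools
import os


def download_benchmarks_enabled(environ=None):
    """Return True when RUN_DOWNLOAD_BENCHMARKS is set to a truthy value.

    The accepted values ("1", "true", "yes") are an assumption of this sketch.
    """
    environ = os.environ if environ is None else environ
    value = environ.get("RUN_DOWNLOAD_BENCHMARKS", "")
    return value.strip().lower() in {"1", "true", "yes"}


def skip_unless_download_enabled(setup):
    """Wrap a benchmark `setup` method so the benchmark is skipped when disabled."""

    @functools.wraps(setup)
    def wrapper(self, *args, **kwargs):
        if not download_benchmarks_enabled():
            # asv marks a benchmark as skipped when setup raises NotImplementedError.
            raise NotImplementedError("Set RUN_DOWNLOAD_BENCHMARKS=true to run download benchmarks.")
        return setup(self, *args, **kwargs)

    return wrapper


class ExampleDownloadBenchmark:
    """Hypothetical benchmark class, used only to show the decorator in place."""

    @skip_unless_download_enabled
    def setup(self):
        self.ready = True

    def time_download(self):
        pass  # the timed download call would go here
```

The environment variable is read when `setup` runs rather than at import time, so the same module can be imported with or without the variable set.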


stephprince · 2025-10-22T20:10:04Z

Looking back at our initial figure sketches, should these tests also include the time to open + slice data after downloading? Or is that timing negligible at that point?

Edit: This would be covered by #163, so we probably do not need to include it here as well.

CodyCBakerPhD · 2025-10-22T20:13:17Z

> Looking back at our initial figure sketches, should these tests also include the time to open + slice data after downloading? Or is that timing negligible at that point?

If the benchmarks can reuse the already downloaded file, then timing the file open and slicing would be fair, as per #163. That can be added in this PR or in a follow-up.

CodyCBakerPhD · 2025-10-22T20:14:10Z

(both of these are really more of a 'calibration' of average system bandwidth + disk)

oruebel · 2025-10-22T22:39:45Z

> > Looking back at our initial figure sketches, should these tests also include the time to open + slice data after downloading? Or is that timing negligible at that point?
>
> If the benchmarks can reuse the already downloaded file, then timing the file open and slicing would be fair, as per #163. That can be added in this PR or in a follow-up.

Agreed. Timing slicing from local files can be separate and use already downloaded files.

rly · 2025-10-23T21:08:47Z

src/nwb_benchmarks/benchmarks/time_download.py

+        download(urls=params["https_url"], output_dir=self.tmpdir.name)
+
+
+class LindiDownloadFsspecBenchmark(BaseBenchmark):


Suggested change

-class LindiDownloadFsspecBenchmark(BaseBenchmark):
+class LindiDownloadBenchmark(BaseBenchmark):

Should this be just LindiDownloadBenchmark?

Oh, I see, it uses download_file, which uses fsspec.

For consistency and maintainability, I think we should just use the DANDI API to download the LINDI files. I'll also update the download_file usage in a separate PR.

Users would most likely be using either the DANDI API/CLI or their browser to download these files anyway, not fsspec.

That makes sense to keep consistent - I will update here to use the dandi API across all file types

Updated to use dandi api here but left the download_file definition as it was for now.

The LINDI files don't take very long to download. Does it make sense to remove the skip decorator in this case, or is that more confusing/inconsistent? We discussed adding these LINDI download times to our benchmark plots, which is another reason not to skip.

> The lindi files don't take very long to download, does it make sense to remove the skip decorator in this case

For lindi I think it makes sense to always run the download test
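As a rough illustration of the approach settled on in this thread, the sketch below times a download through the DANDI client's `download` function, mirroring the `download(urls=..., output_dir=...)` call in the diff. The class name and the `asset_url` attribute are hypothetical; this is not the benchmark added in the PR, which builds on the project's `BaseBenchmark` and its parameter handling. It follows the asv convention that `setup` and `teardown` are not timed, while methods prefixed with `time_` are.

```python
import tempfile


class ExampleDandiDownloadBenchmark:
    """Hypothetical asv-style benchmark that times a download via the DANDI client."""

    # Assumption: fill in a URL that the DANDI client recognizes before running.
    asset_url = ""

    def setup(self):
        # Fresh temporary directory per run so every run downloads from scratch.
        self.tmpdir = tempfile.TemporaryDirectory()

    def time_download(self):
        # Imported lazily so this module can be loaded without the dandi package installed.
        from dandi.download import download

        download(urls=self.asset_url, output_dir=self.tmpdir.name)

    def teardown(self):
        # Remove the downloaded file so disk usage does not grow across runs.
        self.tmpdir.cleanup()
```

Creating the temporary directory in `setup` keeps directory creation and cleanup out of the measured time, so the number reflects the download itself.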


rly · 2025-10-24T05:58:50Z

Looks good!

stephprince and others added 5 commits October 22, 2025 12:31

add download benchmarks

d1c6716

add env variable for running download benchmarks

f0144a3

add optional redirect arg to get_https_url

ca4fa2e

Merge branch 'main' into add-download-benchmarks

dc8013c

[pre-commit.ci] auto fixes from pre-commit.com hooks

71b0d28

for more information, see https://pre-commit.ci

stephprince requested a review from rly October 22, 2025 19:54

stephprince marked this pull request as ready for review October 22, 2025 19:55

rly mentioned this pull request Oct 23, 2025

Use new ophys files #170

Merged

rly reviewed Oct 23, 2025


stephprince and others added 4 commits October 23, 2025 16:26

use dandi for lindi download benchmark

f193fd1

[pre-commit.ci] auto fixes from pre-commit.com hooks

0d8c519


remove skip from lindi download

52ea736

Merge branch 'main' into add-download-benchmarks

4268408

rly approved these changes Oct 24, 2025


CodyCBakerPhD merged commit 7fc6911 into main Oct 24, 2025
3 checks passed

CodyCBakerPhD deleted the add-download-benchmarks branch October 24, 2025 15:16

CodyCBakerPhD assigned stephprince Oct 24, 2025

5 participants