Hypotheses: The purpose of NWB Benchmarks #89

@CodyCBakerPhD

Description

This is a compiled list of high-level questions that the NWB Benchmarks seek to help answer.

We seek to compile a set of guidelines that provide recommendations to the community to ease the challenge of interacting with large, complex datasets.

When to download vs. stream?

Per data modality (ecephys, icephys, ophys).

In the benchmarks, test against multiple random samples of files taken from the wild.

The true underlying parameter space is a function of file size, the number of objects in the file, and the repeated access patterns applied to the file.

Methods include comparing Zarr streaming on a Zarr-exported NWB file, but they do not answer the general question of Zarr vs. NWB.
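As a sketch of how the download-vs-stream hypothesis could be measured, the following minimal harness times repeated access patterns against any access strategy (a download-then-read function, a streaming read, etc.). The names `benchmark`, `access_fn`, and `patterns` are illustrative assumptions, not part of the NWB Benchmarks API:

```python
import time
from statistics import median

def benchmark(access_fn, patterns, repeats=3):
    """Time one access strategy over a fixed set of access patterns.

    access_fn -- callable applying one pattern (e.g. a slice) to the data
    patterns  -- the repeated access patterns applied to the file
    Returns the median wall-clock seconds per full pass over the patterns.
    """
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        for pattern in patterns:
            access_fn(pattern)
        timings.append(time.perf_counter() - start)
    return median(timings)

# A local list stands in for a fully downloaded file; a streaming
# strategy would supply a different access_fn over the same patterns.
data = list(range(1_000_000))
patterns = [slice(i, i + 1_000) for i in range(0, 100_000, 10_000)]
local_time = benchmark(lambda s: data[s], patterns)
```

Running the same `patterns` through a download-based and a streaming-based `access_fn` on the same file gives one point in the file-size/object-count/access-pattern space described above.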

Which library should I use to stream my data?

Also per modality.

Important metrics to track are speed and network traffic (minimize repeated requests). Filling up disk space can be a concern for certain cache styles, e.g., #62.
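To make the network-traffic metric concrete, one option is a wrapper that counts read calls and bytes on any file-like object; with a remote file, each read roughly corresponds to one ranged request, so fewer calls means fewer round trips. This is a hedged sketch, and the class name `ReadCounter` is illustrative:

```python
import io

class ReadCounter:
    """Wrap a file-like object and count reads and bytes transferred."""

    def __init__(self, fileobj):
        self._f = fileobj
        self.calls = 0       # number of read() calls (~ remote requests)
        self.bytes_read = 0  # total payload transferred

    def read(self, size=-1):
        self.calls += 1
        chunk = self._f.read(size)
        self.bytes_read += len(chunk)
        return chunk

    def seek(self, offset, whence=0):
        return self._f.seek(offset, whence)

    def tell(self):
        return self._f.tell()

# An in-memory stand-in for a remote file:
wrapped = ReadCounter(io.BytesIO(b"x" * 4096))
wrapped.read(1024)
wrapped.read(1024)
# wrapped.calls == 2; wrapped.bytes_read == 2048
```

Comparing `calls` and `bytes_read` across streaming libraries on the same access pattern quantifies which one minimizes repeated requests.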

What parameterizations are most valuable?

How does performance change across versions/time?

Regression testing for the libraries involved.

The ROS3 driver in the HDF5 library has changed over time.

Plot speed vs. version of h5py/hdf5.
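For the speed-vs-version plot, timings need to be aggregated per library version. A minimal sketch, assuming a hypothetical result schema with `"hdf5"`, `"h5py"`, and `"seconds"` keys (not the NWB Benchmarks format):

```python
from collections import defaultdict
from statistics import median

def speed_by_version(results):
    """Group benchmark timings by HDF5 version and take the median,
    yielding one point per version for a speed-vs-version plot."""
    grouped = defaultdict(list)
    for r in results:
        grouped[r["hdf5"]].append(r["seconds"])
    return {version: median(ts) for version, ts in sorted(grouped.items())}

runs = [
    {"hdf5": "1.12.1", "h5py": "3.7.0", "seconds": 2.0},
    {"hdf5": "1.12.1", "h5py": "3.7.0", "seconds": 3.0},
    {"hdf5": "1.14.0", "h5py": "3.10.0", "seconds": 1.1},
]
# speed_by_version(runs) -> {"1.12.1": 2.5, "1.14.0": 1.1}
```

The same grouping keyed on `"h5py"` gives the companion plot; re-running the suite as versions are released accumulates the regression history.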
