Describe the bug
We attempted to use the 'core' (in-memory) driver on Windows and quickly discovered that, for a 22 GB file, it was about 300x slower there than on macOS. The default (on-disk) driver works fine.
We use HDF5 to store a hierarchy of data points. When building the object, we stream from a binary file, do some preprocessing, and then populate the HDF5 storage with more than 100 data nodes, each containing about two dozen complex-valued NumPy arrays.
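For reference, here is a minimal C sketch of how the core driver is set up (our real code goes through h5py; the file name and the 1 MiB increment are illustrative, matching the default block_size discussed below):

```c
/* Minimal sketch: create an in-memory (core driver) HDF5 file.
 * File name and increment are illustrative. */
#include "hdf5.h"

int main(void)
{
    hid_t fapl = H5Pcreate(H5P_FILE_ACCESS);

    /* 1 MiB growth increment; backing_store = 0 keeps the file
     * entirely in memory. */
    H5Pset_fapl_core(fapl, 1024 * 1024, 0);

    hid_t file = H5Fcreate("in_memory.h5", H5F_ACC_TRUNC, H5P_DEFAULT, fapl);

    /* Writing hundreds of multi-MB datasets here forces thousands of
     * buffer extensions inside the core driver -- this is where the
     * Windows slowdown shows up. */

    H5Fclose(file);
    H5Pclose(fapl);
    return 0;
}
```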
Expected behavior
The in-memory driver should be faster than the default (on-disk) one, and its performance should be comparable across operating systems.
Platform (please complete the following information)
- HDF5 version: 1.14.1
- OS and version: Windows 11 Pro
- Compiler and version: installed from the official binary, hdf5-1.14.1-2-Std-win10_64-vs17.zip
- Build system (e.g. CMake, Autotools): Visual Studio 17.6.5
- Any configure options you specified: Nothing special
Additional context
This issue was originally reported on the h5py GitHub in 2021 (h5py/h5py#1827), but it appears it was never raised with the HDF5 team.
The problem seems to be that the core driver relies on a realloc() call to get more memory regardless of the OS, and on Windows this often results in the entire buffer being copied on each additional memory request. In our situation, where thousands of extensions are requested, hundreds of them larger than 1 MB (the default block_size), storage creation slows down hundreds of times compared to Linux/macOS because of the copy overhead (and RAM usage spikes to 2x the size of the dataset on each copy).
Here is an explanation of the realloc differences on Windows and Linux.
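To illustrate the pattern (this is a sketch of the growth strategy, not the actual HDF5 source): growing a buffer by a fixed block per request means one realloc() per block, and if each call copies the whole buffer, reaching n bytes copies on the order of n^2 / block_size bytes in total.

```c
/* Sketch of fixed-increment growth (not the HDF5 source).
 * Each realloc() may copy the entire current buffer, so the total
 * work to reach n bytes is O(n^2 / BLOCK). */
#include <stdlib.h>

#define BLOCK (1024 * 1024) /* 1 MiB, the reported default block_size */

static unsigned char *grow_linear(unsigned char *buf, size_t *cap, size_t need)
{
    while (*cap < need) {
        size_t new_cap = *cap + BLOCK;            /* fixed-size step */
        unsigned char *p = realloc(buf, new_cap); /* may copy *cap bytes */
        if (p == NULL) {
            free(buf);
            return NULL;
        }
        buf = p;
        *cap = new_cap;
    }
    return buf;
}
```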
You want to look here in your code. What people typically do on Windows is request 1.3x the new dataset size on each extension. This can use up to 30% more RAM, but the number of reallocations (and full-buffer copies) drops to O(log(n)), the way it should be, instead of one copy per increment; see the sketch below.
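A minimal sketch of that geometric-growth strategy, using the 1.3x factor suggested above (again illustrative, not a patch against the actual driver code):

```c
/* Sketch of geometric growth: the capacity is multiplied by ~1.3x
 * per step, so reaching n bytes takes O(log n) realloc() calls and
 * the copy overhead is amortized away. */
#include <stdlib.h>

static unsigned char *grow_geometric(unsigned char *buf, size_t *cap, size_t need)
{
    if (*cap >= need)
        return buf;

    size_t new_cap = (*cap > 0) ? *cap : 1024 * 1024;
    while (new_cap < need)
        new_cap += new_cap * 3 / 10 + 1;          /* ~1.3x per step */

    unsigned char *p = realloc(buf, new_cap);     /* one call per growth episode */
    if (p == NULL) {
        free(buf);
        return NULL;
    }
    *cap = new_cap;
    return p;
}
```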