Describe the bug
We attempted to use the 'core' (in-memory) driver on Windows and quickly discovered that, for a 22 GB file, it was about 300x slower there than on macOS. The default (on-disk) driver works fine.
We use HDF5 to store a hierarchy of data points. When building the object, we stream from a binary file, do some preprocessing, and then populate the HDF5 storage with more than 100 data nodes, each containing about two dozen complex-valued NumPy arrays.
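For reference, here is a minimal C sketch of how the core driver is set up (our real code goes through h5py; the file name and the 1 MiB increment are illustrative, matching the default block_size discussed below):

```c
/* Minimal sketch: create an in-memory (core driver) HDF5 file.
 * File name and increment are illustrative. */
#include "hdf5.h"

int main(void)
{
    hid_t fapl = H5Pcreate(H5P_FILE_ACCESS);

    /* 1 MiB growth increment; backing_store = 0 keeps the file
     * entirely in memory. */
    H5Pset_fapl_core(fapl, 1024 * 1024, 0);

    hid_t file = H5Fcreate("in_memory.h5", H5F_ACC_TRUNC, H5P_DEFAULT, fapl);

    /* Writing hundreds of multi-MB datasets here forces thousands of
     * buffer extensions inside the core driver -- this is where the
     * Windows slowdown shows up. */

    H5Fclose(file);
    H5Pclose(fapl);
    return 0;
}
```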
Expected behavior
The in-memory driver should be faster than the default (on-disk) one, and its performance should be comparable across operating systems.
Platform (please complete the following information)
- HDF5 version: 1.14.1
- OS and version: Windows 11 Pro
- Compiler and version: installed from the official binary, hdf5-1.14.1-2-Std-win10_64-vs17.zip
- Build system (e.g. CMake, Autotools): Visual Studio 17.6.5
- Any configure options you specified: Nothing special
Additional context
This issue was originally reported on the h5py GitHub in 2021 (h5py/h5py#1827), but it appears it was never raised with the HDF5 team.
The problem seems to be that the core driver relies on a realloc() call to get more memory regardless of the OS, and on Windows this often results in the entire buffer being copied on each additional memory request. In our situation, where thousands of extensions are requested, hundreds of them larger than 1 MB (the default block_size), storage creation slows down hundreds of times compared to Linux/macOS because of the copy overhead (and RAM usage spikes to 2x the size of the dataset on each copy).
Here is an explanation of the realloc differences on Windows and Linux.
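To illustrate the pattern (this is a sketch of the growth strategy, not the actual HDF5 source): growing a buffer by a fixed block per request means one realloc() per block, and if each call copies the whole buffer, reaching n bytes copies on the order of n^2 / block_size bytes in total.

```c
/* Sketch of fixed-increment growth (not the HDF5 source).
 * Each realloc() may copy the entire current buffer, so the total
 * work to reach n bytes is O(n^2 / BLOCK). */
#include <stdlib.h>

#define BLOCK (1024 * 1024) /* 1 MiB, the reported default block_size */

static unsigned char *grow_linear(unsigned char *buf, size_t *cap, size_t need)
{
    while (*cap < need) {
        size_t new_cap = *cap + BLOCK;            /* fixed-size step */
        unsigned char *p = realloc(buf, new_cap); /* may copy *cap bytes */
        if (p == NULL) {
            free(buf);
            return NULL;
        }
        buf = p;
        *cap = new_cap;
    }
    return buf;
}
```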
You want to look here in your code. What people typically do on Windows is request 1.3x the new dataset size on each extension. This can use up to 30% more RAM, but the number of reallocations (and full-buffer copies) drops to O(log(n)), the way it should be, instead of one copy per increment; see the sketch below.
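A minimal sketch of that geometric-growth strategy, using the 1.3x factor suggested above (again illustrative, not a patch against the actual driver code):

```c
/* Sketch of geometric growth: the capacity is multiplied by ~1.3x
 * per step, so reaching n bytes takes O(log n) realloc() calls and
 * the copy overhead is amortized away. */
#include <stdlib.h>

static unsigned char *grow_geometric(unsigned char *buf, size_t *cap, size_t need)
{
    if (*cap >= need)
        return buf;

    size_t new_cap = (*cap > 0) ? *cap : 1024 * 1024;
    while (new_cap < need)
        new_cap += new_cap * 3 / 10 + 1;          /* ~1.3x per step */

    unsigned char *p = realloc(buf, new_cap);     /* one call per growth episode */
    if (p == NULL) {
        free(buf);
        return NULL;
    }
    *cap = new_cap;
    return p;
}
```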