Reading a hyperslab of a dataset with an unlimited dimension increases the execution time in successive runs #4513

Open
@abhibaruah

Seen only on Windows.

This issue is similar to the one I reported in #4481.

I have a dataset of size 1999 x 512 x 512 whose first dimension is unlimited. It is compressed with deflate (level 6) and has a chunk size of 20 x 10 x 10.

I open the dataset, read a hyperslab of it (start = {0,0,0}, stride = {2,2,2}, count = {1000,256,256}), and then close all identifiers.

The execution time of the hyperslab read increases with each successive iteration of this open/read/close sequence within the same process.
This happens only on Windows; I could not reproduce the issue on Linux.

HDF5 version (if building from a maintenance branch, please include the commit hash) : HDF5 1.10.11
OS and version : Windows 11

The H5 file I used for the reproduction (unlimDim_smallChunkSize_deflate_2.h5) can be found here:
https://mathworks-my.sharepoint.com/:f:/p/abaruah/Ekr_sEmqx1hFlgPM4tWx55UBS7bzUtxAmHPM33bT6t51CQ?e=ChoCCJ

#include "W:\3rdparty\R2024b\11059386\win64\hdf5\include\hdf5.h"
#include <stdio.h>
#include <stdlib.h>
#include <iostream>
#include <chrono>
#include <array>
#include <random>

#define FILE1 "unlimDim_smallChunkSize_deflate_2.h5"
#define DATASET1        "/lrg_unl_dset2_double_dset"
#define DIM10            1000
#define DIM11            256
#define DIM12            256




void hypersladDsetRead()
{
    hid_t   file, dataset;    /* Handles */
    herr_t  status;
    hid_t   dataspace;
    hid_t   memspace;
    hsize_t start[3] = {0,0,0};
    hsize_t count[3];
    hsize_t stride[3] = {2,2,2};


    double* dset_read = new double[DIM10 * DIM11 * DIM12];
    /*
     * Open the file and the dataset.
     */
    file = H5Fopen(FILE1, H5F_ACC_RDONLY, H5P_DEFAULT);
    dataset = H5Dopen(file, DATASET1, H5P_DEFAULT);
	
	dataspace = H5Dget_space(dataset);    /* dataspace handle */
	
	/* 
     * Define hyperslab in the dataset. 
     */
    count[0]  = DIM10;
    count[1]  = DIM11;
	count[2]  = DIM12;
	
	
    status = H5Sselect_hyperslab(dataspace, H5S_SELECT_SET, start, stride, count, NULL);
	memspace = H5Screate_simple(3,count,count); 
	
	std::chrono::time_point<std::chrono::high_resolution_clock> starttime, end;

    starttime = std::chrono::high_resolution_clock::now();
    /*
     * Read the selected hyperslab from the file into the memory buffer.
     * Note: the memory type is H5T_NATIVE_FLOAT although dset_read is a
     * double buffer, so the buffer receives float values.
     */
    status = H5Dread(dataset, H5T_NATIVE_FLOAT, memspace, dataspace,
                     H5P_DEFAULT, dset_read);

    end = std::chrono::high_resolution_clock::now();
    std::chrono::duration<double> duration = end - starttime;
    double durationInSeconds = duration.count();
	
    std::cout << " hypersladDsetRead Execution time: " << durationInSeconds << " seconds" << std::endl;
	std::cout << "Hyperslab read status: " << status <<std::endl;
	
    /*
     * Close and release resources.
     */
	 
    status = H5Sclose(dataspace);
	status = H5Sclose(memspace);
    status = H5Dclose(dataset);
    status = H5Fclose(file);

    delete [] dset_read;

}

int main() {
    std::cout << "In main" << std::endl;
	
	for (int i = 0; i<10; i++)
	{
		hypersladDsetRead();
	}
	
	return 0;
}

Timing output based on the repro code above:

In main
 hypersladDsetRead Execution time: 146.967 seconds
Hyperslab read status: 0
 hypersladDsetRead Execution time: 150.314 seconds
Hyperslab read status: 0
 hypersladDsetRead Execution time: 150.587 seconds
Hyperslab read status: 0
 hypersladDsetRead Execution time: 158.415 seconds
Hyperslab read status: 0
 hypersladDsetRead Execution time: 173.192 seconds
Hyperslab read status: 0
 hypersladDsetRead Execution time: 173.374 seconds
Hyperslab read status: 0
 hypersladDsetRead Execution time: 184.749 seconds
Hyperslab read status: 0
 hypersladDsetRead Execution time: 191.997 seconds
Hyperslab read status: 0
 hypersladDsetRead Execution time: 188.391 seconds
Hyperslab read status: 0
 hypersladDsetRead Execution time: 199.988 seconds
Hyperslab read status: 0

Labels: Component - C Library (core C library issues, usually in the src directory); HDFG-internal (internally coded for use by the HDF Group); Priority - 1. High (important issues that should be resolved in the next release)

Status: Scheduled/On-Deck
