
Performance regression for lz4 decompression after version 4.0.1 #326

Open
@Dalbasar

Description

I noticed that LZ4 decompression via hdf5plugin 4.1.0 and later is 5-6x slower than with hdf5plugin 4.0.1, while the compression speed is very similar:

import time
import hdf5plugin
import h5py
import numpy as np
from io import BytesIO

test_data = np.ones((1024, 1024, 1024), np.uint8)  # 1 GiB, trivially compressible

raw_buffer = BytesIO()  # in-memory file, so disk I/O does not skew the timings

with h5py.File(raw_buffer, 'w') as f:
    compression_start_time = time.perf_counter()
    f.create_dataset('data', data=test_data, compression=hdf5plugin.LZ4())
    compression_time = time.perf_counter() - compression_start_time

with h5py.File(raw_buffer, 'r') as f:
    decompression_start_time = time.perf_counter()
    data = f['data'][:]
    decompression_time = time.perf_counter() - decompression_start_time

print(f"hdf5plugin {hdf5plugin.version}: "
      f"lz4 compression time {compression_time:.3f}s, "
      f"lz4 decompression_time: {decompression_time:.3f}s")

gives the following results for different hdf5plugin versions with h5py 3.12.1 on Python 3.11.9 on Windows 10 (AMD Ryzen 7 5900X):

hdf5plugin 4.0.1: lz4 compression time 0.219s, lz4 decompression_time: 0.283s
hdf5plugin 4.1.0: lz4 compression time 0.226s, lz4 decompression_time: 1.630s
hdf5plugin 5.0.0: lz4 compression time 0.221s, lz4 decompression_time: 1.610s
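
For completeness, the plugin build options can be printed next to the timings; hdf5plugin.get_config() is part of the public API, though the exact fields it reports differ between releases:

import hdf5plugin

# reports build options (e.g. native/SSE2/AVX2 support) and embedded filters
print(hdf5plugin.version)
print(hdf5plugin.get_config())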

I have seen similar results on Python 3.8 and 3.11 on Debian 12 with different h5py versions.
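
To check whether the extra time is spent in the LZ4 filter itself rather than in HDF5 chunk I/O, the chunks can be read back still compressed. A minimal sketch, reusing raw_buffer from the benchmark above (iter_chunks and read_direct_chunk are part of the h5py public API):

import time
import h5py

with h5py.File(raw_buffer, 'r') as f:
    dset = f['data']
    raw_read_start = time.perf_counter()
    for chunk_slices in dset.iter_chunks():
        # read_direct_chunk bypasses the filter pipeline and returns the
        # chunk bytes still LZ4-compressed
        offsets = tuple(s.start for s in chunk_slices)
        filter_mask, raw_chunk = dset.id.read_direct_chunk(offsets)
    raw_read_time = time.perf_counter() - raw_read_start

print(f"raw chunk read time (no decompression): {raw_read_time:.3f}s")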

I would have expected a substantial speedup after updating to version 5.0, which updates the bundled lz4 to 1.10 with its new multithreaded decompression, but the decompression speed for 4.1.x and 5.0 appears to be the same, suggesting that multithreaded lz4 decompression is not being used.
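
For reference, a single-threaded baseline with the python-lz4 bindings. This is an assumption on my side: lz4.block is the plain LZ4 block API rather than the HDF5 filter's framing, so it only bounds what one thread of stock LZ4 can do on comparable data:

import time
import lz4.block
import numpy as np

# 256 MiB of the same trivially compressible pattern as the benchmark
payload = np.ones((256, 1024, 1024), np.uint8).tobytes()
compressed = lz4.block.compress(payload)

start = time.perf_counter()
lz4.block.decompress(compressed)
print(f"single-threaded lz4.block decompression of 256 MiB: "
      f"{time.perf_counter() - start:.3f}s")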
