Darshan not detecting I/O from NumPy / Astropy #1072

@remyyyyh

Hello,
I am currently profiling the I/O operations of a radio astronomy Python tool that uses NumPy and Astropy. However, I noticed that Darshan does not report the full size of the I/O performed by these libraries.
I tested with two different Python / NumPy versions:

  • NumPy 2.2.6 / 1.22.4
  • Python 3.13.7 / 3.9.2

Using:

  • Darshan lib_ver = 3.4.7

Example

I am writing and reading a matrix of size 1000 x 1000, which corresponds to approximately 8,000,000 bytes in raw data.

  • For .npy:
    matrix.npy: 4224 bytes (read=4096, write=128)

  • For .npz (compressed matrix):
    matrix.npz: 16,013,028 bytes (read=8012699, write=8000329)
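
For comparison, the expected byte counts can be computed independently of Darshan. The following is a minimal sketch and is not part of matrix.py. It assumes float64 data (the default dtype of `np.random.rand`) and writes to a temporary directory, so the path and variable names are illustrative only.

```python
import os
import tempfile

import numpy as np

# Same shape as in the report; np.random.rand returns float64 values.
matrix = np.random.rand(1000, 1000)

# Raw payload size: 1000 * 1000 elements * 8 bytes each.
raw_bytes = matrix.nbytes

with tempfile.TemporaryDirectory() as tmpdir:
    # Hypothetical scratch path, used only for this size check.
    npy_path = os.path.join(tmpdir, "matrix.npy")
    np.save(npy_path, matrix)
    npy_size = os.path.getsize(npy_path)

# The .npy file holds a short header followed by the raw array data.
header_bytes = npy_size - raw_bytes

print(f"raw array data : {raw_bytes} bytes")
print(f"file on disk   : {npy_size} bytes")
print(f"header overhead: {header_bytes} bytes")
```

Comparing the on-disk size with the per-file totals that Darshan reports shows how much of the array payload is missing from the log.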

I assume this is due to how the matrix is written and read.
Is this a known issue? Or did it perhaps work with an older version of Darshan?

Thank you in advance.

Command:

DARSHAN_CONFIG_PATH="darshan.conf" \
DARSHAN_ENABLE_NONMPI=1  \
LD_PRELOAD="PATH/TO/libdarshan.so" \
 python3 matrix.py <format> 

Python code of matrix.py:

import numpy as np
import sys

# Check argument for file type: 'npz' or 'npy'
filetype = sys.argv[1].lower() if len(sys.argv) > 1 else 'npy'

# Create a 1000x1000 random matrix
matrix = np.random.rand(1000, 1000)

if filetype == 'npz':
    # Save compressed npz
    np.savez_compressed('matrix.npz', matrix=matrix)
    print("Matrix saved as 'matrix.npz' (compressed)")

    # Load npz
    data = np.load('matrix.npz')
    loaded = data['matrix']
    print(f"Loaded matrix shape: {loaded.shape}")

elif filetype == 'npy':
    # Save uncompressed npy
    np.save('matrix.npy', matrix)
    print("Matrix saved as 'matrix.npy' (uncompressed)")

    # Load npy
    loaded = np.load('matrix.npy')
    print(f"Loaded matrix shape: {loaded.shape}")

else:
    print("Error: Unknown filetype argument. Use 'npy' or 'npz'.")
    sys.exit(1)

# Check equality
print("Equal?", np.array_equal(matrix, loaded))

darshan.conf:

MODMEM 64
NAMEMEM 128
MAX_RECORDS 65536 POSIX,STDIO,MPI-IO

NAME_EXCLUDE \.py,\.pyc.,\.pth * 
NAME_INCLUDE \.npy,PATH/TO/DIREC/darshan_npy POSIX,STDIO,MPI-IO 
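
As a rough sanity check of the name filters, the sketch below tests a few file names against the patterns from this configuration. It is an approximation only: it uses Python's `re.search` and assumes Darshan's matching behaves the same way for these simple expressions, which has not been verified here. The helper `matches_any` and the sample file names are hypothetical, and the path-based include entry is left out.

```python
import re

# Patterns copied from the darshan.conf above (path-based include omitted).
exclude_patterns = [r"\.py", r"\.pyc.", r"\.pth"]
include_patterns = [r"\.npy"]


def matches_any(patterns, name):
    """Return True if any pattern is found anywhere in the name."""
    return any(re.search(p, name) is not None for p in patterns)


# Sample names chosen for illustration only.
names = ["matrix.py", "matrix.npy", "matrix.npz", "site.pth"]

# Map each name to (matches an exclude pattern, matches an include pattern).
results = {
    name: (matches_any(exclude_patterns, name), matches_any(include_patterns, name))
    for name in names
}

for name, (excluded, included) in results.items():
    print(f"{name:12s} exclude={excluded} include={included}")
```

Under that assumption, `matrix.npy` matches only the include pattern, and `matrix.npz` matches neither list.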
