Description
Hello,
I am currently profiling the I/O operations of a radio astronomy Python tool that uses NumPy and AstroPy. However, I noticed that Darshan does not correctly report the size of the I/O performed by these libraries.
I tested with two different Python / NumPy version combinations:
- NumPy 2.2.6 / 1.22.4
- Python 3.13.7 / 3.9.2
Using:
- Darshan lib_ver = 3.4.7
Example
I am writing and reading a matrix of size 1000 x 1000, which corresponds to approximately 8,000,000 bytes in raw data.
- For .npy:
  matrix.npy: 4224 bytes (read=4096, write=128)
- For .npz (compressed matrix):
  matrix.npz: 16,013,028 bytes (read=8012699, write=8000329)
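For reference, the expected sizes can be checked independently of Darshan. The following is a minimal sketch and not part of the original script: the helper name `expected_sizes` is hypothetical, and it uses a zero-filled array, on the assumption that only the shape and dtype affect the size of an uncompressed .npy file.

```python
import os
import tempfile

import numpy as np


def expected_sizes(shape=(1000, 1000), dtype=np.float64):
    """Return (raw_bytes, file_bytes) for an array written with np.save.

    Hypothetical helper for illustration only.
    """
    matrix = np.zeros(shape, dtype=dtype)
    # Raw payload: number of elements times the item size, no header.
    raw_bytes = matrix.nbytes
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "matrix.npy")
        np.save(path, matrix)
        # Size on disk: .npy header plus the raw payload.
        file_bytes = os.path.getsize(path)
    return raw_bytes, file_bytes


raw_bytes, file_bytes = expected_sizes()
print(f"raw data:   {raw_bytes} bytes")
print(f"file size:  {file_bytes} bytes")
print(f"npy header: {file_bytes - raw_bytes} bytes")
```

If the header size printed here equals the write count Darshan reports for matrix.npy, that would suggest only the header write is being attributed to the file record, while the bulk data transfer is missed or recorded elsewhere.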
I assume this is because of how the matrix is written and read.
Is this a known issue? Or did it perhaps work with an older version of Darshan?
Thank you in advance.
Command:
DARSHAN_CONFIG_PATH="darshan.conf" \
DARSHAN_ENABLE_NONMPI=1 \
LD_PRELOAD="PATH/TO/libdarshan.so" \
python3 matrix.py <format>
Python code of matrix.py:
import numpy as np
import sys

# Check argument for file type: 'npz' or 'npy'
filetype = sys.argv[1].lower() if len(sys.argv) > 1 else 'npy'

# Create a 1000x1000 random matrix
matrix = np.random.rand(1000, 1000)

if filetype == 'npz':
    # Save compressed npz
    np.savez_compressed('matrix.npz', matrix=matrix)
    print("Matrix saved as 'matrix.npz' (compressed)")
    # Load npz
    data = np.load('matrix.npz')
    loaded = data['matrix']
    print(f"Loaded matrix shape: {loaded.shape}")
elif filetype == 'npy':
    # Save uncompressed npy
    np.save('matrix.npy', matrix)
    print("Matrix saved as 'matrix.npy' (uncompressed)")
    # Load npy
    loaded = np.load('matrix.npy')
    print(f"Loaded matrix shape: {loaded.shape}")
else:
    print("Error: Unknown filetype argument. Use 'npy' or 'npz'.")
    sys.exit(1)

# Check equality
print("Equal?", np.array_equal(matrix, loaded))

darshan.conf:
MODMEM 64
NAMEMEM 128
MAX_RECORDS 65536 POSIX,STDIO,MPI-IO
NAME_EXCLUDE \.py,\.pyc.,\.pth *
NAME_INCLUDE \.npy,PATH/TO/DIREC/darshan_npy POSIX,STDIO,MPI-IO