Skip to content

(fix): skip X writing when X is None#2054

Merged
flying-sheep merged 10 commits into
mainfrom
skip-x-again
Jul 29, 2025
Merged

(fix): skip X writing when X is None#2054
flying-sheep merged 10 commits into
mainfrom
skip-x-again

Conversation

@flying-sheep
Copy link
Copy Markdown
Member

@flying-sheep flying-sheep commented Jul 28, 2025

The removal of the str stuff is unrelated, I just didn’t want to step over it anymore without need (we don’t support the zarr version anymore that’s mentioned in the comment that addresses the str conversion)

@codecov
Copy link
Copy Markdown

codecov Bot commented Jul 28, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 87.60%. Comparing base (40b6ccf) to head (a3b23fb).
⚠️ Report is 47 commits behind head on main.

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #2054   +/-   ##
=======================================
  Coverage   87.60%   87.60%           
=======================================
  Files          46       46           
  Lines        7049     7052    +3     
=======================================
+ Hits         6175     6178    +3     
  Misses        874      874           
Files with missing lines Coverage Δ
src/anndata/_io/h5ad.py 93.39% <100.00%> (+0.22%) ⬆️
src/anndata/_io/specs/methods.py 91.07% <100.00%> (+0.01%) ⬆️
src/anndata/_io/zarr.py 82.92% <ø> (-0.99%) ⬇️

@flying-sheep flying-sheep requested a review from ilan-gold July 28, 2025 11:56
@flying-sheep flying-sheep added this to the 0.12.2 milestone Jul 28, 2025
@flying-sheep flying-sheep changed the title (fix): skip X writing if X is None (fix): skip X writing when X is None Jul 28, 2025
Comment thread src/anndata/_io/h5ad.py
if "X" in as_dense and isinstance(x, CSMatrix | BaseCompressedSparseDataset):
write_sparse_as_dense(f, "X", x, dataset_kwargs=dataset_kwargs)
elif x is None:
f.pop("X", None)
Copy link
Copy Markdown
Member Author

@flying-sheep flying-sheep Jul 28, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

adding this pop branch make Ruff complain about too high complexity, so I extracted _write_x and _write_raw.

The logic should otherwise be the same for raw, and almost the same for X:

  1. check if X is to be written sparse-to-dense, if yes, do it, else
  2. check if X is backed and the file path matches, if yes we’re done with X, else
  3. (new) if X is None, remove "X" from the HDF5 group if it’s there, else
  4. write X normally to the file

Copy link
Copy Markdown
Contributor

@ilan-gold ilan-gold left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

does it make sense to add a bytes-for-bytes test for the underlying file?

@flying-sheep
Copy link
Copy Markdown
Member Author

flying-sheep commented Jul 28, 2025

minimal AnnData files are byte-identical between 0.11.4 and this branch:

$ uvx --with=anndata==0.11.4 python -c 'from anndata import AnnData; adata = AnnData(shape=(10, 20)); adata.write_h5ad("0.11.h5py")'
$ hatch -e hatch-test.stable run python -c 'from anndata import read_h5ad; read_h5ad("0.11.h5py").write_h5ad("0.12.h5py")'
$ md5sum 0.11.h5py 0.12.h5py 
a8cdc9539050b6523f529073636733cc  0.11.h5py
a8cdc9539050b6523f529073636733cc  0.12.h5py

maximalist anndata (with X, so regardless of this PR) isn’t:

$ uvx --with=anndata[test]==0.11.4 --with=numba>=0.61 python -c 'from anndata.tests.helpers import gen_adata; adata = gen_adata((10, 20)); adata.write_h5ad("0.11.h5py")'
$ hatch -e hatch-test.stable run python -c 'from anndata import read_h5ad; read_h5ad("0.11.h5py").write_h5ad("0.12.h5py")'
$ md5sum 0.11.h5py 0.12.h5py 
74cbadb983b65dccd116c2342abdde30  0.11.h5py
a33f2f9244501426ca941a9b3299245e  0.12.h5py

@flying-sheep flying-sheep requested a review from ilan-gold July 28, 2025 15:49
@flying-sheep
Copy link
Copy Markdown
Member Author

The HDF5 diff test works locally, I’m requesting your review now, and if it fails on CI, I’ll of course install it somehow and I’ll make sure it runs before merging.

@flying-sheep flying-sheep enabled auto-merge (squash) July 29, 2025 08:01
@flying-sheep flying-sheep disabled auto-merge July 29, 2025 08:04
@flying-sheep flying-sheep enabled auto-merge (squash) July 29, 2025 08:06
@flying-sheep flying-sheep merged commit b9af951 into main Jul 29, 2025
19 checks passed
@flying-sheep flying-sheep deleted the skip-x-again branch July 29, 2025 08:19
meeseeksmachine pushed a commit to meeseeksmachine/anndata that referenced this pull request Jul 29, 2025
flying-sheep added a commit that referenced this pull request Jul 29, 2025
Co-authored-by: Philipp A <flying-sheep@web.de>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Writing to H5AD has changed in v0.12.1

2 participants