Add dataflow/move_anndata_slots component #1163

Open
jakubmajercik wants to merge 6 commits into main from add-move-anndata-slots-component

Conversation

@jakubmajercik jakubmajercik commented Apr 14, 2026

Collaborator

Changelog

Add a new dataflow/move_anndata_slots component that selectively moves AnnData slots (.obs, .var, .obsm, .varm, .obsp, .varp, .uns) from a modality in a source MuData file into a modality in a target MuData file, without performing a full merge. Supports cross-modality transfers via --target_modality.

Issue ticket number and link

N/A

Checklist before requesting a review

  • I have performed a self-review of my code

  • Conforms to the Contributor's guide

  • Check the correct box. Does this PR contain:

    • Breaking changes
    • New functionality
    • Major changes
    • Minor changes
    • Documentation
    • Bug fixes
  • Proposed changes are described in the CHANGELOG.md

  • CI tests succeed!

@jakubmajercik jakubmajercik marked this pull request as ready for review April 14, 2026 14:39
Comment thread src/dataflow/move_anndata_slots/test.py Outdated
Comment thread src/dataflow/move_anndata_slots/test.py
Comment thread src/dataflow/move_anndata_slots/script.py Outdated
Comment thread src/dataflow/move_anndata_slots/config.vsh.yaml Outdated
Comment thread src/dataflow/move_anndata_slots/script.py Outdated
Comment thread src/dataflow/move_anndata_slots/config.vsh.yaml Outdated
@jakubmajercik jakubmajercik requested a review from dorien-er April 15, 2026 13:12
@@ -0,0 +1,141 @@
name: move_anndata_slots
namespace: "dataflow"

Member

Let's use metadata as the namespace.

Move slots (.obs, .var, .obsm, .varm, .obsp, .varp, .uns) from a modality
in a source MuData file into a modality in a target MuData file.
The specified slots are copied from the source modality into the target
modality, overwriting any existing data at those slots.

@DriesSchaumont DriesSchaumont Apr 28, 2026

Member

What happens if the dimensions do not match? (Please add this to the description if they need to match.)
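As a rough illustration of why the question matters, here is a sketch in plain pandas rather than the component itself. It assumes the slots are eventually aligned by reindexing; the tables and cell labels are invented. A bare reindex does not fail on a mismatch, it fills the unmatched rows with NaN, so an explicit check before copying is worth documenting.

```python
import pandas as pd

# Hypothetical .obs table and target index; the labels are invented for this sketch.
source_obs = pd.DataFrame({"cell_type": ["T", "B"]}, index=["cell_1", "cell_2"])
target_index = pd.Index(["cell_2", "cell_3"])

# Reindexing to the target silently introduces NaN for labels missing in the source.
aligned = source_obs.reindex(target_index)
n_missing = int(aligned["cell_type"].isna().sum())

print(aligned)
print(n_missing)
```

Here `cell_3` has no counterpart in the source, so its row ends up empty instead of raising an error.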

needs_var = any(par[s] for s in ("var", "varm", "varp"))

mismatches = []
if needs_obs and set(source_mod.obs_names) != set(target_mod.obs_names):

Member

I think it's best to use https://pandas.pydata.org/docs/reference/api/pandas.Index.get_indexer.html#pandas.Index.get_indexer here. It returns -1 where no match is found.

This uses the same method as reindexing (which is used later in this script), so it takes dtypes etc. into account.
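A minimal sketch of the suggested check, using invented observation names rather than the PR's data. `Index.get_indexer` returns, for each requested label, its position in the index it is called on, or -1 when the label is absent. Note that it requires the index it is called on to contain unique values.

```python
import pandas as pd

# Hypothetical observation names; not taken from the PR's test data.
source_names = pd.Index(["cell_a", "cell_b", "cell_c"])
target_names = pd.Index(["cell_c", "cell_a", "cell_x"])

# Position of each target label in the source index, or -1 when it is absent.
indexer = source_names.get_indexer(target_names)
missing = target_names[indexer == -1]

print(indexer.tolist())  # [2, 0, -1]
print(missing.tolist())  # ['cell_x']
```

Compared with a set comparison, this also yields the labels that failed to match, which can go straight into the error message.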

Comment on lines +69 to +72
"Index mismatch between source and target modalities: "
+ " and ".join(mismatches)
+ " indices do not match."
)

Member

Use string formatting here.
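One possible reading of this request, sketched with an f-string in place of the `+` concatenation. The contents of `mismatches` are assumed here; the diff only shows that it is a list that gets joined with " and ".

```python
# Hypothetical list of mismatching axes, as the surrounding diff appears to build.
mismatches = ["obs", "var"]

# Adjacent string literals are concatenated; only the second one needs formatting.
message = (
    "Index mismatch between source and target modalities: "
    f"{' and '.join(mismatches)} indices do not match."
)

print(message)
```

The resulting text is identical to the concatenated version, so tests that match on the message would not need to change.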

)

# Reindex source to match target order if needed.
if needs_obs and not (source_mod.obs_names == target_mod.obs_names).all():

Member

Same as above: use Index.get_indexer here as well.
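A sketch of how the order check and the reordering could share a single `get_indexer` call. This is an assumption about what the reviewer has in mind, not the script's actual code, and the table and labels are made up.

```python
import numpy as np
import pandas as pd

# Hypothetical source table and target order; both hold the same labels.
source_obs = pd.DataFrame({"score": [0.1, 0.2, 0.3]}, index=["c1", "c2", "c3"])
target_names = pd.Index(["c3", "c1", "c2"])

# Positions of the target labels inside the source index (-1 means missing).
indexer = source_obs.index.get_indexer(target_names)
if (indexer == -1).any():
    raise ValueError("Some target labels are missing from the source.")

# Reorder only when the source is not already in target order.
already_aligned = np.array_equal(indexer, np.arange(len(source_obs)))
reordered = source_obs if already_aligned else source_obs.iloc[indexer]

print(reordered.index.tolist())  # ['c3', 'c1', 'c2']
```

Unlike an element-wise `==` between two indices, this does not depend on both having the same length.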

run_component(
[
"--input_source",
str(source_h5mu_path),

Member

I don't think str is required here anymore. Check other places as well.
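For context, a small sketch of why the `str(...)` wrapper may be unnecessary. It assumes `run_component` ultimately hands its argument list to `subprocess`, which the PR page does not confirm. Since Python 3.6 on POSIX (3.8 on Windows), `subprocess` accepts path-like objects inside the argument sequence.

```python
import subprocess
import sys
from pathlib import Path

# Hypothetical input path; the file does not need to exist for this demo.
source_h5mu_path = Path("source.h5mu")

# The child process simply echoes its first argument back.
result = subprocess.run(
    [sys.executable, "-c", "import sys; print(sys.argv[1])", source_h5mu_path],
    capture_output=True,
    text=True,
    check=True,
)
echoed = result.stdout.strip()

print(echoed)
```

If `run_component` does something else with its arguments (for example joining them into a string), the conversion would still be needed there.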

@jakubmajercik

Collaborator Author

Consolidated into #1166 as a unified metadata/copy_modality_slots component covering both move and insert use cases. Will close this PR once #1166 lands.


3 participants