Concatenate from dictionary #998

@revalescente

Hello,
I have 24 samples, all with the same elements and element names. I'm trying to concatenate them, but it doesn't work: it fails at the first sample and I don't understand why. I read the data into a dict of the form {sample_name: sdata}.

spe_dict = sd.concatenate(spe_blocks, concatenate_tables=True, region_key="region", instance_key="cell_id")

>>> spe_blocks[samples[0]]['nuclei_counts'].obs
                     region       slide     cell_id  ...    extent  major_axis_length  minor_axis_length
aaaaaaaa-1  filtered_nuclei  full_image  aaaaaaaa-1  ...  0.734336          22.390856          16.715088
aaaaaaab-1  filtered_nuclei  full_image  aaaaaaab-1  ...  0.665591          35.952982          22.190399
aaaaaaac-1  filtered_nuclei  full_image  aaaaaaac-1  ...  0.773288          34.647159          25.419906
aaaaaaad-1  filtered_nuclei  full_image  aaaaaaad-1  ...  0.737557          27.155658          15.655043
aaaaaaaf-1  filtered_nuclei  full_image  aaaaaaaf-1  ...  0.554286          27.588200           9.553221
...                     ...         ...         ...  ...       ...                ...                ...
aaaakaaf-1  filtered_nuclei  full_image  aaaakaaf-1  ...  0.572727          23.858483          10.606041
aaaakaah-1  filtered_nuclei  full_image  aaaakaah-1  ...  0.577640          29.105765          18.097145
aaaakaai-1  filtered_nuclei  full_image  aaaakaai-1  ...  0.620915          17.394626           7.040765
aaaakaaj-1  filtered_nuclei  full_image  aaaakaaj-1  ...  0.448889          17.789848           8.546664
aaaakaak-1  filtered_nuclei  full_image  aaaakaak-1  ...  0.501832          19.694033          10.515163

[31755 rows x 9 columns]
>>> spe_blocks[samples[0]]['nuclei_counts'].uns
{'sopa_attrs': {'intensities': True, 'transcripts': True}, 'spatialdata_attrs': {'instance_key': 'cell_id', 'region': 'filtered_nuclei', 'region_key': 'region'}}

>>> spe_blocks[samples[0]]
SpatialData object
├── Images
│     ├── 'full_image': DataTree[cyx] (3, 8000, 16166), (3, 4000, 8083), (3, 2000, 4041), (3, 1000, 2020), (3, 500, 1010)
│     └── 'raster_nuclei': DataArray[cyx] (1, 7661, 13624)
├── Shapes
│     ├── 'filtered_bins': GeoDataFrame shape: (384309, 4) (2D shapes)
│     └── 'filtered_nuclei': GeoDataFrame shape: (31755, 1) (2D shapes)
└── Tables
      └── 'nuclei_counts': AnnData (31755, 32285)
with coordinate systems:
    ▸ 'blocco1', with elements:
        full_image (Images), raster_nuclei (Images), filtered_bins (Shapes), filtered_nuclei (Shapes)

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/mnt/europa/valerio/.conda/envs/deepl-segment-bio/lib/python3.11/site-packages/spatialdata/_core/concatenate.py", line 139, in concatenate
    sdatas = _fix_ensure_unique_element_names(
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/mnt/europa/valerio/.conda/envs/deepl-segment-bio/lib/python3.11/site-packages/spatialdata/_core/concatenate.py", line 257, in _fix_ensure_unique_element_names
    sdata = SpatialData.init_from_elements(elements, tables=tables)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/mnt/europa/valerio/.conda/envs/deepl-segment-bio/lib/python3.11/site-packages/spatialdata/_core/spatialdata.py", line 2298, in init_from_elements
    return cls(**elements_dict, attrs=attrs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/mnt/europa/valerio/.conda/envs/deepl-segment-bio/lib/python3.11/site-packages/spatialdata/_utils.py", line 270, in wrapper
    return f(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^
  File "/mnt/europa/valerio/.conda/envs/deepl-segment-bio/lib/python3.11/site-packages/spatialdata/_core/spatialdata.py", line 156, in __init__
    with raise_validation_errors(
  File "/mnt/europa/valerio/.conda/envs/deepl-segment-bio/lib/python3.11/site-packages/spatialdata/_core/validation.py", line 382, in __exit__
    raise ValidationError(title=self._message, errors=self._collector.errors)
spatialdata._core.validation.ValidationError: Cannot construct SpatialData object, input contains invalid elements.
For renaming, please see the discussion here https://github.com/scverse/spatialdata/discussions/707 .
  tables/nuclei_counts: value.index does not match parent’s obs names:
Index are different

Index values are different (100.0 %)
[left]:  Index(['aaaaaaaa-1', 'aaaaaaab-1', 'aaaaaaac-1', 'aaaaaaad-1', 'aaaaaaaf-1',
       'aaaaaaag-1', 'aaaaaaam-1', 'aaaaaaap-1', 'aaaaaabb-1', 'aaaaaabd-1',
       ...
       'aaaajppo-1', 'aaaajppp-1', 'aaaakaaa-1', 'aaaakaab-1', 'aaaakaad-1',
       'aaaakaaf-1', 'aaaakaah-1', 'aaaakaai-1', 'aaaakaaj-1', 'aaaakaak-1'],
      dtype='object', length=31755)
[right]: Index(['aaaaaaaa-1-blocco1_c26STAT3', 'aaaaaaab-1-blocco1_c26STAT3',
       'aaaaaaac-1-blocco1_c26STAT3', 'aaaaaaad-1-blocco1_c26STAT3',
       'aaaaaaaf-1-blocco1_c26STAT3', 'aaaaaaag-1-blocco1_c26STAT3',
       'aaaaaaam-1-blocco1_c26STAT3', 'aaaaaaap-1-blocco1_c26STAT3',
       'aaaaaabb-1-blocco1_c26STAT3', 'aaaaaabd-1-blocco1_c26STAT3',
       ...
       'aaaajppo-1-blocco1_c26STAT3', 'aaaajppp-1-blocco1_c26STAT3',
       'aaaakaaa-1-blocco1_c26STAT3', 'aaaakaab-1-blocco1_c26STAT3',
       'aaaakaad-1-blocco1_c26STAT3', 'aaaakaaf-1-blocco1_c26STAT3',
       'aaaakaah-1-blocco1_c26STAT3', 'aaaakaai-1-blocco1_c26STAT3',
       'aaaakaaj-1-blocco1_c26STAT3', 'aaaakaak-1-blocco1_c26STAT3'],
      dtype='object', length=31755)
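
Reading the two indexes in the error: [left] holds the original obs names and [right] holds the same names with the sample name appended as a suffix. One possible explanation, which is only an assumption and not confirmed anywhere in this report, is that the obs names were renamed while some index-aligned DataFrame stored inside the table kept the old names. The pandas-only sketch below reproduces that kind of mismatch on made-up data. It does not call spatialdata or anndata, and the names `obs_names`, `aligned`, and `mismatch_fraction` are hypothetical.

```python
import pandas as pd

# Made-up stand-ins (assumption): three obs names in the style of the report,
# and a DataFrame that is supposed to share the same index as the table.
obs_names = pd.Index(["aaaaaaaa-1", "aaaaaaab-1", "aaaaaaac-1"])
aligned = pd.DataFrame({"intensity": [0.1, 0.2, 0.3]}, index=obs_names)

# Suffix every obs name with the sample name, matching the pattern
# visible in the [right] index of the traceback.
suffix = "blocco1_c26STAT3"
renamed = pd.Index([f"{name}-{suffix}" for name in obs_names])


def mismatch_fraction(left: pd.Index, right: pd.Index) -> float:
    """Fraction of positions where two equal-length indexes differ."""
    return float((left != right).mean())


# The DataFrame still carries the old names, so every position differs.
before = mismatch_fraction(aligned.index, renamed)

# Renaming the DataFrame's index the same way brings the two back in sync.
aligned.index = renamed
after = mismatch_fraction(aligned.index, renamed)

print(f"mismatch before: {before:.0%}, after: {after:.0%}")
```

If this is what is happening, checking every index-aligned DataFrame in the table against the obs names before concatenating would show which one is out of sync.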
