
Remove Composite Structure Family #1093

Merged
danielballan merged 18 commits into bluesky:main from genematx:remove-composite
Aug 26, 2025

Conversation

@genematx (Contributor) commented Aug 15, 2025

This pull request demotes Composite containers from a structure_family to a spec, simplifying the set of structures that Tiled supports. Validation is performed both client-side (when adding data to an existing container with the composite spec) and server-side (when creating a new container or adding the composite spec to an existing one).

Main Changes

  • Refactored test logic to use create_container(..., specs=["composite"]) instead of the deprecated create_composite, aligning with the new API and improving clarity.
  • Migrated all composite container and table writing tests from tiled/_tests/test_writing.py to tiled/_tests/test_composite.py, expanding coverage with new tests for edge cases such as key collisions and validator logic.
  • Improved spec validation and propagation logic in adapters and validation tests by consistently using the Spec model and updating argument order for custom validators.
  • Added a key property to catalog adapters and cleaned up the lookup_adapter logic to remove unnecessary resolution for composite nodes.
  • Replaced composite.parts with composite.base for accessing the underlying container structure, including the constituent tables; it returns an instance of the usual Container client.
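The key-collision validation mentioned above can be sketched roughly as follows. This is a hypothetical illustration, not the actual Tiled implementation: the function name `check_composite_keys` and the `parts` mapping are invented for this sketch. The idea is that in a composite container every table column and every non-table key must be unique once the namespace is flattened.

```python
def check_composite_keys(parts):
    """Check the flattened namespace of a composite container for collisions.

    `parts` maps each child key to a dict describing it; table entries carry
    a "columns" list, other structures contribute their own key. Raises
    ValueError on a collision, otherwise returns the sorted flattened keys.
    """
    seen = {}
    for key, info in parts.items():
        # Tables contribute their column names; arrays etc. contribute their key.
        flat_keys = info.get("columns", [key])
        for fk in flat_keys:
            if fk in seen:
                raise ValueError(
                    f"Key collision: {fk!r} appears in both {seen[fk]!r} and {key!r}"
                )
            seen[fk] = key
    return sorted(seen)

# Mirrors the demo below: one table with three columns plus two arrays.
keys = check_composite_keys({
    "table1": {"columns": ["colA", "colB", "colC"]},
    "arr1": {},
    "arr2": {},
})
```

A collision, such as an array named `colA` alongside `table1`, would raise immediately, which is the behavior the client- and server-side checks are meant to enforce.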

Usage Example

import numpy as np
import pandas as pd

from tiled.client import from_uri

c = from_uri("http://localhost:8000", api_key="secret")

X = c.create_container("X", specs=["composite"])

df = pd.DataFrame({"colA": np.random.randn(3),
                   "colB": np.random.randint(0, 10, 3),
                   "colC": np.random.choice(["a", "b", "c", "d", "e"], 3)})
X.write_dataframe(df, key="table1", specs=["table_spec"])
X.write_array(np.random.randn(3, 4), key="arr1")
X.write_array(np.random.randn(3, 4), key="arr2")
In [2]: c['X']
Out[2]: <CompositeClient {'colA', 'colB', 'colC', 'arr1', 'arr2'}>

In [3]: len(c['X'])
Out[3]: 5

In [4]: c['X'].base
Out[4]: <Container {'table1', 'arr1', 'arr2'}>

In [5]: len(c['X'].base)
Out[5]: 3

In [6]: c['X']['colA']
Out[6]: <ArrayClient shape=(3,) chunks=((3,)) dtype=float64>

In [7]: c['X']['table1']
KeyError: "Key 'table1' not found. If it refers to a table, access it via the base Container client using `.base['table1']` instead."

In [8]: c['X'].base['table1']
Out[8]: <DataFrameClient ['colA', 'colB', 'colC']>

In [9]: c['X'].read()
Out[9]:
<xarray.Dataset> Size: 252B
Dimensions:  (dim0: 3, arr1_dim1: 4, arr2_dim1: 4)
Dimensions without coordinates: dim0, arr1_dim1, arr2_dim1
Data variables:
    colA     (dim0) float64 24B 0.5138 -1.372 -0.6688
    colB     (dim0) int64 24B 6 3 8
    colC     (dim0) <U1 12B 'c' 'e' 'a'
    arr1     (dim0, arr1_dim1) float64 96B -0.6288 0.2017 ... -0.3045 -0.4624
    arr2     (dim0, arr2_dim1) float64 96B 0.03816 -1.46 ... 0.6756 -0.9974

In [10]: c['X'].base.read()
AttributeError: 'Container' object has no attribute 'read'

Testing Alembic Migration

The Alembic migration scripts (upgrade and downgrade) have been tested with PostgreSQL and SQLite catalogs.

Testing Script and Results

Script to generate the testing data

from pathlib import Path
from tiled.structures.core import Spec
from bluesky import RunEngine
from bluesky.callbacks.tiled_writer import TiledWriter
import bluesky.plans as bp
from tiled.client import from_uri
from ophyd.sim import det, hw
import numpy as np
import pandas as pd

import stamina
stamina.set_active(False)  # Disable retries globally

client = c = from_uri("http://localhost:8000", api_key="secret")   # , include_data_sources=True

Y = c.create_container("Y", specs=[Spec("spec_3", "3.0")])

# Initialize the RunEngine and subscribe TiledWriter
RE = RunEngine()
tw = TiledWriter(c['Y'])
RE.subscribe(tw)

##### Internal Data Collection #####
uid, = RE(bp.count([det], 10))

#### External Data Collection #####
save_path = "./sandbox/storage/external"
Path(save_path).mkdir(parents=True, exist_ok=True)
uid, = RE(bp.count([hw(save_path=save_path).img], 10))

X = c.create_composite("X", specs=["my_spec", Spec("spec_2", "2.0")])  #, specs=["composite"])

df = pd.DataFrame({"colA": np.random.randn(3),
                   "colB": np.random.randint(0, 10, 3),
                   "colC": np.random.choice(["a", "b", "c", "d", "e"], 3)})
X.write_dataframe(df, specs=["data_frame_spec"])
X.write_array(np.random.randn(3, 4), key="arr1")
X.write_array(np.random.randn(3, 4), key="arr2")

for _ in range(3):
    Y.create_container()
Y.write_dataframe(df, specs=["data_frame_spec"])
Y.write_array(np.random.randn(3, 4), key="arr1")

PostgreSQL database before

catalog=# select id, parent, structure_family, specs from nodes;
 id | parent | structure_family |                                         specs
----+--------+------------------+----------------------------------------------------------------------------------------
  0 |        | container        | []
  1 |      0 | container        | [{"name": "spec_3", "version": "3.0"}]
  3 |      2 | container        | []
  2 |      1 | container        | [{"name": "BlueskyRun", "version": "3.0"}]
  7 |      6 | container        | []
 10 |      8 | array            | []
  6 |      1 | container        | [{"name": "BlueskyRun", "version": "3.0"}]
 13 |     11 | array            | []
 14 |     11 | array            | []
 15 |      1 | container        | []
 16 |      1 | container        | []
 17 |      1 | container        | []
 18 |      1 | table            | [{"name": "data_frame_spec", "version": null}]
 19 |      1 | array            | []
 11 |      0 | composite        | [{"name": "my_spec", "version": null}, {"name": "spec_2", "version": "2.0"}]
  8 |      7 | composite        | [{"name": "BlueskyEventStream", "version": "3.0"}]
  4 |      3 | composite        | [{"name": "BlueskyEventStream", "version": "3.0"}]
  5 |      4 | table            | [{"name": "flattened", "version": null}]
  9 |      8 | table            | [{"name": "flattened", "version": null}]
 12 |     11 | table            | [{"name": "data_frame_spec", "version": null}, {"name": "flattened", "version": null}]
(20 rows)

catalog=# select id, node_id, structure_family from data_sources;
 id | node_id | structure_family
----+---------+------------------
  1 |       5 | table
  2 |       9 | table
  3 |      10 | array
  4 |      12 | table
  5 |      13 | array
  6 |      14 | array
  7 |      18 | table
  8 |      19 | array
(8 rows)

PostgreSQL database after

catalog=# select id, parent, structure_family, specs from nodes;
 id | parent | structure_family |                                                        specs
----+--------+------------------+----------------------------------------------------------------------------------------------------------------------
  0 |        | container        | []
  1 |      0 | container        | [{"name": "spec_3", "version": "3.0"}]
  3 |      2 | container        | []
  2 |      1 | container        | [{"name": "BlueskyRun", "version": "3.0"}]
  7 |      6 | container        | []
 10 |      8 | array            | []
  6 |      1 | container        | [{"name": "BlueskyRun", "version": "3.0"}]
 13 |     11 | array            | []
 14 |     11 | array            | []
 15 |      1 | container        | []
 16 |      1 | container        | []
 17 |      1 | container        | []
 18 |      1 | table            | [{"name": "data_frame_spec", "version": null}]
 19 |      1 | array            | []
 11 |      0 | container        | [{"name": "my_spec", "version": null}, {"name": "spec_2", "version": "2.0"}, {"name": "composite", "version": null}]
  8 |      7 | container        | [{"name": "BlueskyEventStream", "version": "3.0"}, {"name": "composite", "version": null}]
  4 |      3 | container        | [{"name": "BlueskyEventStream", "version": "3.0"}, {"name": "composite", "version": null}]
  5 |      4 | table            | []
  9 |      8 | table            | []
 12 |     11 | table            | [{"name": "data_frame_spec", "version": null}]
(20 rows)

catalog=# select id, node_id, structure_family from data_sources;
 id | node_id | structure_family
----+---------+------------------
  1 |       5 | table
  2 |       9 | table
  3 |      10 | array
  4 |      12 | table
  5 |      13 | array
  6 |      14 | array
  7 |      18 | table
  8 |      19 | array
(8 rows)
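The per-node transformation visible in the before/after dumps can be sketched as follows. This is an illustrative reconstruction of what the upgrade does to each row, not the actual migration code: nodes with structure_family "composite" become "container" and gain a {"name": "composite"} spec, while the internal "flattened" spec on child tables is dropped.

```python
def upgrade_node(structure_family, specs):
    """Sketch of the per-row upgrade implied by the tables above.

    `specs` is a list of {"name": ..., "version": ...} dicts, as stored in
    the nodes table. Returns the upgraded (structure_family, specs) pair.
    """
    # The internal "flattened" spec is no longer needed and is removed.
    specs = [s for s in specs if s["name"] != "flattened"]
    if structure_family == "composite":
        # Composite is demoted from a structure family to a spec.
        structure_family = "container"
        specs = specs + [{"name": "composite", "version": None}]
    return structure_family, specs

# Node 11 from the dump: composite with two user specs.
fam, specs = upgrade_node(
    "composite",
    [{"name": "my_spec", "version": None}, {"name": "spec_2", "version": "2.0"}],
)
```

Applied to node 11, this yields a container whose specs end with the composite spec, and applied to node 12 it strips the "flattened" spec while leaving "data_frame_spec" intact, matching the after-migration dump. The downgrade path would invert these steps.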

Issue: #1081
Related:

Checklist

  • Add a Changelog entry
  • Add the ticket number which this PR closes to the comment section

    @property
    def parts(self):
        return CompositeParts(self)

    def base(self):
@genematx (Contributor, Author):

We can call it something else, of course; base is just the shortest relevant name I could think of. In the future, we could also put it in the base class itself and make it a general accessor for the relevant Tiled client without specs -- it shouldn't break the current implementation.

@genematx genematx changed the title WIP: Remove Composite Structure Family Remove Composite Structure Family Aug 20, 2025
@danielballan danielballan merged commit 06577e1 into bluesky:main Aug 26, 2025
5 of 9 checks passed
ZohebShaikh pushed a commit that referenced this pull request Feb 21, 2026
* REV: remove references to composite

* MNT: cleanup commented code

* TST: move tests for composite

* add normalize_spec util

* TST: fix tests for Composite client

* TST: fix tests on Windows and with py3.9

* ENH: server-side validation of the composite spec

* MNT: bring back codecov

* ENH: add the forward alembic migration script

* ENH: add downgrade alembic path for migration

* ENH: upgrade/downgrade data_sources table

* MNT: changelog

* DOC: update docs

* DOC: update docs

* ENH: remove composite parts

* FIX: f-string

---------

Co-authored-by: Dan Allan <dallan@bnl.gov>
