Allow Multipart hdf5 Datasets #19

Open
genematx wants to merge 1 commit into bluesky:main from genematx:multipart-hdf5

@genematx
Contributor

This enables support for consolidating datasets that consist of multiple HDF5 files. The HDF5Adapter in Tiled can already accept multiple URIs and concatenate them into a single array. However, when the filenames are passed in the Resource document as a template, HDF5Consolidator was unable to register these files for each Datum (as MultipartRelatedConsolidator does for sequences of TIFF or JPEG files).

This PR declares a new MultipartHDF5Consolidator to be used for datasets of "application/x-hdf5" mimetype, but only if the Resource document declares a "template" in resource_kwargs (or parameters in StreamResource). Otherwise, the usual HDF5Consolidator is used for the same mimetype.
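The selection rule described above can be sketched as follows. This is only an illustration of the dispatch logic, not the code in this PR: the two classes are empty stand-ins, and `select_hdf5_consolidator`, `HDF5_MIMETYPE`, and the example template string are hypothetical names introduced here.

```python
HDF5_MIMETYPE = "application/x-hdf5"


class HDF5Consolidator:
    """Empty stand-in for the existing single-file consolidator."""


class MultipartHDF5Consolidator:
    """Empty stand-in for the new multi-file consolidator."""


def select_hdf5_consolidator(mimetype: str, parameters: dict) -> type:
    """Pick a consolidator class for an HDF5 resource (hypothetical helper).

    `parameters` stands for `resource_kwargs` of a Resource document or
    `parameters` of a StreamResource document.
    """
    if mimetype != HDF5_MIMETYPE:
        raise ValueError(f"Unsupported mimetype: {mimetype!r}")
    # A declared "template" signals that the dataset spans several files.
    if "template" in parameters:
        return MultipartHDF5Consolidator
    # Otherwise fall back to the usual single-file consolidator.
    return HDF5Consolidator


# The template value below is made up for illustration.
with_template = select_hdf5_consolidator(HDF5_MIMETYPE, {"template": "img_{:05d}.h5"})
without_template = select_hdf5_consolidator(HDF5_MIMETYPE, {"dataset": "/entry/data/data"})
```

Keying the choice on the presence of "template" leaves the behavior for existing single-file HDF5 resources unchanged.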

Related discussion in Mattermost: https://mattermost.hzdr.de/bluesky/pl/uhpt8h54bpn9iyoxpb65tupr9y

@codecov

codecov bot commented Feb 24, 2026

Codecov Report

❌ Patch coverage is 94.28571% with 2 lines in your changes missing coverage. Please review.
✅ Project coverage is 68.90%. Comparing base (77301fa) to head (117fe58).

| Files with missing lines | Patch % | Lines |
|---|---|---|
| src/bluesky_tiled_plugins/writing/consolidators.py | 83.33% | 2 Missing ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##             main      #19      +/-   ##
==========================================
+ Coverage   68.83%   68.90%   +0.07%     
==========================================
  Files          14       14              
  Lines        1832     1833       +1     
==========================================
+ Hits         1261     1263       +2     
+ Misses        571      570       -1     

@prjemian

Here's one of the cases I want to test, with different values for num_acquire and num_images (changing num_exposures does not change the number of acquired frames).

acquire_time = 0.01
gap_between_frames = 0.001
num_acquire = 2  # num arg to count plan
num_exposures = 1  # number of camera exposures per frame (not supported by all detectors)
num_images = 3  # number of frames per acquire

adsimdet.cam.stage_sigs["acquire_time"] = acquire_time
adsimdet.cam.stage_sigs["image_mode"] = "Multiple"
adsimdet.cam.stage_sigs["num_exposures"] = num_exposures
adsimdet.cam.stage_sigs["num_images"] = num_images

adsimdet.hdf1.stage_sigs["num_capture"] = num_images
adsimdet.hdf1.stage_sigs.move_to_end("capture", last=True)

I'm expecting a 4-D dataset of shape (2, 3, nx, ny), but the data in the HDF5 file has shape (3, nx, ny). Looking at how bluesky.plans.count works, the detector is staged once and the plan then iterates num_acquire times, so the HDF5 plugin is not re-staged for the additional acquisitions.

Not a situation to be fixed here, but one of the challenges in reviewing.
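The shape arithmetic behind that expectation can be sketched with NumPy. This is a toy illustration, not what Tiled or the detector does: it assumes each acquisition had written its own block of num_images frames, and the frame size (nx, ny) = (4, 5) is a placeholder.

```python
import numpy as np

num_acquire, num_images = 2, 3
nx, ny = 4, 5  # placeholder frame size, chosen only for illustration

# Assumption: one block of frames per acquisition, each of shape (3, nx, ny).
# Every block is filled with its acquisition index so it can be traced later.
parts = [np.full((num_images, nx, ny), i) for i in range(num_acquire)]

# Concatenating along the first axis gives one flat stack of frames.
stacked = np.concatenate(parts, axis=0)

# Grouping the frames per acquisition gives the expected 4-D layout.
per_event = stacked.reshape(num_acquire, num_images, nx, ny)

print(stacked.shape)    # (6, 4, 5)
print(per_event.shape)  # (2, 3, 4, 5)
```

With only one block written, as observed here, the stack stays at (3, nx, ny) and cannot be grouped into two events.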

@prjemian

Acquisition results in a single resource and two datum docs.

(uid,) = RE(  # noqa
    bp.count(  # noqa
        [adsimdet],
        num_acquire,
        md=dict(
            title="Area Detector with default HDF5 File Name",
            purpose="image",
        ),
    )
)

run = cat[uid]

but the Tiled client fails to read the dataset:

dataset = run.primary.read()

(10 tries resulting in HTTPStatusError: Server error '500 Internal Server Error').

@prjemian

With this branch, setting num_acquire = 1 and selecting any number of images produces data that tiled.client can read from the HDF5 file.
