Update DynaCell dataset entry (NeurIPS 2026 E&D submission) #3130

Open
mattersoflight wants to merge 6 commits into awslabs:main from czbiohub-sf:dynacell


mattersoflight commented May 5, 2026

Dataset

Name: DynaCell - an Evaluation Framework for Dynamic 3D Virtual Staining of Live Cells

Bucket: s3://dynacell (region us-west-2, public)

Size: ~407 GB (379 GiB) across 42 objects. The v1 release ships 24 OZX-packed OME-Zarr stores covering A549 human lung adenocarcinoma cells imaged on the Mantis correlative label-free / light-sheet fluorescence microscope at Biohub. Four organelle markers (H2B, CAAX, SEC61B, TOMM20) × three perturbation conditions (mock, ZIKV, DENV) give 12 marker-condition combinations, spread over the 24 stores; 262 FOVs total.
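
As a quick sanity check on the two size figures, the decimal-to-binary conversion can be sketched as below. The helper name `gb_to_gib` is hypothetical and used only for illustration; the input value is the approximate figure quoted above.

```python
# Decimal gigabytes (10^9 bytes) versus binary gibibytes (2^30 bytes).
GB = 10**9
GIB = 2**30


def gb_to_gib(size_gb: float) -> float:
    """Convert a size in decimal GB to binary GiB."""
    return size_gb * GB / GIB


# ~407 GB is the approximate bucket size quoted in the description.
size_gib = gb_to_gib(407)
print(f"407 GB is about {size_gib:.0f} GiB")
```

The result rounds to 379 GiB, which agrees with the figure in parentheses.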

License: CC BY 4.0 for the A549 component. The forthcoming v1.1 hiPSC component (derived from the Allen Institute hiPSC Single-cell Image Dataset, Viana et al., Nature 2023) will be redistributed under the Allen Institute Terms of Use; the description notes this distinction.

Croissant metadata: Published at s3://dynacell/v1/metadata/croissant.json with Responsible AI fields, per the NeurIPS Datasets & Benchmarks track requirement.
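
For reviewers who want to open the Croissant file in a browser, the `s3://` URI can be mapped to a virtual-hosted-style HTTPS URL. This is a minimal sketch that assumes the object is anonymously readable, as the public-bucket note above suggests; the helper name `s3_uri_to_https` is hypothetical and not part of any AWS SDK.

```python
from urllib.parse import quote, urlparse


def s3_uri_to_https(uri: str, region: str) -> str:
    """Map s3://bucket/key to https://bucket.s3.<region>.amazonaws.com/key."""
    parsed = urlparse(uri)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"not an s3:// URI: {uri!r}")
    # Percent-encode the key but keep the '/' path separators.
    key = quote(parsed.path.lstrip("/"))
    return f"https://{parsed.netloc}.s3.{region}.amazonaws.com/{key}"


croissant_url = s3_uri_to_https(
    "s3://dynacell/v1/metadata/croissant.json", region="us-west-2"
)
print(croissant_url)
```

The sketch only builds the URL; whether the request succeeds depends on the bucket policy.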

Validation

  • pykwalify -d datasets/dynacell.yaml -s schema.yaml → INFO - validation.valid
  • All 12 tags exist in tags.yaml
  • All Documentation, AuthorURL, and project URLs return HTTP 200
  • ARN matches bucket (arn:aws:s3:::dynacell, region us-west-2)
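
The tag and ARN checks in the list above could be reproduced with something like the following sketch. It is not the registry's own tooling: the function names are hypothetical, and the allowed-tag set is a small illustrative stand-in for the contents of tags.yaml.

```python
def missing_tags(entry_tags, allowed_tags):
    """Return the entry's tags that are absent from the allowed list."""
    allowed = set(allowed_tags)
    return [tag for tag in entry_tags if tag not in allowed]


def arn_matches_bucket(arn: str, bucket: str) -> bool:
    """S3 bucket ARNs have the form arn:aws:s3:::<bucket>."""
    return arn == f"arn:aws:s3:::{bucket}"


# Illustrative stand-in for tags.yaml (assumption: not the real tag list).
allowed = {"image-based profiling", "microscopy"}

# "cell profiling" is the tag this PR replaced with "image-based profiling".
print(missing_tags(["cell profiling", "microscopy"], allowed))
print(missing_tags(["image-based profiling", "microscopy"], allowed))
print(arn_matches_bucket("arn:aws:s3:::dynacell", "dynacell"))
```

An empty list from `missing_tags` corresponds to the "all tags exist" check passing.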

Known follow-ups (not blocking this PR)

  • The Documentation URL points to a feature branch (modular-viscy-staging); this will move to main once the upstream VisCy PR merges.
  • The Publications.URL will be updated to the OpenReview / arXiv DOI once the paper is publicly available (it currently points to the code tree).
  • The Croissant file on S3 is the legacy 1.0 hand-written version; an auto-generated 1.1 is forthcoming from the project's Croissant builder.

mattersoflight and others added 6 commits May 2, 2026 12:04
…schema fixes

- Title now matches the paper: "DynaCell: an Evaluation Framework for
  Dynamic 3D Virtual Staining of Live Cells" (was "A Dynamic 3D
  Live-Cell Imaging Benchmark for Virtual Staining and Cell Profiling")
- Full 15-author list (Kalinin, Zheng, Theodoro, Ivanov, Hirata-Miyasaki,
  Lee, Liu, Varra, Chandler, Pradeep, Liu, Leonetti, Arias, Huang, Mehta)
- Biohub branding: ManagedBy URL, Contact, AuthorURLs all aligned to
  https://www.biohub.org/comp-micro (Computational Microscopy Group)
- Documentation URL points to the VisCy applications/dynacell tree
- iPSC component framed as v1.1 (Allen Institute Terms of Use)
- Tags trimmed to entries that exist in tags.yaml (12 valid;
  image-based profiling replaces cell profiling)
- License field collapsed to a clean CC BY 4.0 link; AICS terms covered
  in Description
- Added RegistryEntryAdded / RegistryEntryLastModified per schema
- pykwalify schema validation: PASS (validation.valid)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
berylrab commented Jun 3, 2026

Contributor

Hi @mattersoflight, checking in to see whether your tutorial is ready for review so we can merge this PR.
