-
Notifications
You must be signed in to change notification settings - Fork 52
WIP: Add Histology, MicroCT, and Electrode Localization Pipelines #1294
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR adds new pipelines for histology, microCT scanning, and electrode localization by introducing several new DataJoint table definitions and merge constructs.
- Introduces microCT and histology data models with scan/preparation metadata and image/table linkages.
- Adds electrode localization tables for both microCT and histology, along with a merge table to collate location information.
- Updates the common region definitions by adding an atlas name field.
Reviewed Changes
Copilot reviewed 13 out of 13 changed files in this pull request and generated 1 comment.
Show a summary per file
| File | Description |
|---|---|
| src/spyglass/micro_ct/v1/micro_ct.py | New microCT scan and image table definitions supporting CT metadata. |
| src/spyglass/histology/v1/histology.py | New histology table definitions for tissue preparation and imaging. |
| src/spyglass/electrode_localization/v1/micro_ct.py | Adds alignment and electrode localization tables for microCT data. |
| src/spyglass/electrode_localization/v1/histology.py | Adds alignment and electrode localization tables for histology images. |
| src/spyglass/electrode_localization/v1/coordinate_system.py | New lookup table for standard coordinate systems. |
| src/spyglass/electrode_localization/localization_merge.py | Merge table for collating electrode localization information. |
| src/spyglass/common/common_region.py | Updated region definitions with an added atlas name field. |
| color_to_stain = NULL: blob # Mapping of color channels to stains (e.g., {'DAPI': 'blue', 'GFAP': 'green'}) | ||
| pixel_size_x: float # (um) Pixel size in X direction | ||
| pixel_size_y: float # (um) Pixel size in Y direction | ||
| pixel_size_z: float # (um) Pixel size in Z direction |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this make sense?
| # Placeholder for actual implementation | ||
| # This function should create an AnalysisNwbfile entry and link it to the Histology entry | ||
| # It should also populate the color_to_stain and scale fields based on the image data | ||
| pass |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How should the analysisNWB file be made? We usually have this correspond to subject + date but this would be per subject.
| --- | ||
| -> CoordinateSystem # The TARGET coordinate system achieved by this registration (e.g., 'allen_ccf_v3_ras_um') | ||
| registration_method : varchar(128) # algorithmic approach, e.g. 'affine+bspline' | ||
| registration_software : varchar(128) # e.g. 'ANTs', 'elastix', 'SimpleITK' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What if there is more than one software?
| registration_software : varchar(128) # e.g. 'ANTs', 'elastix', 'SimpleITK' | ||
| registration_software_version : varchar(64) # e.g. '2.3.5', '1.3.0' | ||
| transformation_matrix = NULL: blob # Store affine matrix if computed/applicable (e.g., 4x4 np.array.tobytes()) | ||
| warp_field_path = NULL: varchar(512) # Store path to warp field file if non-linear |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are these parameters necessary? Should there be a separate params table?
src/spyglass/micro_ct/v1/micro_ct.py
Outdated
| scan_date=NULL: date # Date of the scan itself | ||
| scanner_details: varchar(255) # e.g., 'Nikon Metrology X-Tek HMX ST 225 @ HCNS' | ||
| source_target_type = 'Molybdenum': varchar(64) # X-ray source target material | ||
| filter_material = 'None': varchar(64) # Filter material used (e.g., None, Copper) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are all these necessary? They were reported in the microCT paper
src/spyglass/micro_ct/v1/micro_ct.py
Outdated
| registration_method : varchar(128) # algorithmic approach, e.g. 'affine+bspline' | ||
| registration_software : varchar(128) # e.g. 'ANTs', 'elastix', 'SimpleITK' | ||
| registration_software_version : varchar(64) # e.g. '2.3.5', '1.3.0' | ||
| registration_params = NULL: blob # Store parameters dict/json |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
again, should this be a parameter table?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR introduces new DataJoint table definitions to support the processing pipelines for microCT, histology, and electrode localization, along with updating common region and coordinate system definitions.
- Added microCT pipelines with scan, image, and registration tables.
- Added histology pipelines with preparation, image, and registration tables.
- Introduced electrode localization tables for both microCT and histology data with a merge table and updated the common region schema.
Reviewed Changes
Copilot reviewed 13 out of 13 changed files in this pull request and generated no comments.
Show a summary per file
| File | Description |
|---|---|
| src/spyglass/micro_ct/v1/micro_ct.py | New microCT scan, image, and registration table definitions |
| src/spyglass/histology/v1/histology.py | New histology preparation, image, and registration definitions |
| src/spyglass/electrode_localization/v1/micro_ct.py | Electrode localization using microCT data (naming inconsistency noted) |
| src/spyglass/electrode_localization/v1/histology.py | Electrode localization using histology data (naming inconsistency noted) |
| src/spyglass/electrode_localization/localization_merge.py | Merge table unifying electrode localization entries |
| src/spyglass/common/common_region.py | Updated BrainRegion with atlas_name; added CoordinateSystem lookup |
| src/spyglass/common/init.py | Updated imports to include CoordinateSystem |
Comments suppressed due to low confidence (2)
src/spyglass/electrode_localization/v1/micro_ct.py:9
- The imported name 'MicroCTImage' does not match the defined table 'MicroCTImages' in micro_ct.py. Consider renaming either the import or the table for consistency.
MicroCTImage,
src/spyglass/electrode_localization/v1/histology.py:9
- The imported name 'HistologyImage' does not match the defined table 'HistologyImages' in histology.py. Ensure consistent naming between the import and the table definition.
HistologyImage,
| pixel_size_z: float # (um) Pixel size in Z dimension (often slice_thickness or scan step) | ||
| objective_magnification: float # Magnification of the objective lens (e.g., 20 for 20x) | ||
| image_modality: enum( # Modality of the microscopy | ||
| "fluorescence", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this list exhaustive? Would we want to make this a varchar for flexibility?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same as above: if unsure, then convert
| --- | ||
| identified_feature: varchar(128) # Biological target/marker (e.g., 'Nissl Bodies', 'ChR2-tdTomato+', 'ProbeTrack_DiI') | ||
| visualization_agent: varchar(128)# Method/molecule making feature visible (e.g., 'Cresyl Violet', 'Native tdTomato', 'DiI', 'Alexa 488') | ||
| stain_type: enum( # Type of staining method used |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this list exhaustive? Would we want to make this a varchar for flexibility?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you're not sure, then yes, I'd convert
| subregion_name=subregion_name, | ||
| subsubregion_name=subsubregion_name, | ||
| ) | ||
| query = BrainRegion & key |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider removing this function or only allowing brain regions from a specific atlas.
| subregion_name=NULL: varchar(200) # subregion name | ||
| subsubregion_name=NULL: varchar(200) # subregion within subregion | ||
| region_name: varchar(200) # Name of the region (e.g., 'Hippocampal formation') | ||
| region_abbr=NULL: varchar(64) # Standard abbreviation (e.g., 'HPF') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is having a abbreviation helpful? We already only use the region name.
src/spyglass/common/common_region.py
Outdated
| coordinate_system_id: varchar(64) # Primary key (e.g., 'Allen_CCFv3_RAS_um') | ||
| --- | ||
| description: varchar(255) # Description of the coordinate system | ||
| orientation: enum( # Anatomical orientation convention |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this too flexible?
|
Other systems have ProbeTrack and ElectrodePosition to track insertions and the electrode position separate determining the brain region. Should we add these? |
| scanner_details: varchar(255) # e.g., 'Nikon Metrology X-Tek HMX ST 225 @ HCNS' | ||
| source_target_type='Molybdenum': varchar(64) # X-ray source target material | ||
| filter_material='None': varchar(64) # Filter material used (e.g., 'None', 'Copper') | ||
| filter_thickness_mm=0.0: float # (mm) Filter thickness |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is this overly detailed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It becomes unwieldy if you can't see the full table in a notebook. Part tables can handle subsets of detail. if scan/reconstruction details are likely to be the same across many entries, that's worth being a params-like table:
class ScannerSettings(dj.Manual):
definition = """
scanner_setting_name: varchar(64) # eg NikonXTek_default
---
filter_x : type
voltage: type
...
"""This might also be true of preparation protocol for details like stain/resin/etc
I think I'd like to see this table look more like a selection table:
@schema
class MicroCTScan(SpyglassMixin, dj.Manual):
definition = """
-> Subject
scan_id : int auto_increment
---
scan_date=CURRENT_DATE: date
-> ScannerSettings
-> StainProtocol
... # notes, files, etc
"""|
|
||
|
|
||
| @schema | ||
| class HistologyImages(SpyglassMixin, dj.Computed): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If there are multiple HistologyImage for one Histology entry, would it make more sense to make this a part table of Histology?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@CBroz1 do you have an opinion on this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes? It would be helpful for me to talk through a use case, but yes, moving to a part table seems right
What happens to these images downstream? Is some analysis just going to look at the brain region or electrode coordinates and ignore image-specific info? If so, if no downstream table cares about specific images, then we might be better off just including image-specific info in blobs/nwbs that we're unlikely to fetch - for the sake of streamlining primary keys and cutting down on the table count
The current approach seems low table count but high information. I could either see us increasing the table count to better accommodate the same info, or reducing the info for the v1 version of this pipeline and conceding that the info can be dumped in less-accessible places for the sake of pipeline streamlining
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Currently these images aren't used for electrode localization, but they could be in the future. I guess it depends on how much we want to worry about this in v1.
| # --- Data Source --- | ||
| output_format='TIFF stack': varchar(64) # Format of raw image data from scanner | ||
| raw_scan_path: varchar(512) # Path to the raw output (e.g., folder containing TIFF stack) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are user's intended to access the images from this path or the AnalysisNwbfile? If the latter, we might not need to store it in the table, but rather pass it to an insertion method
|
|
||
|
|
||
| @schema | ||
| class MicroCTImages(SpyglassMixin, dj.Computed): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similar questions as above of whether this should be a part table
CBroz1
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These tables are super detailed. Cutting down on the number of tables in a pipeline make for a more linear process to the end-user, but it means a lot of redundant SKs in tables with many fields. I make the recommendation to separate out to params/protocol/part tables for the sake of cutting down the number of fields in any one table
| subregion_abbr=NULL: varchar(64) # Subregion abbreviation (e.g., 'CA1') | ||
| subsubregion_name=NULL: varchar(200) # Sub-subregion name (e.g., 'stratum pyramidale') | ||
| subsubregion_abbr=NULL: varchar(64) # Sub-subregion abbreviation (e.g., 'sp') | ||
| atlas_source=NULL: varchar(128) # Source atlas (e.g., 'Allen CCF v3', 'Paxinos Rat 6th Ed') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I would make this a nullable foreign key reference to an imported table that handles fetching atlases
| "Please make sure to check the spelling and format." | ||
| "Remove any extra spaces or special characters." | ||
| ) | ||
| cls.insert1(key) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In a meeting we discussed limiting this operation to admin
| pos_x: float # (um) coordinate in the specified space | ||
| pos_y: float # (um) coordinate in the specified space | ||
| pos_z: float # (um) coordinate in the specified space | ||
| -> BrainRegion # Assigned brain region |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nullable? Is there ever a case where this is not known, or the value retrieved doesn't make sense in this case - either a grey area or individual differences for the animal vs averaged atlas?
Is this an imported table? Is BrainRegion calculated based on the positions found?
| pos_x: float # (um) coordinate in the specified space | ||
| pos_y: float # (um) coordinate in the specified space | ||
| pos_z: float # (um) coordinate in the specified space | ||
| -> BrainRegion # Assigned brain region |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same questions as histology table above
| definition = """ | ||
| # Represents a single histology preparation for a subject |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| definition = """ | |
| # Represents a single histology preparation for a subject | |
| definition = # A single histology preparation for a subject |
| definition = """ | ||
| # Metadata for a microCT scan of a subject's brain/tissue | ||
| -> Subject | ||
| scan_id: varchar(32) # User-defined ID (e.g., 'SubjX_OsO4_Scan1') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wouldn't suggest including subj id if that's in the pk
| scan_id: varchar(32) # User-defined ID (e.g., 'SubjX_OsO4_Scan1') | |
| scan_id: varchar(32) # User-defined ID (e.g., 'OsO4_Scan1') |
| output_format='TIFF stack': varchar(64) # Format of raw reconstructed data | ||
| # --- Data Location & Notes --- | ||
| raw_scan_path: varchar(512) # Path to raw output (e.g., folder containing TIFF stack) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This PR includes a few cases of varchar instead of filepath. We seem to only use the filepath attr for nwb files. It adds some complexity to the config, but saves us a lot of cleanup time with various cleanup functions that look for orphaned files. Moving forward, I think I'd advocate for varchars only in cases where we expect the file to change and don't want to update the checksum
| definition = """ | ||
| # Links MicroCTScan info to the Analysis NWB file containing the image volume | ||
| -> MicroCTScan | ||
| images_id: varchar(32) # User-defined ID for these images (e.g., scan_id) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is a varchar pk preferable to the anlysis file being the pk? I think the former has issues with clashing user-defined strings, and the latter is more common across spyglass tables
| definition = """ | ||
| # Stores results/params of aligning microCT image data to a target coordinate system | ||
| -> MicroCTImages # Link to the source microCT NWB file info | ||
| registration_id: varchar(32) # Unique ID for this registration instance/parameters |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As above, I think I might swap out for int here
| --- | ||
| -> CoordinateSystem # The TARGET coordinate system (e.g., 'allen_ccf_v3_ras_um') | ||
| # --- Registration Parameters --- |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As above, maybe protocol/param tables - it might be the case that a generalized registration table could be a params table for both schemas
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## master #1294 +/- ##
==========================================
+ Coverage 69.27% 69.34% +0.06%
==========================================
Files 103 108 +5
Lines 12336 12597 +261
==========================================
+ Hits 8546 8735 +189
- Misses 3790 3862 +72 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
Adds new DataJoint pipelines for microCT, histology, and electrode localization, plus updates to common region definitions.
- Introduce microCT scan, image, and registration tables (
microct_v1schema). - Add histology preparation, stain, image, and registration tables (
histology_v1schema). - Implement electrode localization tables for both modalities and a merge table to collate their outputs.
- Extend
common_regionwith coordinate system lookup and atlas/source fields.
Reviewed Changes
Copilot reviewed 8 out of 13 changed files in this pull request and generated 1 comment.
Show a summary per file
| File | Description |
|---|---|
| src/spyglass/micro_ct/v1/micro_ct.py | New tables for microCT scans, images, and registration |
| src/spyglass/histology/v1/histology.py | New tables for histology prep, stains, images, and registration |
| src/spyglass/electrode_localization/v1/micro_ct.py | Manual table for microCT-derived electrode channel locations |
| src/spyglass/electrode_localization/v1/histology.py | Manual table for histology-derived electrode channel locations |
| src/spyglass/electrode_localization/localization_merge.py | Merge table combining histology and microCT channel localizations |
| src/spyglass/common/common_region.py | Enhanced BrainRegion lookup and new BrainCoordinateSystem |
| src/spyglass/common/init.py | Export BrainCoordinateSystem alongside BrainRegion |
Comments suppressed due to low confidence (3)
src/spyglass/micro_ct/v1/micro_ct.py:53
- New computed classes like
MicroCTImageslack accompanying unit tests for themakemethod. Consider adding tests to validate image processing and NWB insertion logic.
@schema
src/spyglass/micro_ct/v1/micro_ct.py:5
- The import
CoordinateSystemdoes not exist inspyglass.common. It should be replaced withBrainCoordinateSystemor the correct class name from common.
CoordinateSystem,
src/spyglass/common/common_region.py:57
- The logger message strings are concatenated without a space before 'Remove any extra spaces...'. Consider adding a space or newline to avoid 'format.Remove' in the output.
"Please make sure to check the spelling and format."
|
Just an FYI - if you're adding electrode localization and would like to import this from nwb, you should consider taking a look at ndx-anatomical-localization, which aims to represent this kind of thing in NWB. |
Description
This PR adds new pipelines for histology, microCT scanning, and electrode localization by introducing several new DataJoint table definitions and a new merge table.
Checklist:
CITATION.cffaltersnippet for release notes.CHANGELOG.mdwith PR number and description.