Skip to content

Conversation

@edeno
Copy link
Collaborator

@edeno edeno commented Apr 24, 2025

Description

This PR adds new pipelines for histology, microCT scanning, and electrode localization by introducing several new DataJoint table definitions and a new merge table.

  • Introduces microCT and histology data models with scan/preparation metadata and image/table linkages.
  • Adds electrode localization tables for both microCT and histology, along with a merge table to collate location information.
  • Updates the common region definitions by adding an atlas name field.

Checklist:

  • This PR should be accompanied by a release: (yes/no/unsure)
  • If release, I have updated the CITATION.cff
  • This PR makes edits to table definitions: (yes/no)
  • If table edits, I have included an alter snippet for release notes.
  • If this PR makes changes to position, I ran the relevant tests locally.
  • I have updated the CHANGELOG.md with PR number and description.
  • I have added/edited docs/notebooks to reflect the changes

@edeno edeno requested a review from Copilot April 24, 2025 11:48
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR adds new pipelines for histology, microCT scanning, and electrode localization by introducing several new DataJoint table definitions and merge constructs.

  • Introduces microCT and histology data models with scan/preparation metadata and image/table linkages.
  • Adds electrode localization tables for both microCT and histology, along with a merge table to collate location information.
  • Updates the common region definitions by adding an atlas name field.

Reviewed Changes

Copilot reviewed 13 out of 13 changed files in this pull request and generated 1 comment.

Show a summary per file
File Description
src/spyglass/micro_ct/v1/micro_ct.py New microCT scan and image table definitions supporting CT metadata.
src/spyglass/histology/v1/histology.py New histology table definitions for tissue preparation and imaging.
src/spyglass/electrode_localization/v1/micro_ct.py Adds alignment and electrode localization tables for microCT data.
src/spyglass/electrode_localization/v1/histology.py Adds alignment and electrode localization tables for histology images.
src/spyglass/electrode_localization/v1/coordinate_system.py New lookup table for standard coordinate systems.
src/spyglass/electrode_localization/localization_merge.py Merge table for collating electrode localization information.
src/spyglass/common/common_region.py Updated region definitions with an added atlas name field.

color_to_stain = NULL: blob # Mapping of color channels to stains (e.g., {'DAPI': 'blue', 'GFAP': 'green'})
pixel_size_x: float # (um) Pixel size in X direction
pixel_size_y: float # (um) Pixel size in Y direction
pixel_size_z: float # (um) Pixel size in Z direction
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this make sense?

# Placeholder for actual implementation
# This function should create an AnalysisNwbfile entry and link it to the Histology entry
# It should also populate the color_to_stain and scale fields based on the image data
pass
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How should the analysisNWB file be made? We usually have this correspond to subject + date but this would be per subject.

---
-> CoordinateSystem # The TARGET coordinate system achieved by this registration (e.g., 'allen_ccf_v3_ras_um')
registration_method : varchar(128) # algorithmic approach, e.g. 'affine+bspline'
registration_software : varchar(128) # e.g. 'ANTs', 'elastix', 'SimpleITK'
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if there is more than one software?

registration_software : varchar(128) # e.g. 'ANTs', 'elastix', 'SimpleITK'
registration_software_version : varchar(64) # e.g. '2.3.5', '1.3.0'
transformation_matrix = NULL: blob # Store affine matrix if computed/applicable (e.g., 4x4 np.array.tobytes())
warp_field_path = NULL: varchar(512) # Store path to warp field file if non-linear
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are these parameters necessary? Should there be a separate params table?

scan_date=NULL: date # Date of the scan itself
scanner_details: varchar(255) # e.g., 'Nikon Metrology X-Tek HMX ST 225 @ HCNS'
source_target_type = 'Molybdenum': varchar(64) # X-ray source target material
filter_material = 'None': varchar(64) # Filter material used (e.g., None, Copper)
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are all these necessary? They were reported in the microCT paper

registration_method : varchar(128) # algorithmic approach, e.g. 'affine+bspline'
registration_software : varchar(128) # e.g. 'ANTs', 'elastix', 'SimpleITK'
registration_software_version : varchar(64) # e.g. '2.3.5', '1.3.0'
registration_params = NULL: blob # Store parameters dict/json
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

again, should this be a parameter table?

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR introduces new DataJoint table definitions to support the processing pipelines for microCT, histology, and electrode localization, along with updating common region and coordinate system definitions.

  • Added microCT pipelines with scan, image, and registration tables.
  • Added histology pipelines with preparation, image, and registration tables.
  • Introduced electrode localization tables for both microCT and histology data with a merge table and updated the common region schema.

Reviewed Changes

Copilot reviewed 13 out of 13 changed files in this pull request and generated no comments.

Show a summary per file
File Description
src/spyglass/micro_ct/v1/micro_ct.py New microCT scan, image, and registration table definitions
src/spyglass/histology/v1/histology.py New histology preparation, image, and registration definitions
src/spyglass/electrode_localization/v1/micro_ct.py Electrode localization using microCT data (naming inconsistency noted)
src/spyglass/electrode_localization/v1/histology.py Electrode localization using histology data (naming inconsistency noted)
src/spyglass/electrode_localization/localization_merge.py Merge table unifying electrode localization entries
src/spyglass/common/common_region.py Updated BrainRegion with atlas_name; added CoordinateSystem lookup
src/spyglass/common/init.py Updated imports to include CoordinateSystem
Comments suppressed due to low confidence (2)

src/spyglass/electrode_localization/v1/micro_ct.py:9

  • The imported name 'MicroCTImage' does not match the defined table 'MicroCTImages' in micro_ct.py. Consider renaming either the import or the table for consistency.
    MicroCTImage,

src/spyglass/electrode_localization/v1/histology.py:9

  • The imported name 'HistologyImage' does not match the defined table 'HistologyImages' in histology.py. Ensure consistent naming between the import and the table definition.
    HistologyImage,

@edeno edeno requested a review from samuelbray32 April 24, 2025 13:33
@edeno edeno requested review from CBroz1 and sharon-chiang April 24, 2025 13:55
@edeno edeno requested a review from Copilot April 24, 2025 13:56

This comment was marked as outdated.

@edeno edeno requested a review from xlsun79 April 24, 2025 14:08
pixel_size_z: float # (um) Pixel size in Z dimension (often slice_thickness or scan step)
objective_magnification: float # Magnification of the objective lens (e.g., 20 for 20x)
image_modality: enum( # Modality of the microscopy
"fluorescence",
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this list exhaustive? Would we want to make this a varchar for flexibility?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above: if unsure, then convert

---
identified_feature: varchar(128) # Biological target/marker (e.g., 'Nissl Bodies', 'ChR2-tdTomato+', 'ProbeTrack_DiI')
visualization_agent: varchar(128)# Method/molecule making feature visible (e.g., 'Cresyl Violet', 'Native tdTomato', 'DiI', 'Alexa 488')
stain_type: enum( # Type of staining method used
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this list exhaustive? Would we want to make this a varchar for flexibility?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you're not sure, then yes, I'd convert

subregion_name=subregion_name,
subsubregion_name=subsubregion_name,
)
query = BrainRegion & key
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider removing this function or only allowing brain regions from a specific atlas.

subregion_name=NULL: varchar(200) # subregion name
subsubregion_name=NULL: varchar(200) # subregion within subregion
region_name: varchar(200) # Name of the region (e.g., 'Hippocampal formation')
region_abbr=NULL: varchar(64) # Standard abbreviation (e.g., 'HPF')
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is having a abbreviation helpful? We already only use the region name.

coordinate_system_id: varchar(64) # Primary key (e.g., 'Allen_CCFv3_RAS_um')
---
description: varchar(255) # Description of the coordinate system
orientation: enum( # Anatomical orientation convention
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this too flexible?

@edeno
Copy link
Collaborator Author

edeno commented Apr 24, 2025

Other systems have ProbeTrack and ElectrodePosition to track insertions and the electrode position separate determining the brain region. Should we add these?

scanner_details: varchar(255) # e.g., 'Nikon Metrology X-Tek HMX ST 225 @ HCNS'
source_target_type='Molybdenum': varchar(64) # X-ray source target material
filter_material='None': varchar(64) # Filter material used (e.g., 'None', 'Copper')
filter_thickness_mm=0.0: float # (mm) Filter thickness
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this overly detailed?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It becomes unwieldy if you can't see the full table in a notebook. Part tables can handle subsets of detail. if scan/reconstruction details are likely to be the same across many entries, that's worth being a params-like table:

class ScannerSettings(dj.Manual):
    definition = """
    scanner_setting_name: varchar(64) # eg NikonXTek_default
    ---
    filter_x : type
    voltage: type
    ...
    """

This might also be true of preparation protocol for details like stain/resin/etc

I think I'd like to see this table look more like a selection table:

@schema
class MicroCTScan(SpyglassMixin, dj.Manual):
    definition = """
    -> Subject
    scan_id : int auto_increment
    ---
    scan_date=CURRENT_DATE: date
    -> ScannerSettings
    -> StainProtocol
    ... # notes, files, etc
    """

@edeno edeno marked this pull request as ready for review April 24, 2025 15:09


@schema
class HistologyImages(SpyglassMixin, dj.Computed):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If there are multiple HistologyImage for one Histology entry, would it make more sense to make this a part table of Histology?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@CBroz1 do you have an opinion on this?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes? It would be helpful for me to talk through a use case, but yes, moving to a part table seems right

What happens to these images downstream? Is some analysis just going to look at the brain region or electrode coordinates and ignore image-specific info? If so, if no downstream table cares about specific images, then we might be better off just including image-specific info in blobs/nwbs that we're unlikely to fetch - for the sake of streamlining primary keys and cutting down on the table count

The current approach seems low table count but high information. I could either see us increasing the table count to better accommodate the same info, or reducing the info for the v1 version of this pipeline and conceding that the info can be dumped in less-accessible places for the sake of pipeline streamlining

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently these images aren't used for electrode localization, but they could be in the future. I guess it depends on how much we want to worry about this in v1.

# --- Data Source ---
output_format='TIFF stack': varchar(64) # Format of raw image data from scanner
raw_scan_path: varchar(512) # Path to the raw output (e.g., folder containing TIFF stack)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are user's intended to access the images from this path or the AnalysisNwbfile? If the latter, we might not need to store it in the table, but rather pass it to an insertion method



@schema
class MicroCTImages(SpyglassMixin, dj.Computed):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similar questions as above of whether this should be a part table

Copy link
Member

@CBroz1 CBroz1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These tables are super detailed. Cutting down on the number of tables in a pipeline make for a more linear process to the end-user, but it means a lot of redundant SKs in tables with many fields. I make the recommendation to separate out to params/protocol/part tables for the sake of cutting down the number of fields in any one table

subregion_abbr=NULL: varchar(64) # Subregion abbreviation (e.g., 'CA1')
subsubregion_name=NULL: varchar(200) # Sub-subregion name (e.g., 'stratum pyramidale')
subsubregion_abbr=NULL: varchar(64) # Sub-subregion abbreviation (e.g., 'sp')
atlas_source=NULL: varchar(128) # Source atlas (e.g., 'Allen CCF v3', 'Paxinos Rat 6th Ed')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I would make this a nullable foreign key reference to an imported table that handles fetching atlases

"Please make sure to check the spelling and format."
"Remove any extra spaces or special characters."
)
cls.insert1(key)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In a meeting we discussed limiting this operation to admin

pos_x: float # (um) coordinate in the specified space
pos_y: float # (um) coordinate in the specified space
pos_z: float # (um) coordinate in the specified space
-> BrainRegion # Assigned brain region
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nullable? Is there ever a case where this is not known, or the value retrieved doesn't make sense in this case - either a grey area or individual differences for the animal vs averaged atlas?

Is this an imported table? Is BrainRegion calculated based on the positions found?

pos_x: float # (um) coordinate in the specified space
pos_y: float # (um) coordinate in the specified space
pos_z: float # (um) coordinate in the specified space
-> BrainRegion # Assigned brain region
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same questions as histology table above

Comment on lines +16 to +17
definition = """
# Represents a single histology preparation for a subject
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
definition = """
# Represents a single histology preparation for a subject
definition = # A single histology preparation for a subject

definition = """
# Metadata for a microCT scan of a subject's brain/tissue
-> Subject
scan_id: varchar(32) # User-defined ID (e.g., 'SubjX_OsO4_Scan1')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wouldn't suggest including subj id if that's in the pk

Suggested change
scan_id: varchar(32) # User-defined ID (e.g., 'SubjX_OsO4_Scan1')
scan_id: varchar(32) # User-defined ID (e.g., 'OsO4_Scan1')

output_format='TIFF stack': varchar(64) # Format of raw reconstructed data
# --- Data Location & Notes ---
raw_scan_path: varchar(512) # Path to raw output (e.g., folder containing TIFF stack)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR includes a few cases of varchar instead of filepath. We seem to only use the filepath attr for nwb files. It adds some complexity to the config, but saves us a lot of cleanup time with various cleanup functions that look for orphaned files. Moving forward, I think I'd advocate for varchars only in cases where we expect the file to change and don't want to update the checksum

definition = """
# Links MicroCTScan info to the Analysis NWB file containing the image volume
-> MicroCTScan
images_id: varchar(32) # User-defined ID for these images (e.g., scan_id)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is a varchar pk preferable to the anlysis file being the pk? I think the former has issues with clashing user-defined strings, and the latter is more common across spyglass tables

definition = """
# Stores results/params of aligning microCT image data to a target coordinate system
-> MicroCTImages # Link to the source microCT NWB file info
registration_id: varchar(32) # Unique ID for this registration instance/parameters
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As above, I think I might swap out for int here

---
-> CoordinateSystem # The TARGET coordinate system (e.g., 'allen_ccf_v3_ras_um')
# --- Registration Parameters ---
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As above, maybe protocol/param tables - it might be the case that a generalized registration table could be a params table for both schemas

@codecov
Copy link

codecov bot commented Jul 7, 2025

Codecov Report

❌ Patch coverage is 9.09091% with 60 lines in your changes missing coverage. Please review.
✅ Project coverage is 69.34%. Comparing base (476a648) to head (5fc4e06).
⚠️ Report is 4 commits behind head on master.

Files with missing lines Patch % Lines
src/spyglass/histology/v1/histology.py 0.00% 17 Missing ⚠️
src/spyglass/micro_ct/v1/micro_ct.py 0.00% 15 Missing ⚠️
...glass/electrode_localization/localization_merge.py 0.00% 12 Missing ⚠️
...rc/spyglass/electrode_localization/v1/histology.py 0.00% 8 Missing ⚠️
src/spyglass/electrode_localization/v1/micro_ct.py 0.00% 8 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #1294      +/-   ##
==========================================
+ Coverage   69.27%   69.34%   +0.06%     
==========================================
  Files         103      108       +5     
  Lines       12336    12597     +261     
==========================================
+ Hits         8546     8735     +189     
- Misses       3790     3862      +72     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@edeno edeno requested a review from Copilot July 7, 2025 11:29
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

Adds new DataJoint pipelines for microCT, histology, and electrode localization, plus updates to common region definitions.

  • Introduce microCT scan, image, and registration tables (microct_v1 schema).
  • Add histology preparation, stain, image, and registration tables (histology_v1 schema).
  • Implement electrode localization tables for both modalities and a merge table to collate their outputs.
  • Extend common_region with coordinate system lookup and atlas/source fields.

Reviewed Changes

Copilot reviewed 8 out of 13 changed files in this pull request and generated 1 comment.

Show a summary per file
File Description
src/spyglass/micro_ct/v1/micro_ct.py New tables for microCT scans, images, and registration
src/spyglass/histology/v1/histology.py New tables for histology prep, stains, images, and registration
src/spyglass/electrode_localization/v1/micro_ct.py Manual table for microCT-derived electrode channel locations
src/spyglass/electrode_localization/v1/histology.py Manual table for histology-derived electrode channel locations
src/spyglass/electrode_localization/localization_merge.py Merge table combining histology and microCT channel localizations
src/spyglass/common/common_region.py Enhanced BrainRegion lookup and new BrainCoordinateSystem
src/spyglass/common/init.py Export BrainCoordinateSystem alongside BrainRegion
Comments suppressed due to low confidence (3)

src/spyglass/micro_ct/v1/micro_ct.py:53

  • New computed classes like MicroCTImages lack accompanying unit tests for the make method. Consider adding tests to validate image processing and NWB insertion logic.
@schema

src/spyglass/micro_ct/v1/micro_ct.py:5

  • The import CoordinateSystem does not exist in spyglass.common. It should be replaced with BrainCoordinateSystem or the correct class name from common.
    CoordinateSystem,

src/spyglass/common/common_region.py:57

  • The logger message strings are concatenated without a space before 'Remove any extra spaces...'. Consider adding a space or newline to avoid 'format.Remove' in the output.
                "Please make sure to check the spelling and format."

@edeno edeno changed the title Add Histology, MicroCT, and Electrode Localization Pipelines WIP: Add Histology, MicroCT, and Electrode Localization Pipelines Aug 8, 2025
@edeno edeno marked this pull request as draft August 8, 2025 15:16
@pauladkisson
Copy link
Contributor

Just an FYI - if you're adding electrode localization and would like to import this from nwb, you should consider taking a look at ndx-anatomical-localization, which aims to represent this kind of thing in NWB.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants