Commit b50fa85

samuelbray32, CBroz1, and Copilot authored
Standardize nwb ingestion (LorenFrankLab#1377)
* initial version of standard import steps
* add new standard for ingesting configs
* debug and make SpyglassIngestion class
* transition CameraDevice for new mixin
* cleanup commented lines
* accomodate tables that insert into parts in mixin
* transition common lab ingestion to new standard
* adjust calls in Session population for new standard
* remove commented lines
* migrate table to new ingestion format
* move subject to new import
* add validation of secondary keys for duplicates
* update Probe tables to new ingestion
* for table objects, have defaut get entries run on each row
* move interval list to new ingestion
* Move table inserts out of Session.make and run in populate_all_common
* Migrate session to new ingestion scheme
* update method call in test_lab_member_insert_file_str
* restore imports for table definitions
* add lab institute and subject data to mock dio file
* spelling
* initial version of auto doc markdown table
* Auto-gen mapping doc
* Refactor SpyglassIngestion
* execute_inserts -> dry_run
* Handle inserts as dict for parts. Quiet ingest logging
* Minor edits
* Add load config
* Prevent prompt on equivalent entries. Always handle entries as dict
* Fix ingestion order
* Add DIOEvents
* Remove debug print
* Pluralize 'adjust_keys'. Add Raw
* Update changelog
* Fix tests
* Remove comment
* Apply suggestions from code review

  Co-authored-by: Copilot <[email protected]>
* Apply suggestions from code review 2
* IntervalList.insert -> cautious_insert
* Fix recursive call
* Fix restrict by UUID
* Standardize import - re-opened PR (LorenFrankLab#1423)
  * Auto-gen mapping doc
  * Refactor SpyglassIngestion
  * execute_inserts -> dry_run
  * Handle inserts as dict for parts. Quiet ingest logging
  * Minor edits
  * Add load config
  * Prevent prompt on equivalent entries. Always handle entries as dict
  * Fix ingestion order
  * Add DIOEvents
  * Remove debug print
  * Pluralize 'adjust_keys'. Add Raw
  * Update changelog
  * Fix tests
  * Remove comment
  * Apply suggestions from code review

    Co-authored-by: Copilot <[email protected]>
  * Apply suggestions from code review 2
  * IntervalList.insert -> cautious_insert
  * Fix recursive call
  * Fix restrict by UUID

  ---------

  Co-authored-by: Copilot <[email protected]>
* Revise based on code review
* `interval_name_from_tags` fallback
* WIP: fix boolean condition of index==0
* WIP: fix interval typing
* cleanup mixin merge
* rename to IngestionMixin and move to mixin folder
* fix import location
* update docs
* spelling fix
* change to SpyglassIngestion = SpyglassMixin + IngestionMixin
* update docs yaml

---------

Co-authored-by: CBroz1 <[email protected]>
Co-authored-by: Copilot <[email protected]>
Co-authored-by: Chris Broz <[email protected]>
1 parent 33f30f2 commit b50fa85

28 files changed, +1743 -1125 lines changed

CHANGELOG.md

Lines changed: 11 additions & 3 deletions
@@ -30,11 +30,11 @@ import all foreign key references.
 - Fix bug in TaskEpoch.make so that it correctly handles multi-row task tables
   from NWB #1433
 - Split `SpyglassMixin` into task-specific mixins #1435 #1451
-
-### Infrastructure
-
 - Auto-load within-Spyglass tables for graph operations #1368
 - Allow rechecking of recomputes #1380, #1413
+- Set default codecov threshold for test fail, disable patch check #1370, #1372
+- Simplify PR template #1370
+- Add `SpyglassIngestion` class to centralize functionality #1377, #1423
 
 ### Pipelines
 
@@ -43,6 +43,14 @@ import all foreign key references.
 - Common
     - Add tables for storing optogenetic experiment information #1312
     - Remove wildcard matching in `Nwbfile().get_abs_path` #1382
+    - Change `IntervalList.insert` to `cautious_insert` #1423
+    - Allow email send on space check success, clean up maintenance logging #1381
+    - Update pynwb pin to >=2.5.0 for `TimeSeries.get_timestamps` #1385
+    - Fix error from unlinked object in `AnalysisNwbfile.create` #1396
+    - Sort `UserEnvironment` dict objects by key for consistency #1380
+    - Fix typo in VideoFile.make #1427
+    - Fix bug in TaskEpoch.make so that it correctly handles multi-row task
+      tables from NWB #1433
     - Add custom/dynamic `AnalysisNwbfile` creation #1435
 - Decoding
     - Ensure results directory is created if it doesn't exist #1362

docs/mkdocs.yml

Lines changed: 3 additions & 0 deletions
@@ -79,6 +79,7 @@ nav:
 - Export: Features/Export.md
 - Centralized Code: Features/Mixin.md
 - Recompute: Features/Recompute.md
+- Ingestion: Features/Ingestion.md
 - For Developers:
 - Overview: ForDevelopers/index.md
 - How to Contribute: ForDevelopers/Contribute.md
@@ -89,6 +90,7 @@ nav:
 - Understanding a Schema: ForDevelopers/Schema.md
 - Custom Pipelines: ForDevelopers/CustomPipelines.md
 - Using NWB: ForDevelopers/UsingNWB.md
+- Ingestion Mapping: ForDevelopers/ingestion_mapping.md
 - API Reference: api/ # defer to gen-files + literate-nav
 - Change Log: CHANGELOG.md
 - Copyright: LICENSE.md
@@ -127,6 +129,7 @@ plugins:
 - gen-files:
 scripts:
 - ./src/api/make_pages.py
+- ./src/api/ingestion_mapping_generate.py
 - mkdocs-jupyter: # Comment this block during dev to reduce build time
 execute: False # Very slow, needs gh-action edit to work/link to db
 include_source: False

docs/src/Features/Ingestion.md

Lines changed: 60 additions & 0 deletions
@@ -0,0 +1,60 @@
+# Ingestion Process
+
+## Step 0: NWB files
+
+Before beginning with Spyglass, data must be compiled into the standardized NWB
+file format. NWB files contain everything about the experiment and form the starting
+point of all analyses. Numerous [online tutorials](https://nwb.org/converting-data-to-nwb/)
+exist to help you get started with this process, as do existing packages for
+lab-specific conversions ([1](https://github.com/catalystneuro),
+[2](https://github.com/LorenFrankLab/trodes_to_nwb)) that can be used as a reference.
+
+The following sections describe how data is extracted from these files and brought
+into the Spyglass system. For best compatibility, please use these as a reference when
+creating your NWB files.
+
+## What is ingestion?
+
+Ingestion is the process of extracting data from the raw NWB file and storing it
+in Spyglass tables.
+
+## How does it work?
+
+### For users
+
+Most users only need to call `spyglass.common.insert_sessions(nwb_file_name)`,
+which iterates through the tables populated from the raw NWB file and creates the
+appropriate entries.
+
+### In the background
+
+*Note: Migration to this format is in progress and is not yet implemented for all
+ingestion tables.*
+
+Tables that are populated from the raw NWB file inherit from the `SpyglassIngestion`
+class. Tables of this class must define the following properties, which enable finding
+relevant data in the NWB file and creating corresponding table entries.
+
+- `_source_nwb_object_type`: defines the `pynwb` object type containing data for
+  this table (e.g., `pynwb.misc.Units` for `ImportedSpikesorting`).
+- `table_key_to_obj_attr`: a dict of dicts mapping table keys to NWB object attributes
+  or to callables that generate the value to store from the NWB object.
+
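As a rough illustration of the `table_key_to_obj_attr` idea, the sketch below resolves a dict-of-dicts mapping against a stand-in object. It does not use Spyglass or `pynwb`: `resolve_entry`, `fake_nwbfile`, and the reading of the outer keys (`"self"` for the object itself, any other key for a child attribute such as `subject`) are assumptions made for this example, based on the `object_selector` column of the mapping table in this commit.

```python
from types import SimpleNamespace


def resolve_entry(nwb_obj, mapping):
    """Build one row from a {selector: {table_key: attr_or_callable}} mapping."""
    entry = {}
    for selector, key_map in mapping.items():
        # Assumption: "self" selects the object itself; any other selector
        # names a child attribute of the object (e.g. "subject").
        source = nwb_obj if selector == "self" else getattr(nwb_obj, selector)
        for table_key, target in key_map.items():
            # A callable computes the value; a string is read as an attribute name.
            if callable(target):
                entry[table_key] = target(source)
            else:
                entry[table_key] = getattr(source, target)
    return entry


# Stand-in for an NWB file object; real code would receive pynwb objects.
fake_nwbfile = SimpleNamespace(
    session_id="sess01",
    lab=" Example Lab ",
    subject=SimpleNamespace(subject_id="rat7"),
)

mapping = {
    "self": {
        "session_id": "session_id",
        "lab_name": lambda obj: obj.lab.strip(),  # callable target
    },
    "subject": {"subject_id": "subject_id"},
}

entry = resolve_entry(fake_nwbfile, mapping)
print(entry)
```

A string target is a plain attribute lookup, while a callable receives the selected object and returns the value to store, which is how a table could normalize a field during ingestion.
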
+With these defined, table entries are populated by the following methods:
+
+- `insert_from_nwbfile`: top-level function that extracts and inserts all entries
+  for the table.
+- `get_nwb_objects`: returns all NWB objects from the raw file containing data for
+  the table. By default, it returns all instances of `_source_nwb_object_type`, but it
+  can be overridden on a per-table basis for more selective restriction.
+- `generate_entries_from_nwb_object`: called for each identified NWB object. Generates
+  table entries using the `table_key_to_obj_attr` mapping.
+
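The way these three methods fit together can be sketched with a toy class. This is not the Spyglass implementation: `ToyIngestion`, its list-of-objects argument, and the `dry_run` behaviour shown here (build entries without storing them) are assumptions for illustration, and the real method signatures may differ.

```python
from types import SimpleNamespace


class ToyIngestion:
    """Hypothetical stand-in for an ingestion table; not the Spyglass class."""

    source_type = SimpleNamespace  # object type holding this table's data
    key_map = {"camera_name": "camera_name", "lens": "lens"}

    def __init__(self):
        self.rows = []  # stands in for the database table

    def get_nwb_objects(self, nwb_objects):
        # Default: keep every object of the declared source type.
        return [obj for obj in nwb_objects if isinstance(obj, self.source_type)]

    def generate_entries_from_nwb_object(self, nwb_obj):
        # One row per object, built from the key -> attribute mapping.
        return [{key: getattr(nwb_obj, attr) for key, attr in self.key_map.items()}]

    def insert_from_nwbfile(self, nwb_objects, dry_run=False):
        entries = []
        for obj in self.get_nwb_objects(nwb_objects):
            entries.extend(self.generate_entries_from_nwb_object(obj))
        if not dry_run:  # assumed meaning: a dry run only previews entries
            self.rows.extend(entries)
        return entries


class WideLensOnly(ToyIngestion):
    def get_nwb_objects(self, nwb_objects):
        # Per-table override: a more selective restriction than the default.
        return [o for o in super().get_nwb_objects(nwb_objects) if o.lens == "wide"]


file_objects = [
    SimpleNamespace(camera_name="cam1", lens="wide"),
    SimpleNamespace(camera_name="cam2", lens="macro"),
    "unrelated object",
]

table = ToyIngestion()
preview = table.insert_from_nwbfile(file_objects, dry_run=True)
inserted = table.insert_from_nwbfile(file_objects)
selective = WideLensOnly().insert_from_nwbfile(file_objects)
```

Only the subclass changes which objects are selected; entry generation and insertion are inherited, which is the division of labour the list above describes.
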
+## NWB to Spyglass table mappings
+
+*In progress*: To aid in creating Spyglass-compatible NWB files, we provide a
+[Reference Table](../ForDevelopers/ingestion_mapping.md) which maps Spyglass table
+entries to the source NWB objects and attributes.
+
+For entries not yet covered by the updated format, a complete list of these mappings
+can also be found [here](../ForDevelopers/UsingNWB.md).

docs/src/ForDevelopers/Classes.md

Lines changed: 51 additions & 0 deletions
@@ -122,6 +122,13 @@ These mixins form a dependency chain for NWB file fetching and export logging:
 - Handles copy-to-common for custom AnalysisNwbfile tables
 - See [Export Guide](../Features/Export.md)
 
+**IngestionMixin** (`mixins/ingestion.py`)
+
+- Defines a protocol for populating table entries from the raw NWB file
+- Provides `insert_from_nwbfile()`, which identifies relevant objects within the
+  NWB file and creates table entries
+- See [Ingestion Guide](../Features/Ingestion.md)
+
 ---
 
 ## Core Composite Classes
@@ -234,6 +241,50 @@ blocking each other.
 
 **See also:** [Custom Analysis Tables](./Management.md#custom-analysis-tables)
 
+### SpyglassIngestion
+
+**Location**: `src/spyglass/utils/dj_mixin.py`
+
+**Purpose**: Specialized mixin for generating table entries from raw NWB files
+
+**Inherits from**:
+
+- `SpyglassMixin` (all 5 base mixins)
+- `IngestionMixin` (ingestion operations)
+
+**Additional Functionality**:
+
+- Defines `insert_from_nwbfile()`, which identifies source data in the NWB file and
+  translates it into table entries. Depends on defining the following properties for
+  each class:
+    - `_source_nwb_object_type`: The `pynwb` type of the object(s) in the file containing
+      data for the given table.
+    - `_source_nwb_object_name`: Optional property that further limits ingestion
+      to NWB objects with this name attribute.
+    - `table_key_to_obj_attr`: A dictionary that maps each Spyglass
+      table column to the name of the NWB object attribute to be stored.
+      Optionally, a callable that generates the value to be stored from
+      the NWB object can be used instead of the attribute name.
+
+**Usage**:
+
+```python
+@schema
+class MyIngestionTable(SpyglassIngestion, dj.Manual):
+    definition = """
+    -> Session
+    ----
+    lfp_obj_id: varchar(32)
+    """
+    @property
+    def _source_nwb_object_type(self):
+        return pynwb.ecephys.LFP
+
+    @property
+    def table_key_to_obj_attr(self):
+        return {"self": {"lfp_obj_id": "object_id"}}
+```
+
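To make the optional `_source_nwb_object_name` restriction concrete, here is a small sketch of type-then-name filtering. It is an assumption about how such a restriction could behave, not the mixin's actual code: `FakeLFP` and `select_objects` are hypothetical names, and no `pynwb` objects are involved.

```python
from dataclasses import dataclass


@dataclass
class FakeLFP:
    """Hypothetical stand-in for an NWB object with a name and an object ID."""

    name: str
    object_id: str


def select_objects(objects, object_type, object_name=None):
    """Keep objects of the given type, optionally restricted to one name."""
    matches = [obj for obj in objects if isinstance(obj, object_type)]
    if object_name is not None:
        matches = [obj for obj in matches if obj.name == object_name]
    return matches


objects = [FakeLFP("LFP", "id-1"), FakeLFP("filtered_LFP", "id-2"), 42]

all_lfp = select_objects(objects, FakeLFP)  # type restriction only
named_lfp = select_objects(objects, FakeLFP, object_name="LFP")  # type and name

# Mirrors the {"lfp_obj_id": "object_id"} mapping from the usage block above.
entries = [{"lfp_obj_id": obj.object_id} for obj in named_lfp]
print(entries)
```
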
 ---
 
 ## Usage Patterns
Lines changed: 41 additions & 0 deletions
@@ -0,0 +1,41 @@
+# Spyglass Ingestion Mapping
+
+| module | class | source_nwb_object_type | object_selector | table_key | maps_to | is_callable |
+|---|---|---|---|---|---|---|
+| spyglass.common.common_device | CameraDevice | CameraDevice | self | camera_id | CameraDevice.get_camera_id | True |
+| spyglass.common.common_device | CameraDevice | CameraDevice | self | camera_name | camera_name | False |
+| spyglass.common.common_device | CameraDevice | CameraDevice | self | lens | lens | False |
+| spyglass.common.common_device | CameraDevice | CameraDevice | self | manufacturer | manufacturer | False |
+| spyglass.common.common_device | CameraDevice | CameraDevice | self | meters_per_pixel | meters_per_pixel | False |
+| spyglass.common.common_device | CameraDevice | CameraDevice | self | model | model | False |
+| spyglass.common.common_device | DataAcquisitionDevice | DataAcqDevice | self | adc_circuit | adc_circuit | False |
+| spyglass.common.common_device | DataAcquisitionDevice | DataAcqDevice | self | data_acquisition_device_amplifier | amplifier | False |
+| spyglass.common.common_device | DataAcquisitionDevice | DataAcqDevice | self | data_acquisition_device_name | name | False |
+| spyglass.common.common_device | DataAcquisitionDevice | DataAcqDevice | self | data_acquisition_device_system | system | False |
+| spyglass.common.common_device | DataAcquisitionDeviceAmplifier | DataAcqDevice | self | data_acquisition_device_amplifier | amplifier | False |
+| spyglass.common.common_device | DataAcquisitionDeviceSystem | DataAcqDevice | self | data_acquisition_device_system | system | False |
+| spyglass.common.common_device | Probe | Probe | self | contact_side_numbering | Probe.contact_side_numbering_as_string | True |
+| spyglass.common.common_device | Probe | Probe | self | probe_id | probe_type | False |
+| spyglass.common.common_device | Probe | Probe | self | probe_type | probe_type | False |
+| spyglass.common.common_device | ProbeType | Probe | self | manufacturer | manufacturer | False |
+| spyglass.common.common_device | ProbeType | Probe | self | num_shanks | ProbeType.get_num_shanks | True |
+| spyglass.common.common_device | ProbeType | Probe | self | probe_description | probe_description | False |
+| spyglass.common.common_device | ProbeType | Probe | self | probe_type | probe_type | False |
+| spyglass.common.common_interval | IntervalList | TimeIntervals | self | interval_list_name | IntervalList.interval_name_from_tags | True |
+| spyglass.common.common_interval | IntervalList | TimeIntervals | self | valid_times | IntervalList.interval_from_start_stop_time | True |
+| spyglass.common.common_lab | Institution | NWBFile | self | institution_name | institution | False |
+| spyglass.common.common_lab | Lab | NWBFile | self | lab_name | lab | False |
+| spyglass.common.common_session | Session | NWBFile | self | experiment_description | experiment_description | False |
+| spyglass.common.common_session | Session | NWBFile | self | institution_name | institution | False |
+| spyglass.common.common_session | Session | NWBFile | self | lab_name | lab | False |
+| spyglass.common.common_session | Session | NWBFile | self | session_description | session_description | False |
+| spyglass.common.common_session | Session | NWBFile | self | session_id | session_id | False |
+| spyglass.common.common_session | Session | NWBFile | self | session_start_time | session_start_time | False |
+| spyglass.common.common_session | Session | NWBFile | self | timestamps_reference_time | timestamps_reference_time | False |
+| spyglass.common.common_session | Session | NWBFile | subject | subject_id | subject_id | False |
+| spyglass.common.common_subject | Subject | Subject | self | age | age | False |
+| spyglass.common.common_subject | Subject | Subject | self | description | description | False |
+| spyglass.common.common_subject | Subject | Subject | self | genotype | genotype | False |
+| spyglass.common.common_subject | Subject | Subject | self | species | Subject.standardized_sex_string | True |
+| spyglass.common.common_subject | Subject | Subject | self | subject_id | subject_id | False |
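A table in this shape could be produced by flattening each class's dict-of-dicts mapping into rows. The sketch below is a guess at that idea, not the contents of `ingestion_mapping_generate.py`: `ToySubject`, `mapping_rows`, and `to_markdown` are hypothetical, and sorting rows by table key is an assumption drawn from the ordering the table above appears to follow.

```python
class ToySubject:
    """Hypothetical table class with one callable mapping target."""

    @staticmethod
    def standardized_sex_string(nwb_obj):
        return str(nwb_obj.sex).upper()[:1]


def mapping_rows(module, class_name, source_type, mapping):
    """Flatten {selector: {table_key: target}} into one tuple per table key."""
    rows = []
    for selector, key_map in mapping.items():
        for table_key in sorted(key_map):
            target = key_map[table_key]
            is_callable = callable(target)
            # Callables are reported by qualified name, attributes by their string.
            maps_to = target.__qualname__ if is_callable else target
            rows.append(
                (module, class_name, source_type, selector, table_key, maps_to, is_callable)
            )
    return rows


def to_markdown(rows):
    """Render the rows as a Markdown table with the column names used above."""
    header = (
        "module", "class", "source_nwb_object_type", "object_selector",
        "table_key", "maps_to", "is_callable",
    )
    lines = ["| " + " | ".join(header) + " |", "|" + "---|" * len(header)]
    lines += ["| " + " | ".join(str(cell) for cell in row) + " |" for row in rows]
    return "\n".join(lines)


rows = mapping_rows(
    "toy.module",
    "ToySubject",
    "Subject",
    {"self": {"subject_id": "subject_id", "sex": ToySubject.standardized_sex_string}},
)
table_md = to_markdown(rows)
print(table_md)
```
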