The Python data pipeline defined with DataJoint for U19 projects
The data pipeline is mainly ingested and maintained through the MATLAB repository: https://github.com/shenshan/U19-pipeline-matlab
This repository mirrors the table definitions of the MATLAB pipeline.
- Install conda on your system: https://conda.io/projects/conda/en/latest/user-guide/install/index.html
- If running on Windows, install git
- (Optional, for ERDs) Install graphviz
- Open a new terminal
- Clone this repository:
git clone git@github.com:BrainCOGS/U19-pipeline_python.git
- If you cannot clone repositories with ssh, set up ssh keys
- Create a conda environment:
conda create -n u19_datajoint_env python==3.7
- Activate the environment (do this each time you use the project):
conda activate u19_datajoint_env
- Change directory to this repository:
cd U19-pipeline_python
- Install all required libraries:
pip install -e .
- DataJoint configuration:
jupyter notebook notebooks/00-datajoint-configuration.ipynb
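Once the steps above are done, a quick sanity check can confirm that the activated environment actually sees the installed packages. The following is a minimal sketch only: the helper name `check_packages` and the package list are assumptions made for illustration and are not part of this repository.

```python
import importlib.util
import sys


def check_packages(names):
    """Return {package_name: bool} telling whether each top-level package can be found."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


if __name__ == "__main__":
    # The package list is an assumption; adjust it to what the install step provides.
    print("Python", sys.version.split()[0])
    for name, found in check_packages(["datajoint"]).items():
        print(f"{name}: {'found' if found else 'MISSING'}")
```

If `datajoint` is reported as missing, the most likely cause is that the conda environment was not activated before running `pip install -e .`.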
We have created some tutorial notebooks to help you start working with DataJoint
- Querying data (Strongly recommended)
jupyter notebook "notebooks/tutorials/1-Explore U19 data pipeline with DataJoint.ipynb"
- Building an analysis pipeline (Recommended only if you are going to create new databases or tables for analysis)
jupyter notebook "notebooks/tutorials/2-Analyze data with U19 pipeline and save results.ipynb"
jupyter notebook "notebooks/tutorials/3-Build a simple data pipeline.ipynb"
Ephys element and imaging element require root paths for ephys and imaging data. Here are the notebooks showing how to set up the configurations properly.
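As a rough illustration of what such a configuration might look like, the sketch below builds the root-path entries as a plain dictionary and prints them as JSON. The key names `ephys_root_data_dir` and `imaging_root_data_dir` under `custom` follow a convention commonly used by DataJoint Elements, but they are an assumption here, and the paths are placeholders; the configuration notebooks remain the authoritative source.

```python
import json


def build_custom_config(ephys_root, imaging_root):
    """Build a config fragment holding the two root paths.

    The key names are assumptions modeled on DataJoint Elements conventions.
    """
    return {
        "custom": {
            "ephys_root_data_dir": str(ephys_root),
            "imaging_root_data_dir": str(imaging_root),
        }
    }


# Placeholder paths; replace them with the real data locations on your system.
config = build_custom_config("/path/to/ephys", "/path/to/imaging")
print(json.dumps(config, indent=2))
```

In a real session, the same values would presumably be assigned to `dj.config["custom"]` and persisted with `dj.config.save_local()`, rather than written by hand.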
Currently, the main schemas in the data pipeline are as follows:
- lab
- reference
- subject
- action
- acquisition
- task
- behavior
Behavior data for the Towers task.
- ephys_element
Ephys-related tables created with DataJoint Element Array Ephys, processing ephys data acquired with SpikeGLX and pre-processed by Kilosort2.
- imaging
Imaging pipeline processed with a customized algorithm for motion correction and CNMF for cell segmentation in MATLAB.
- scan_element and imaging_element
Scan and imaging tables created with DataJoint Element Calcium Imaging, processing imaging data acquired with ScanImage and pre-processed by Suite2p.
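The schemas listed above can be accessed from Python as virtual modules. The sketch below is one possible way to do that, under stated assumptions: the database prefix `u19_` and the helper names `database_name` and `load_virtual_modules` are hypothetical, and the actual database names on your server may differ. `datajoint` is imported lazily so that the naming logic can be inspected without a database connection.

```python
# Schema names taken from the list above.
SCHEMA_NAMES = [
    "lab", "reference", "subject", "action", "acquisition", "task",
    "behavior", "ephys_element", "imaging", "scan_element", "imaging_element",
]


def database_name(schema_name, prefix="u19_"):
    """Map a schema name to a database name (the 'u19_' prefix is an assumption)."""
    return prefix + schema_name


def load_virtual_modules(prefix="u19_"):
    """Create one virtual module per schema; requires a configured DataJoint connection."""
    import datajoint as dj  # imported here so the rest of the sketch runs without it

    return {
        name: dj.create_virtual_module(name, database_name(name, prefix))
        for name in SCHEMA_NAMES
    }


print(database_name("behavior"))
```

With a working connection, `modules = load_virtual_modules()` would give access to, for example, `modules["subject"]`.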
For all code below, I am assuming datajoint has been imported like:
import datajoint as dj
To update a single value in an existing table entry:
dj.Table._update(schema.Table & key, 'column_name', 'new_data')
To list the column names of a table:
table.heading.attributes.keys()
This also works on a query object:
schema = dj.create_virtual_module("some_schema","some_schema")
query_object = schema.Sample() & 'sample_name ="test"'
query_object.heading.attributes.keys()
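Because tables and query objects expose the same `heading.attributes` mapping, a small helper can return the column names as a plain list for either one. This is a minimal sketch: `column_names` is a hypothetical helper, and the stand-in object below (with a made-up `sample_value` column) only mimics the `heading.attributes` structure so that the example runs without a database.

```python
from types import SimpleNamespace


def column_names(table_or_query):
    """Return the attribute names of a DataJoint table or query as a list."""
    return list(table_or_query.heading.attributes.keys())


# Stand-in object, NOT a real DataJoint table: it only imitates heading.attributes.
fake_query = SimpleNamespace(
    heading=SimpleNamespace(attributes={"sample_name": None, "sample_value": None})
)
print(column_names(fake_query))
```

With a real connection, the same call would be `column_names(schema.Sample())` or `column_names(query_object)`.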







