
U19-pipeline_python

The Python data pipeline defined with DataJoint for U19 projects

Data is mainly ingested and the pipeline maintained through the MATLAB repository: https://github.com/shenshan/U19-pipeline-matlab

This repository mirrors the table definitions of the MATLAB pipeline.

Installation

Prerequisites (for recommended conda installation)

  1. Install conda on your system: https://conda.io/projects/conda/en/latest/user-guide/install/index.html
  2. If you are running Windows, install git
  3. (Optional, for ERDs) Install Graphviz

Installation with conda

  1. Open a new terminal
  2. Clone this repository: git clone git@github.com:BrainCOGS/U19-pipeline_python.git
    • If you cannot clone repositories over SSH, set up SSH keys first
  3. Create a conda environment: conda create -n u19_datajoint_env python==3.7
  4. Activate the environment: conda activate u19_datajoint_env (activate it each time you use the project)
  5. Change directory to this repository: cd U19-pipeline_python
  6. Install all required libraries: pip install -e .
  7. Configure DataJoint: jupyter notebook notebooks/00-datajoint-configuration.ipynb
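
Once the steps above are done, a quick check from inside the activated environment can confirm that the install worked. The sketch below is not part of this repository; `check_environment` is a hypothetical helper that uses only the standard library to report whether each required package can be found.

```python
import importlib.util
import sys


def check_environment(required=("datajoint",)):
    """Map each required package name to True if it can be imported here."""
    return {
        name: importlib.util.find_spec(name) is not None
        for name in required
    }


# Report the interpreter version and whether each package is visible.
print("Python", ".".join(str(part) for part in sys.version_info[:3]))
for name, found in check_environment().items():
    print(f"{name}: {'found' if found else 'MISSING'}")
```

If `datajoint` is reported as missing, the most likely cause is that the conda environment was not activated before running pip install -e .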

Tutorials

We have created some tutorial notebooks to help you start working with DataJoint.

  1. Querying data (strongly recommended)
    • jupyter notebook "notebooks/tutorials/1-Explore U19 data pipeline with DataJoint.ipynb"
  2. Building an analysis pipeline (recommended only if you are going to create new databases or tables for analysis)
    • jupyter notebook "notebooks/tutorials/2-Analyze data with U19 pipeline and save results.ipynb"
    • jupyter notebook "notebooks/tutorials/3-Build a simple data pipeline.ipynb"

Ephys element and imaging element require root paths for ephys and imaging data. Here are the notebooks showing how to set up the configurations properly.
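
As a rough illustration only: DataJoint reads local settings from a `dj_local_conf.json` file, and DataJoint Elements workflows commonly keep data root paths under a `custom` entry. The key names `ephys_root_data_dir` and `imaging_root_data_dir`, and the helper `write_root_paths`, are assumptions that have not been confirmed for this repository. The configuration notebooks remain the authoritative reference.

```python
import json
from pathlib import Path


def write_root_paths(ephys_root, imaging_root, path="dj_local_conf.json"):
    """Merge root-path settings into a DataJoint-style JSON config file.

    Existing settings in the file are kept; only the two (assumed) keys
    under "custom" are added or overwritten.
    """
    path = Path(path)
    config = json.loads(path.read_text()) if path.exists() else {}
    custom = config.setdefault("custom", {})
    custom["ephys_root_data_dir"] = str(ephys_root)      # assumed key name
    custom["imaging_root_data_dir"] = str(imaging_root)  # assumed key name
    path.write_text(json.dumps(config, indent=4))
    return config
```

Writing to a JSON file rather than setting values in a live session means the paths persist between sessions, provided DataJoint is started from the directory that holds the file.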

Major schemas

Currently, the main schemas in the data pipeline are as follows:

  • lab

Lab Diagram

  • reference

Reference Diagram

  • subject

Subject Diagram

  • action

Action Diagram

  • acquisition

Acquisition Diagram

  • task

Task Diagram

  • behavior

Behavior data for the Towers task.

Behavior Diagram

  • ephys_element

Ephys-related tables were created with DataJoint Element Array Ephys, processing ephys data acquired with SpikeGLX and pre-processed by Kilosort2.

Ephys Diagram

  • imaging

Imaging pipeline processed in MATLAB with a customized algorithm for motion correction and CNMF for cell segmentation.

Imaging Diagram

  • scan_element and imaging_element

Scan and imaging tables were created with DataJoint Element Calcium Imaging, processing imaging data acquired with ScanImage and pre-processed by Suite2p.

Scan element and imaging element Diagram

Undocumented datajoint features

For all code below, I am assuming datajoint has been imported like:

import datajoint as dj

Update a table entry

dj.Table._update(schema.Table & key, 'column_name', 'new_data')
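
`_update` is a private method, and in DataJoint 0.13 and later the public `update1` method serves the same purpose. Here is a minimal sketch, assuming such a version and a `key` dict that holds the full primary key. `update_attribute` is a hypothetical helper, not part of DataJoint or this repository:

```python
def update_attribute(table, key, column, value):
    """Set `column` to `value` on the single row of `table` matched by `key`.

    `table` is expected to behave like an unrestricted DataJoint table:
    restrictable with `&`, sized with `len`, and offering `update1`.
    """
    matches = table & key
    if len(matches) != 1:
        raise ValueError(
            f"expected exactly one row for {key!r}, found {len(matches)}"
        )
    # update1 takes one dict: the primary key plus the attributes to change.
    table.update1({**key, column: value})
```

With a real table this would be called as update_attribute(schema.Table(), key, 'column_name', 'new_data'). Checking the row count first gives a clearer error when the key matches nothing or more than one row.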

Get list of all column names in a table (without having to issue a query or fetch)

table.heading.attributes.keys()

This also works on a query object:

schema = dj.create_virtual_module("some_schema", "some_schema")
query_object = schema.Sample() & 'sample_name = "test"'
query_object.heading.attributes.keys()
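
A heading also exposes `primary_key`, which makes it easy to separate key columns from the rest. The helper below (`column_summary`, a hypothetical name) is a small sketch that should work on any table or query object with a `heading`, again without fetching data:

```python
def column_summary(table_or_query):
    """Split the column names of a table or query into primary and secondary."""
    heading = table_or_query.heading
    names = list(heading.attributes.keys())
    primary = [name for name in names if name in heading.primary_key]
    secondary = [name for name in names if name not in heading.primary_key]
    return {"primary": primary, "secondary": secondary}
```

For example, column_summary(query_object) returns a dict with two lists of column names, in the order they appear in the heading.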
