sciRNAseq3_process_pipeline

Pipeline steps

[Outside of pipeline] Demultiplex with bcl2fastq. After demultiplexing, there will be a pair of FASTQ files for each unique i5/i7 index (typically 96).
Extract the cell barcode and UMI from read 1 and add them to the read 2 FASTQ. Validate barcodes against the list of RT and ligation barcodes used. The default setting allows 1 mismatch (the variable is set to 1, but I don't think mismatch tolerance is actually implemented). Output in UMI_attach folder.
Trim poly(A) sequence from FASTQ files using Trim Galore. Output in trimmed_fastq folder.
Align with STAR. Output in STAR_alignment.
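Filter SAM files, hardcoded to retain mapped reads with mapping quality >= 30 (-q 30 -F 4). -F 4 drops unmapped reads; multimappers are removed by the -q 30 threshold, since STAR by default assigns them a MAPQ of 3 or lower. Output in filtered_sam.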
Remove duplicates: reads with the exact same UMI, barcode and genomic position. Output in rmdup_sam.
Repeat duplicate removal, this time also collapsing reads with up to 1 mismatch in the barcode+UMI sequence. Output in rmdup_sam_2.
Split the deduplicated SAM file into one file per valid barcode, keeping only barcodes that pass a given read-count cut-off. Output in sam_splitted.
Generate gene/exon counts using HTSeq. Output in report/human_mouse_gene_count.
[Outside of pipeline] Combine exonic and intronic reads and produce a count matrix across all cell barcodes and genes.
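
The barcode validation step above mentions a 1-mismatch allowance that may not be implemented. The following is a minimal Python sketch of what such a check could look like, not the code in script_folder. The function names and the example barcodes are hypothetical; the only assumption taken from the steps above is that an observed barcode is accepted when it matches a known RT or ligation barcode with at most one mismatch.

```python
from typing import Iterable, Optional


def hamming(a: str, b: str) -> int:
    """Count mismatching positions between two equal-length strings."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def correct_barcode(observed: str, whitelist: Iterable[str],
                    max_mismatch: int = 1) -> Optional[str]:
    """Return the whitelist barcode matching `observed`, or None if rejected."""
    candidates = [bc for bc in whitelist
                  if len(bc) == len(observed)
                  and hamming(bc, observed) <= max_mismatch]
    # An exact match always wins; otherwise accept only an unambiguous near match.
    if observed in candidates:
        return observed
    return candidates[0] if len(candidates) == 1 else None


# Hypothetical RT barcodes, for illustration only.
rt_barcodes = ["ACGTACGTAC", "TTGCAAGGCT", "GGATCCATGA"]
print(correct_barcode("ACGTACGTAC", rt_barcodes))  # exact match
print(correct_barcode("ACGTACGTAA", rt_barcodes))  # one mismatch, corrected
print(correct_barcode("AAAAAAAAAA", rt_barcodes))  # no match, rejected
```

Rejecting ambiguous near matches (two known barcodes within one mismatch) avoids assigning a read to the wrong cell.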
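
To make the SAM filtering step concrete, here is a small Python sketch of the test that `-q 30 -F 4` applies to each record. It is an illustration only; the pipeline itself uses the hardcoded flags, and the function name and example records are made up. Header lines are passed through here for convenience.

```python
def keep_alignment(sam_line: str, min_mapq: int = 30) -> bool:
    """Apply the -q 30 -F 4 criteria to a single SAM line."""
    if sam_line.startswith("@"):
        return True  # header line, kept as-is in this sketch
    fields = sam_line.rstrip("\n").split("\t")
    flag, mapq = int(fields[1]), int(fields[4])  # SAM columns 2 (FLAG) and 5 (MAPQ)
    is_unmapped = bool(flag & 0x4)  # -F 4 drops records with the "unmapped" bit set
    return (not is_unmapped) and mapq >= min_mapq  # -q 30 drops MAPQ below 30


# Made-up records: a unique hit, a multimapper (low MAPQ) and an unmapped read.
unique = "r1\t0\tchr1\t100\t255\t10M\t*\t0\t0\tACGTACGTAC\tIIIIIIIIII"
multi = "r2\t0\tchr1\t200\t3\t10M\t*\t0\t0\tACGTACGTAC\tIIIIIIIIII"
unmapped = "r3\t4\t*\t0\t0\t*\t*\t0\t0\tACGTACGTAC\tIIIIIIIIII"

for record in (unique, multi, unmapped):
    print(record.split("\t")[0], keep_alignment(record))
```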
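
The two duplicate-removal passes can be sketched as follows. This is a simplified Python illustration under stated assumptions, not the scripts in script_folder: the `Read` record is hypothetical, "genomic position" is reduced to chromosome plus start coordinate, and the second pass uses a simple greedy rule that keeps the first read seen at each position.

```python
from collections import defaultdict
from typing import List, NamedTuple


class Read(NamedTuple):
    barcode: str
    umi: str
    chrom: str
    pos: int


def within_one_mismatch(a: str, b: str) -> bool:
    """True if a and b have equal length and differ at no more than one position."""
    return len(a) == len(b) and sum(x != y for x, y in zip(a, b)) <= 1


def dedup_exact(reads: List[Read]) -> List[Read]:
    """Pass 1: keep the first read for each (barcode, UMI, chrom, pos)."""
    seen = set()
    kept = []
    for read in reads:
        if read not in seen:
            seen.add(read)
            kept.append(read)
    return kept


def dedup_fuzzy(reads: List[Read]) -> List[Read]:
    """Pass 2: per position, drop reads whose barcode+UMI is within 1 mismatch of a kept read."""
    kept_tags = defaultdict(list)  # (chrom, pos) -> barcode+UMI strings already kept
    kept = []
    for read in reads:
        tag = read.barcode + read.umi
        site = (read.chrom, read.pos)
        if any(within_one_mismatch(tag, other) for other in kept_tags[site]):
            continue
        kept_tags[site].append(tag)
        kept.append(read)
    return kept


reads = [
    Read("ACGTACGTAC", "AAAACCCC", "chr1", 100),
    Read("ACGTACGTAC", "AAAACCCC", "chr1", 100),  # exact duplicate
    Read("ACGTACGTAC", "AAAACCCG", "chr1", 100),  # one mismatch in the UMI
    Read("ACGTACGTAC", "AAAACCCC", "chr1", 250),  # same tag, different position
    Read("TTGCAAGGCT", "GGGGTTTT", "chr1", 100),  # different cell
]
pass1 = dedup_exact(reads)
pass2 = dedup_fuzzy(pass1)
print(len(reads), len(pass1), len(pass2))  # 5 4 3
```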

Original Readme

sci3_primer_sequences_plate.xls: primer sequences for reverse transcription, ligation, and PCR.

sci3_main.sh: main processing script for sci-RNA-seq3.

script_folder: folder for sub-scripts called by sci3_main.sh.

gene_count_processing_sciRNAseq.R: R script for processing the gene count data. The function sciRNAseq_gene_count_summary accepts the gene count folder and returns a cell annotation data frame, a gene annotation data frame and a gene count sparse matrix. The output can be used as input to commonly used single-cell RNA-seq analysis packages.
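
For readers unfamiliar with this kind of output, the sketch below shows in Python how per-cell gene counts can be assembled into cell labels, gene labels and sparse matrix entries. It is not a port of the R function: the input format of (gene, cell, count) records, the function name and the example values are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple


def build_count_matrix(records: List[Tuple[str, str, int]]):
    """Turn (gene_id, cell_id, count) records into labels plus sparse entries."""
    gene_index: Dict[str, int] = {}
    cell_index: Dict[str, int] = {}
    entries: Dict[Tuple[int, int], int] = {}  # (gene row, cell column) -> count
    for gene, cell, count in records:
        row = gene_index.setdefault(gene, len(gene_index))
        col = cell_index.setdefault(cell, len(cell_index))
        # Sum repeated (gene, cell) records, e.g. exonic plus intronic counts.
        entries[(row, col)] = entries.get((row, col), 0) + count
    genes = list(gene_index)  # row labels, in first-seen order
    cells = list(cell_index)  # column labels, in first-seen order
    return cells, genes, entries


# Made-up records for two cells and two genes.
records = [
    ("GeneA", "cell_1", 2),
    ("GeneB", "cell_1", 1),
    ("GeneA", "cell_2", 5),
    ("GeneA", "cell_1", 3),  # second record for the same gene and cell
]
cells, genes, entries = build_count_matrix(records)
print(cells, genes, entries)
```

Only non-zero entries are stored, which is what makes a sparse representation practical when most genes are not detected in most cells.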

celsiustx/sci-RNA-seq3_pipeline
