fastp container

Main tool: fastp

Code repository: https://github.com/OpenGene/fastp

Additional tools:

  • jq: 1.7

Basic information on how to use this tool:

  • executable: fastp
  • help: -?, --help
  • version: -v, --version
  • description: A tool designed to provide ultrafast all-in-one preprocessing and quality control for FastQ data.

Additional information:

This tool is meant for processing short-read FASTQ files, such as those produced by Illumina NovaSeq and MGI sequencers. It is not meant for use with long-read data (e.g. Nanopore, PacBio, Cyclone).

Input can be read from files, individually or in a batch, or from STDIN. Output can be written to a file or to STDOUT.

The number of worker threads is set with -w, --thread. The default is 3.
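As a rough sketch of streaming combined with an explicit thread count, the snippet below pipes a FASTQ file through fastp with 4 worker threads. It assumes the --stdin and --stdout flags listed in fastp --help. The file names in.fq and out.fq are placeholders. If fastp or the input file is not available, the snippet only prints the pipeline it would have run.

```shell
# Sketch: stream single-end reads through fastp with 4 worker threads.
# in.fq and out.fq are placeholder names, not files shipped with the container.
THREADS=4
FASTP_CMD="fastp --stdin --stdout -w ${THREADS}"

if command -v fastp >/dev/null 2>&1 && [ -f in.fq ]; then
    # Filtered reads go to STDOUT and are redirected into out.fq.
    cat in.fq | ${FASTP_CMD} > out.fq
else
    # fastp or the input file is missing: show what would be run instead.
    echo "would run: cat in.fq | ${FASTP_CMD} > out.fq"
fi
```

Raising the thread count beyond the number of available cores is unlikely to help, so the value of THREADS should be adjusted to the machine at hand.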

Shifu Chen. 2023. Ultrafast one-pass FASTQ data preprocessing, quality control, and deduplication using fastp. iMeta 2: e107. https://doi.org/10.1002/imt2.107

Shifu Chen, Yanqing Zhou, Yaru Chen, Jia Gu; fastp: an ultra-fast all-in-one FASTQ preprocessor, Bioinformatics, Volume 34, Issue 17, 1 September 2018, Pages i884–i890, https://doi.org/10.1093/bioinformatics/bty560

Full documentation: https://github.com/OpenGene/fastp

All options as presented in --help: https://github.com/opengene/fastp?tab=readme-ov-file#all-options

Example reports can be seen here:

Example Usage

Single End data (uncompressed)

fastp -i in.fq -o out.fq

Paired End Data

fastp -i SRR13957123_1.fastq.gz -I SRR13957123_2.fastq.gz -o SRR13957123_PE1.fastq.gz -O SRR13957123_PE2.fastq.gz -h SRR13957123_fastp.html -j SRR13957123_fastp.json
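Since the container also includes jq, the JSON report written with -j can be summarised on the command line. The sketch below runs jq against a small hand-written stand-in for a report. The read counts are invented for illustration, and the key paths (summary.before_filtering.total_reads and summary.after_filtering.total_reads) are assumed to match the layout of the fastp JSON report. In practice, REPORT would point at a real file such as SRR13957123_fastp.json.

```shell
# Stand-in for a fastp JSON report; the values are made up for illustration.
REPORT=$(mktemp)
cat > "$REPORT" <<'EOF'
{
  "summary": {
    "before_filtering": { "total_reads": 1000 },
    "after_filtering":  { "total_reads": 900 }
  }
}
EOF

# Extract the read counts before and after filtering.
READS_BEFORE=$(jq '.summary.before_filtering.total_reads' "$REPORT")
READS_AFTER=$(jq '.summary.after_filtering.total_reads' "$REPORT")

echo "reads before: ${READS_BEFORE}, after: ${READS_AFTER}"
rm -f "$REPORT"
```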

Batch Processing

python parallel.py -i /path/to/input/folder -o /path/to/output/folder -r /path/to/reports/folder -a '-f 3 -t 2'

This command will:

  • process all the FASTQ data in /path/to/input/folder
  • use the fastp executable found in PATH
  • pass the arguments -f 3 and -t 2, which trim 3 bp from the head and 2 bp from the tail of the reads
  • write all clean data to /path/to/output/folder
  • write all HTML and JSON reports to /path/to/reports/folder
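For comparison, the same batch idea can be sketched as a plain shell loop. This is a serial stand-in, not a description of how parallel.py works internally. It assumes single-end inputs named *.fastq.gz. The fastp_batch function, the .clean suffix, and the DRY_RUN switch are all hypothetical names introduced for this example. With DRY_RUN=1 (the default here) the commands are printed rather than executed.

```shell
# Serial sketch of batch processing; directory names are placeholders.
IN_DIR="${IN_DIR:-/path/to/input/folder}"
OUT_DIR="${OUT_DIR:-/path/to/output/folder}"
REPORT_DIR="${REPORT_DIR:-/path/to/reports/folder}"
DRY_RUN="${DRY_RUN:-1}"   # set to 0 to actually run fastp

fastp_batch() {
    for fq in "$IN_DIR"/*.fastq.gz; do
        [ -e "$fq" ] || continue              # skip when nothing matches
        name=$(basename "$fq" .fastq.gz)
        # Build the command: trim 3 bp at the head and 2 bp at the tail.
        set -- fastp -i "$fq" -o "$OUT_DIR/$name.clean.fastq.gz" \
            -h "$REPORT_DIR/$name.html" -j "$REPORT_DIR/$name.json" \
            -f 3 -t 2
        if [ "$DRY_RUN" = "1" ]; then
            echo "$@"                         # print the command only
        else
            "$@"                              # run fastp for this file
        fi
    done
}

fastp_batch
```

Printing the commands first makes it easy to check the derived output and report names before committing to a long run.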