@@ -0,0 +1,13 @@
version: 1.2
workflows:
  - name: main
    subclass: Galaxy
    publish: true
    primaryDescriptorPath: /short-reads-quality-control-and-trimming.ga
    testParameterFiles:
      - /short-reads-quality-control-and-trimming-tests.yml
    authors:
      - name: "B\xE9r\xE9nice Batut"
        orcid: 0000-0001-9852-1987
      - name: Paul Zierep
        orcid: 0000-0003-2982-388X
@@ -0,0 +1,5 @@
# Changelog

## [0.1] 2025-10-07

First release.
21 changes: 21 additions & 0 deletions workflows/read_preprocessing/short-reads-qc-trimming/README.md
@@ -0,0 +1,21 @@
# Short-reads quality control and trimming

Before starting any analysis, it is good practice to assess the quality of the input data and to discard poor-quality bases by trimming and filtering reads.

This workflow takes paired-end Illumina (**short-read**) FASTQ files, optionally gzipped, and executes the following steps:
1. Quality control and trimming using **fastp**
2. Aggregation of the quality control reports using **MultiQC**
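Outside Galaxy, the two steps above correspond roughly to one fastp call per sample followed by a single MultiQC call. The following is a minimal sketch, not the workflow's actual tool configuration: the `build_commands` helper, the output file names, the placeholder thresholds of 15, and the choice of `--cut_right` as the sliding-window mode are all assumptions made for illustration.

```python
from typing import List, Optional


def build_commands(
    sample: str,
    r1: str,
    r2: str,
    qualified_quality: int = 15,
    min_length: int = 15,
    cut_mean_quality: int = 15,
    adapter_r1: Optional[str] = None,
    adapter_r2: Optional[str] = None,
) -> List[List[str]]:
    """Return argv lists for one fastp run and one MultiQC run (hypothetical helper)."""
    fastp = [
        "fastp",
        "--in1", r1,
        "--in2", r2,
        "--out1", f"{sample}_trimmed_R1.fastq.gz",  # illustrative file name
        "--out2", f"{sample}_trimmed_R2.fastq.gz",  # illustrative file name
        "--qualified_quality_phred", str(qualified_quality),
        "--length_required", str(min_length),
        "--cut_right",  # assumption: the workflow's cutting mode is not stated here
        "--cut_mean_quality", str(cut_mean_quality),
        "--json", f"{sample}.fastp.json",
    ]
    # Adapters are optional; only pass them when provided.
    if adapter_r1:
        fastp += ["--adapter_sequence", adapter_r1]
    if adapter_r2:
        fastp += ["--adapter_sequence_r2", adapter_r2]

    # MultiQC scans the current directory for reports it recognises.
    multiqc = ["multiqc", ".", "--outdir", "multiqc_report"]
    return [fastp, multiqc]


if __name__ == "__main__":
    for argv in build_commands("pair", "paired_r1.fastq.gz", "paired_r2.fastq.gz"):
        print(" ".join(argv))
```

The helper only builds the argument lists; passing each one to `subprocess.run(argv, check=True)` would execute them if fastp and MultiQC are installed.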

## Input Datasets

- A list of paired datasets corresponding to paired-end raw reads in `fastqsanger` or `fastqsanger.gz` format.
- Qualified quality score: The minimum Phred quality value a base must have to count as qualified.
- Minimal read length: Reads shorter than this value will be discarded.
- Cutting mean quality: Bases in a sliding window whose mean quality falls below this value will be cut.
- [Optional] Adapter sequences to remove from forward reads and from reverse reads.
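To illustrate how the cutting mean quality and minimal read length parameters interact, here is a simplified sketch of sliding-window trimming on a single read. It is not fastp's implementation: the window size of 4, the 5'-to-3' scan direction, and the function names are assumptions made for the example.

```python
from typing import List, Optional


def trim_sliding_window(
    quals: List[int], cut_mean_quality: int = 15, window: int = 4
) -> int:
    """Return how many leading bases to keep.

    Windows are scanned from the 5' end. At the first window whose mean
    quality is below `cut_mean_quality`, the read is cut at the start of
    that window and everything after it is dropped.
    """
    last_start = max(len(quals) - window + 1, 1)
    for start in range(last_start):
        chunk = quals[start:start + window]
        if chunk and sum(chunk) / len(chunk) < cut_mean_quality:
            return start
    return len(quals)


def filter_read(
    seq: str,
    quals: List[int],
    min_length: int = 15,
    cut_mean_quality: int = 15,
    window: int = 4,
) -> Optional[str]:
    """Trim a read, then discard it (return None) if it became too short."""
    keep = trim_sliding_window(quals, cut_mean_quality, window)
    trimmed = seq[:keep]
    return trimmed if len(trimmed) >= min_length else None


if __name__ == "__main__":
    # 20 high-quality bases followed by 10 low-quality bases (made-up values).
    seq = "A" * 30
    quals = [30] * 20 + [2] * 10
    print(filter_read(seq, quals))
    print(filter_read(seq, quals, min_length=20))
```

Because a window is cut as soon as its mean drops below the threshold, a few good bases next to a low-quality stretch can be removed as well, which is why trimming is followed by the length filter.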

## Output Datasets

- A list of paired datasets corresponding to paired-end **trimmed** reads in `fastqsanger` or `fastqsanger.gz` format, ready for further analysis.
- A list of fastp `JSON` reports, one per sample, which can be used as input for additional MultiQC runs.
- A MultiQC report in HTML format.
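The per-sample JSON reports can also be inspected directly. The sketch below assumes the `summary.before_filtering` and `summary.after_filtering` sections with `total_reads` fields that fastp reports typically contain; the `summarize_fastp_report` helper is hypothetical and the example data is synthetic, not output of this workflow.

```python
import json
from typing import Dict


def summarize_fastp_report(report_text: str) -> Dict[str, float]:
    """Extract read counts before and after filtering from a fastp JSON report."""
    report = json.loads(report_text)
    before = report["summary"]["before_filtering"]["total_reads"]
    after = report["summary"]["after_filtering"]["total_reads"]
    return {
        "reads_before": before,
        "reads_after": after,
        # Guard against an empty input file.
        "fraction_reads_kept": after / before if before else 0.0,
    }


# Synthetic report fragment for demonstration only.
example = json.dumps(
    {
        "summary": {
            "before_filtering": {"total_reads": 1000, "total_bases": 150000},
            "after_filtering": {"total_reads": 900, "total_bases": 130000},
        }
    }
)

if __name__ == "__main__":
    print(summarize_fastp_report(example))
```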
@@ -0,0 +1,42 @@
- doc: Test outline for short-reads-quality-control-and-trimming
  job:
    Raw reads:
      class: Collection
      collection_type: list:paired
      elements:
      - class: Collection
        type: paired
        identifier: pair
        elements:
        - class: File
          identifier: forward
          location: https://zenodo.org/records/11484215/files/paired_r1.fastq.gz
          filetype: fastqsanger.gz
        - class: File
          identifier: reverse
          location: https://zenodo.org/records/11484215/files/paired_r2.fastq.gz
          filetype: fastqsanger.gz
    Adapter to remove on forward reads: null
    Adapter to remove on reverse reads: null
    Qualified quality score: '15'
    Minimal read length: '15'
    Cutting mean quality: '15'
  outputs:
    multiqc_html_report:
      # Assertions are written as a list so that repeated has_text checks
      # are not collapsed as duplicate mapping keys.
      asserts:
      - that: has_text
        text: "Filtered Reads"
      - that: has_text
        text: "pair"
    fastp_report_json:
      element_tests:
        pair:
          asserts:
          - that: has_text
            text: "paired end (301 cycles + 301 cycles)"
          - that: has_text
            text: "300000"
          - that: has_text
            text: "295680"
          - that: has_text
            text: "59230948"