Commit 5909724

Add QC & trimming workflow
1 parent 79c62f9 commit 5909724

5 files changed: +452 -0 lines changed
Lines changed: 13 additions & 0 deletions
@@ -0,0 +1,13 @@
+version: 1.2
+workflows:
+- name: main
+  subclass: Galaxy
+  publish: true
+  primaryDescriptorPath: /raw-reads-quality-control-and-trimming.ga
+  testParameterFiles:
+  - /raw-reads-quality-control-and-trimming-tests.yml
+  authors:
+  - name: "B\xE9r\xE9nice Batut"
+    orcid: 0000-0001-9852-1987
+  - name: Paul Zierep
+    orcid: 0000-0003-2982-388X
Lines changed: 5 additions & 0 deletions
@@ -0,0 +1,5 @@
+# Changelog
+
+## [0.1] 2025-10-07
+
+First release.
Lines changed: 21 additions & 0 deletions
@@ -0,0 +1,21 @@
+# Raw reads quality control and trimming
+
+Before starting any analysis, it is good practice to assess the quality of the input data and to discard poor-quality bases by trimming and filtering reads.
+
+This workflow takes paired-end Illumina fastq(.gz) files and executes the following steps:
+1. Quality control and trimming using **fastp**
+2. Aggregation of the quality control reports using **MultiQC**
+
+## Input Datasets
+
+- A list of paired datasets corresponding to paired-end raw reads in `fastqsanger` or `fastqsanger.gz` format.
+- Qualified quality score: The minimum quality value a base must have to count as qualified.
+- Minimal read length: Reads shorter than this value will be discarded.
+- Cutting mean quality: The bases in the sliding window with mean quality below this value will be cut.
+- [Optional] Adapter sequences to remove from forward reads and reverse reads.
+
+## Output Datasets
+
+- A list of paired datasets corresponding to paired-end **trimmed** reads in `fastqsanger` or `fastqsanger.gz` format, ready for further analysis.
+- A list of fastp `JSON` reports, one per sample, that can be used as inputs for additional MultiQC runs.
+- A MultiQC report in HTML.
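
Note on the three numeric inputs described in the README above: the following is a minimal Python sketch of how such thresholds could act on a single read. It is not fastp's implementation. The function names, the window size of 4, the Phred+33 offset, and the choice to scan from the front of the read and cut everything from the first failing window onward are all assumptions made for illustration; the workflow itself may configure fastp differently.

```python
from statistics import mean
from typing import List, Optional, Tuple


def phred_scores(quality_line: str, offset: int = 33) -> List[int]:
    """Decode a Sanger (Phred+33) quality string into integer scores."""
    return [ord(ch) - offset for ch in quality_line]


def unqualified_percent(scores: List[int], qualified_quality: int = 15) -> float:
    """Percentage of bases whose quality is below the qualified quality score."""
    if not scores:
        return 0.0
    low = sum(1 for q in scores if q < qualified_quality)
    return 100.0 * low / len(scores)


def cut_position(scores: List[int], window: int = 4, cut_mean_quality: int = 15) -> int:
    """Slide a window from the front; return how many leading bases to keep.

    The read is cut at the start of the first window whose mean quality
    falls below cut_mean_quality (assumed cutting direction).
    """
    for start in range(0, len(scores) - window + 1):
        if mean(scores[start:start + window]) < cut_mean_quality:
            return start
    return len(scores)


def trim_read(seq: str, qual: str, window: int = 4, cut_mean_quality: int = 15,
              min_length: int = 15) -> Optional[Tuple[str, str]]:
    """Trim one read; return None if it ends up shorter than min_length."""
    keep = cut_position(phred_scores(qual), window, cut_mean_quality)
    if keep < min_length:
        return None  # discarded by the minimal read length filter
    return seq[:keep], qual[:keep]
```

With the test values used in this commit (all three thresholds set to 15), a read whose quality collapses after 20 good bases would be shortened but kept, while one that collapses after 10 good bases would be discarded for falling under the minimal read length.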
Lines changed: 42 additions & 0 deletions
@@ -0,0 +1,42 @@
+- doc: Test outline for Raw-reads-quality-control-and-trimming
+  job:
+    Raw reads:
+      class: Collection
+      collection_type: list:paired
+      elements:
+      - class: Collection
+        type: paired
+        identifier: pair
+        elements:
+        - class: File
+          identifier: forward
+          location: https://zenodo.org/records/11484215/files/paired_r1.fastq.gz
+          filetype: fastqsanger.gz
+        - class: File
+          identifier: reverse
+          location: https://zenodo.org/records/11484215/files/paired_r2.fastq.gz
+          filetype: fastqsanger.gz
+    Adapter to remove on forward reads: null
+    Adapter to remove on reverse reads: null
+    Qualified quality score: '15'
+    Minimal read length: '15'
+    Cutting mean quality: '15'
+  outputs:
+    multiqc_html_report:
+      asserts:
+      - that: has_text
+        text: "Filtered Reads"
+      - that: has_text
+        text: "pair"
+    fastp_report_json:
+      element_tests:
+        pair:
+          asserts:
+          - that: has_text
+            text: "paired end (301 cycles + 301 cycles)"
+          - that: has_text
+            text: "300000"
+          - that: has_text
+            text: "295680"
+          - that: has_text
+            text: "59230948"
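
Note on the assertions above: each `has_text` check can be read as an independent substring test against the output file. The Python sketch below illustrates that reading only. It is not Galaxy's or Planemo's implementation; `has_text`, `run_asserts`, and the sample report string are hypothetical, and the sample is a made-up fragment that merely reuses strings from the test file, not real fastp output.

```python
from typing import Dict, List


def has_text(output: str, text: str) -> bool:
    """True if the expected text occurs anywhere in the output."""
    return text in output


def run_asserts(output: str, asserts: List[Dict[str, str]]) -> List[str]:
    """Run every assertion and return the texts that were not found."""
    failures = []
    for assertion in asserts:
        if assertion.get("that") != "has_text":
            raise ValueError(f"unsupported assertion: {assertion!r}")
        if not has_text(output, assertion["text"]):
            failures.append(assertion["text"])
    return failures


# Hypothetical report fragment, for illustration only.
sample_report = (
    '{"sequencing": "paired end (301 cycles + 301 cycles)", '
    '"total_reads": 300000}'
)

checks = [
    {"that": "has_text", "text": "paired end (301 cycles + 301 cycles)"},
    {"that": "has_text", "text": "300000"},
    {"that": "has_text", "text": "295680"},
]

# Lists the expected texts missing from the sample fragment.
missing = run_asserts(sample_report, checks)
```

Keeping the checks in a list means every expected string is evaluated; a mapping keyed by assertion name could hold only one `has_text` entry.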