ezpore

Authors: Ids Willemsen, Robbert van Himbeeck (shared first-author)

⚠️ Disclaimer: The 18S nematode database is provided as-is, without warranties. Use of ezpore constitutes agreement with the Terms of Use.

We currently recommend using Vsearch over emu. For more information please check this thread: treangenlab/emu#74

About

ezpore is a single-command pipeline for processing bacterial (full-length 16S), fungal (full-length ITS) or nematode (full-length 18S) reads obtained by Nanopore sequencing.

ezpore can perform the following steps:

demultiplexing (dorado, barcoding kit EXP-NBD196)
filtering on length and quality (NanoFilt)
primer trimming (cutadapt)
ITS region extraction (ITSxpress, for fungal ITS)
read clustering (vsearch)
read classification (emu/vsearch)

Installation & prerequisites

ezpore is developed for Linux and will likely also work on other Unix-like operating systems (e.g. macOS).

Native use on Windows is not supported; however, Windows Subsystem for Linux (WSL) can be used (see section "WSL installation instructions").

Be aware that running on Windows takes considerably longer than running on a Linux machine!

WSL installation instructions

Follow these steps to install Windows Subsystem for Linux (WSL) on a Windows 10 or 11 machine.

Open PowerShell as Administrator: Press Windows + X and select Windows PowerShell (Admin) or Windows Terminal (Admin).
Install WSL: Run the following command in the PowerShell window:

wsl --install

Prerequisite installation instructions

To use ezpore, conda and snakemake need to be installed on your system. To install conda, first perform the following steps:

wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh

bash Miniconda3-latest-Linux-x86_64.sh

Navigate through the interactive installer: choose yes at "Do you wish the installer to initialize Miniconda3 by running conda init? [yes|no]".

Open a new terminal; (base) will now appear at the beginning of every line.
Install snakemake with the following command:

conda install -c conda-forge -c bioconda snakemake
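Before continuing, it can help to confirm that both tools are actually reachable from a fresh terminal. The function below is a minimal sketch; the name require_tools is hypothetical and not part of ezpore. It only checks that each named command can be found by the shell.

```shell
# Report any required command-line tools that the shell cannot find.
# require_tools is a hypothetical helper, not part of ezpore.
# Returns 0 if every tool is found, 1 otherwise.
require_tools() {
    missing=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing: $tool" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Example (run in a new terminal after installation):
# require_tools conda snakemake && snakemake --version
```

If a tool is reported as missing right after installation, opening a new terminal so that the conda initialization takes effect is a sensible first thing to try.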

Downloading and running ezpore

To 'install' ezpore:

Clone the repository into your local directory using git clone https://github.com/ids-willemsen/ezpore.git, or download the ezpore.zip file from the GitHub repository and extract it to the directory of your choice. The ezpore.zip contains all files necessary for your run.
Copy your sequencing file (fastq) to the folder that contains the extracted ezpore.zip contents (or the cloned ezpore folder) and cd to that folder.
For already demultiplexed data: in the folder containing the extracted ezpore.zip contents, create a folder called 'demux' and copy your demultiplexed files into it.
Edit settingsfile.yaml to match your preferred run settings; the arguments are explained below.
Edit barcode_files.txt so that it contains only the barcodes you want to analyze. If this file is empty or not present, the ezpore pipeline will use all files.
Before running ezpore, your folder should contain the following files:
1. snakefile.smk
2. settingsfile.yaml
3. barcode_files.txt
4. ezpore_conda.yaml
5. A non-demultiplexed fastq file or a folder called 'demux' containing demultiplexed files!
Finally, run the ezpore pipeline with the command below. Consider changing the classifier slots and cores based on your available RAM/threads:

snakemake --snakefile snakefile.smk --use-conda --cores all --resources classifier_slots=6
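The file checklist above can be verified before starting a run. The function below is a sketch under stated assumptions: the name check_ezpore_inputs is hypothetical, it mirrors the list of five items literally (so it also expects barcode_files.txt), and it assumes a non-demultiplexed input file ends in .fastq.

```shell
# Check that a directory holds the files listed above before a run.
# check_ezpore_inputs is a hypothetical helper, not part of ezpore.
# Prints each missing item to stderr; returns 0 if complete, 1 otherwise.
check_ezpore_inputs() {
    dir="${1:-.}"
    status=0
    for f in snakefile.smk settingsfile.yaml barcode_files.txt ezpore_conda.yaml; do
        if [ ! -f "$dir/$f" ]; then
            echo "missing file: $dir/$f" >&2
            status=1
        fi
    done
    # Either a 'demux' folder or at least one .fastq file must be present
    # (assumption: the non-demultiplexed input has a .fastq extension).
    if [ ! -d "$dir/demux" ] && ! ls "$dir"/*.fastq >/dev/null 2>&1; then
        echo "missing input: no 'demux' folder and no .fastq file in $dir" >&2
        status=1
    fi
    return "$status"
}

# Example: only start the pipeline when the folder is complete.
# check_ezpore_inputs . && snakemake --snakefile snakefile.smk --use-conda --cores all --resources classifier_slots=6
```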

The settingsfile

The settingsfile.yaml contains all arguments that can be used by ezpore:

argument	description	input type	default value
keep_steps	whether to keep the output of intermediate steps; set to False to delete it and avoid directory bloat	True/False	True
demultiplex	demultiplexes the data using dorado	True/False	True
min	the minimum read length (in bp). Shorter reads are removed	integer	100
max	the maximum read length (in bp). Longer reads are removed	integer	10000
quality	the minimum average read quality to be retained. Reads with a lower Q score are removed	integer	15
trim_primers	removes primers using cutadapt	True/False	False
primer_error_rate	the maximum allowed error rate for primer trimming	unit interval [0-1]	0.2
clustering	clusters sequences using vsearch	True/False	False
cluster_perc	the identity threshold (as a fraction) to cluster on using vsearch	unit interval [0-1]	0.97
threads	the number of threads used throughout the pipeline	integer	24
group	the group of organisms: bacteria (16S_bac), nematodes (18S_nem) or fungi (ITS_fun)	string	none
barcode_file	the path to your barcode_files.txt	string	none
input_file	the input file (.fastq) of the analysis; if demultiplexing is not performed, leave this empty as `""`	.fastq file	none
forward_primer	the sequence of your forward primer	string	none
reverse_primer	the sequence of your reverse primer	string	none
classifier	the classifier you would like to use for taxonomic identification	emu/vsearch	none
min_abundance	the minimum relative abundance of an organism to be retained by emu	unit interval [0-1]	0.0001
rank	the taxonomic rank which emu uses to combine the output of all files	species, genus, etc.	species
vsearch_id	the minimum identity (as a fraction) that vsearch will include in your output	unit interval [0-1]	0.97
custom_database	use a custom database	True/False	False
custom_database_path	the path to your custom database	string	none

Using a custom database

ezpore can automatically download the 16S SILVA database for bacteria, the UNITE ITS database for fungi, and our in-house 18S nematode database, for both 'emu' and 'vsearch' classification. If you prefer to use your own database, set the custom_database argument to 'True' and add the database path to custom_database_path. In this case, the database files should be in the correct format as used by vsearch/emu. If you choose to use emu, the custom_database_path should point to a directory containing taxonomy.tsv and species_taxid.fasta. If you choose to use vsearch, the custom_database_path should point to a .fasta file in vsearch database format. Note that the group should still be set in the settingsfile (e.g. set it to ITS_fun if you want to use ITS extraction, or to 16S_bac/18S_nem if you would like to trim primers).
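The layout rules for a custom database can be checked before a run. The function below is a minimal sketch; the name check_custom_db is hypothetical and not part of ezpore. It only tests that the expected files exist and does not validate their contents.

```shell
# Check a custom database path against the layout described above.
# check_custom_db is a hypothetical helper, not part of ezpore.
# Usage: check_custom_db <emu|vsearch> <custom_database_path>
# Returns 0 if the layout matches, 1 if not, 2 for an unknown classifier.
check_custom_db() {
    classifier="$1"
    db_path="$2"
    case "$classifier" in
        emu)
            # emu: a directory containing taxonomy.tsv and species_taxid.fasta
            [ -f "$db_path/taxonomy.tsv" ] && [ -f "$db_path/species_taxid.fasta" ]
            ;;
        vsearch)
            # vsearch: a single existing .fasta file
            case "$db_path" in
                *.fasta) [ -f "$db_path" ] ;;
                *) return 1 ;;
            esac
            ;;
        *)
            echo "unknown classifier: $classifier" >&2
            return 2
            ;;
    esac
}

# Example (paths are placeholders):
# check_custom_db emu /path/to/my_emu_db || echo "database layout looks incomplete"
```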

Output

Emu

When using emu, the results folder contains taxonomic identification tables for each barcode, and two combined OTU tables containing either relative abundance or total counts. Emu combines clustered OTUs when they are identified to the same taxonomic group. If you want to keep each OTU separate, we recommend using vsearch.

Vsearch

When using vsearch, the results folder contains a combined OTU table with taxonomy, giving the total reads per barcode and the taxonomic group of each OTU. If you wish to check the sequence of an OTU, the sequences can be found in vsearch_input/otus_renamed.fasta.
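To look up a single OTU sequence in that file, a small helper along these lines may be convenient. This is a sketch under assumptions: the name get_otu_sequence and the ID OTU_1 are hypothetical, and it assumes the OTU identifier is the first whitespace-separated word on the FASTA header line. Check the headers in your own otus_renamed.fasta first.

```shell
# Print the FASTA record whose ID (first word after '>') equals the given ID.
# get_otu_sequence is a hypothetical helper, not part of ezpore.
# Usage: get_otu_sequence <otu_id> <fasta_file>
get_otu_sequence() {
    awk -v id="$1" '
        /^>/ { keep = (substr($1, 2) == id) }  # new record: does its ID match?
        keep { print }                          # print header and sequence lines
    ' "$2"
}

# Example (OTU_1 is a placeholder ID):
# get_otu_sequence OTU_1 vsearch_input/otus_renamed.fasta
```

The function prints nothing when no header matches, so an empty result usually means the ID was typed differently from how it appears in the file.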

Bugs and requests

If you encounter any bugs or you wish to request additional features, please open an issue on this GitHub page.

Acknowledgements

We want to thank the creators of Decona (https://github.com/Saskia-Oosterbroek/decona) for their advice in the early stages of creating the ezpore pipeline.
