Analysis of anthelmintic treatment dataset

Code accompanying the manuscript "Microbiome responses to anthelmintic treatment depend on pre-treatment helminth infection status in young Ethiopian children".

Abstract: Mass deworming programs using effective anthelmintic drugs are essential for controlling soil-transmitted helminth (STH) infections, particularly in high-risk developing regions. However, it remains unclear whether routine deworming induces long-term alterations in gut microbiome composition, especially when accounting for individual infection status. This study aims to explore the changes in the gut microbiomes of Ethiopian children one year after the administration of an anthelmintic treatment.

Contributors

Polina Tikhonova, MSc

Overview

project
|- README         # repository description
|- LICENSE        # the license for this project
|
|- data           # project data
| |- database/    # SILVA database directory
| |- reads/       # a directory for fastq files
| | |- raw/       # a directory for the raw reads
| | +- filtered/  # (generated by the pipeline) a directory containing filtered reads
| |- QCcontrol    # (generated by the pipeline) QC files
| |- dada2        # (generated by the pipeline) dada2 files
| |- phyloseq     # (generated by the pipeline; provided)
| |               # resulting phyloseq object files in .rds and .csv formats
| +- metadata.tsv # a metadata file, provided
|
|- code/          # project code
| |- analysis/    # analysis, generates results/ files
| +- processing/  # raw data processing, generates data/processed/ files
|
|- results        # output files, generated by the code in the analysis folder
| |- figures/     # manuscript figures
| +- tables/      # beta-diversity tables

Code descriptions

Please note that the code in this repository was tested on a Linux x86_64 system.

Data Processing

Dependencies

snakemake (v7.32.1)
conda install -c bioconda -c conda-forge snakemake==7.32.1

Instructions

Please note that running the processing pipeline is optional since the final phyloseq objects necessary for the analysis are provided in this repository.

Install dependencies (~10 min). In case of version incompatibility errors, please set the conda channel_priority to flexible.
The raw data is publicly available at the European Nucleotide Archive (ENA) under accession number PRJEB93790. Please download the fastq.gz files to the data/reads/raw directory, following the proposed directory structure: sample_name/sample_name_R*.fastq.gz. The metadata file should be saved as data/metadata.tsv (tab-separated format).
Download the SILVA database (version 138.2) to the data/database directory. Please make sure to download both files: silva_nr99_v138.2_toGenus_trainset.fa.gz and silva_v138.2_assignSpecies.fa.gz.
(optional) In case of custom paths to the raw files and database, please modify the corresponding parameters in the code/processing/config.yaml file.
Run Snakemodule

mkdir code/processing/logs
cd code/processing/logs
conda activate snakemake
snakemake --snakefile ../Snakemodule --use-conda --conda-frontend conda
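
Before launching the pipeline, it can help to confirm that the inputs sit where the instructions above expect them. The following is a minimal sketch, not part of the repository: the function name check_inputs is hypothetical, and it assumes paired-end data, so that sample_name_R*.fastq.gz expands to an R1 and an R2 file per sample.

```shell
# Pre-flight check for the input layout described in the instructions.
# check_inputs DATA_DIR prints one line per missing input and returns
# non-zero if anything is missing. (Hypothetical helper, not in the repo.)
check_inputs() {
    local data_dir="$1" status=0 f sample_dir sample mate

    # Both SILVA files named in the instructions must be present.
    for f in silva_nr99_v138.2_toGenus_trainset.fa.gz silva_v138.2_assignSpecies.fa.gz; do
        [ -f "$data_dir/database/$f" ] || { echo "missing: database/$f"; status=1; }
    done

    # The metadata table is expected at data/metadata.tsv.
    [ -f "$data_dir/metadata.tsv" ] || { echo "missing: metadata.tsv"; status=1; }

    # Assumption: paired-end reads, one R1 and one R2 file per sample directory.
    for sample_dir in "$data_dir"/reads/raw/*/; do
        [ -d "$sample_dir" ] || continue
        sample="$(basename "$sample_dir")"
        for mate in R1 R2; do
            [ -f "$data_dir/reads/raw/$sample/${sample}_${mate}.fastq.gz" ] || {
                echo "missing: reads/raw/$sample/${sample}_${mate}.fastq.gz"
                status=1
            }
        done
    done
    return "$status"
}
```

From the project root, the check would be run as check_inputs data. If custom paths are set in code/processing/config.yaml, the directory names in the sketch need to be adjusted accordingly.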

Suggested parameters for running the Snakemodule using multiple sbatch jobs: --cluster "sbatch --time=47:00:00 --nodes=1 --ntasks=20 --mem=200GB" --jobs 10.
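
For reference, the suggested cluster parameters can be combined with the run command above as sketched below. The wrapper function run_pipeline and the variable SBATCH_CMD are hypothetical names introduced for illustration; all flags are the ones listed in this README, and the sbatch resources should be adapted to the local cluster.

```shell
# Cluster submission settings, copied from the suggested parameters.
SBATCH_CMD="sbatch --time=47:00:00 --nodes=1 --ntasks=20 --mem=200GB"

# run_pipeline [EXTRA_ARGS...]
# Intended to be called from code/processing/logs with the snakemake
# environment active. Extra arguments are passed through to snakemake.
run_pipeline() {
    snakemake --snakefile ../Snakemodule --use-conda --conda-frontend conda \
        --cluster "$SBATCH_CMD" --jobs 10 "$@"
}
```

Calling run_pipeline -n first performs a Snakemake dry run, which lists the planned jobs without submitting anything; run_pipeline without arguments then starts the actual submission.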

Pipeline files

code/processing
|
|- Snakemodule
|- conda_env.*.yaml                      # conda environment libraries
|- config.yaml                           # Snakemodule paths and settings
+- params.dada2.filter_and_trim.yaml     # dada2 filter_and_trim settings

Snakemodule Steps

FastQC/MultiQC quality assessment
- 1.quality_control.smk
Trimming and filtering
- 2.filtering_trimming.smk
FastQC/MultiQC quality reassessment
- 1.quality_control.smk
ASV identification using DADA2 pipeline (official tutorial)
- 3.dada2.smk
ASV taxonomy annotation
- 3.dada2.smk
Generation of the phyloseq object.
- 3.dada2.smk
- outputs:
  - unfiltered phyloseq (all ASVs)
  - filtered phyloseq (only microbial ASVs)
  - filtered phyloseq objects agglomerated at genus and family levels

Analysis

Dependencies

Create a new conda environment and install its dependencies (~30 min)

conda create -n anthelminthic_treatment
conda env update -n anthelminthic_treatment --file code/analysis/conda_env.R.yaml

Install additional R libraries:

conda activate anthelminthic_treatment
R
devtools::install_github(repo = "malucalle/selbal", ref="9f7ff2b")
devtools::install_github(repo = "gauravsk/ranacapa", ref="58c0cab")
devtools::install_github(repo = "gmteunisse/fantaxtic", ref="b822d7f")

Files

All analysis code is implemented in R and stored in the code/analysis directory. Each file is provided in two formats: .ipynb and .Rmd.

Dataset overview
- Figure 1. Data Overview. (Chi-squared test of demographic characteristics, sample counts).
- Figure 3. Microbial relative abundance barplots.
Microbial diversity
- Figure 2. Microbial diversity. (baseline vs follow-up groups).
- Figure 2. Microbial diversity. Baseline and follow-up groups stratified by baseline STH Status. (baseline vs follow-up groups).
- Supplementary Figure 2. Microbial diversity. Accounting for STH status.
Dimensionality reduction
- Figure 2. PCoA based on Bray-Curtis distance. Baseline and follow-up groups.
Genera-vanishment analysis
- Figure 3. ASV vanishment analysis
Beta-diversity
- Figure 4. ALDEx2 analysis. Wilcoxon test.
- Supplementary Figure 4. ALDEx2 analysis. Wilcoxon with covariates.
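
Since each analysis file is meant to exist as both an .ipynb and an .Rmd, a quick consistency check after editing can catch a file that was updated or added in only one format. The sketch below is illustrative only: unpaired_files is a hypothetical helper, and the directory it is pointed at (code/analysis in the Overview tree) is an assumption.

```shell
# unpaired_files DIR
# Prints the name of every .Rmd in DIR without a matching .ipynb, and of
# every .ipynb without a matching .Rmd. Prints nothing if all are paired.
# (Hypothetical helper, not part of the repository.)
unpaired_files() {
    local dir="$1" f base
    for f in "$dir"/*.Rmd "$dir"/*.ipynb; do
        [ -f "$f" ] || continue   # skip unmatched glob patterns
        case "$f" in
            *.Rmd)
                base="${f%.Rmd}"
                [ -f "$base.ipynb" ] || basename "$f"
                ;;
            *.ipynb)
                base="${f%.ipynb}"
                [ -f "$base.Rmd" ] || basename "$f"
                ;;
        esac
    done
}
```

From the project root this would be called as unpaired_files code/analysis. File names containing spaces, such as the figure names listed above, are handled because every expansion is quoted.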
