PollyTikhonova/Taye_AnthelminticTreatment_2025

Analysis of anthelmintic treatment dataset

License CC0 v.1 DOI

Code accompanying the manuscript "Microbiome responses to anthelmintic treatment depend on pre-treatment helminth infection status in young Ethiopian children".

Abstract: Mass deworming programs using effective anthelmintic drugs are essential for controlling soil-transmitted helminth (STH) infections, particularly in high-risk developing regions. However, it remains unclear whether routine deworming induces long-term alterations in gut microbiome composition, especially when accounting for individual infection status. This study aims to explore the changes in the gut microbiomes of Ethiopian children one year after the administration of an anthelmintic treatment.

Contributors

  • Polina Tikhonova, MSc

Overview

project
|- README         # repository description
|- LICENSE        # the license for this project
|
|- data           # project data
| |- database/    # SILVA database directory
| |- reads/       # a directory for fastq files
| | |- raw/       # a directory for the raw reads
| | +- filtered/  # (generated by the pipeline); a directory containing filtered reads
| |- QCcontrol    # (generated by the pipeline); QC files
| |- dada2        # (generated by the pipeline); dada2 files
| |- phyloseq     # (generated by the pipeline; provided);
| |               # resulting phyloseq object files in .rds and .csv formats
| +- metadata.tsv # a metadata file, provided
|
|- code/          # project code
| |- analysis/    # analysis, generates results/ files
| +- processing/  # raw data processing, generates data/processed/ files
|
|- results        # output files generated by the code in the analysis folder
| |- figures/     # manuscript figures
| +- tables/      # beta-diversity tables
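
As a minimal sketch of the layout above, the function below creates only the input directories that the user is expected to populate (data/database and data/reads/raw). The function name `make_input_dirs` and its optional project-root argument are hypothetical and not part of the repository. Directories marked "(generated by the pipeline)" are deliberately left out.

```shell
#!/usr/bin/env bash
# Create only the input directories that the user populates by hand.
# Directories generated by the pipeline are intentionally not created here.
make_input_dirs() {
    # $1: project root (hypothetical argument; defaults to the current directory)
    local root="${1:-.}"
    mkdir -p "$root/data/database" "$root/data/reads/raw"
}
```

Calling `make_input_dirs .` from the repository root should leave existing directories untouched, since `mkdir -p` does not fail when a directory is already present.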

Code descriptions

Please note that the code in this repository was tested on a Linux x86_64 system.

Data Processing

Dependencies

  • snakemake (v7.32.1)
    conda install -c bioconda -c conda-forge snakemake==7.32.1

Instructions

Please note that running the processing pipeline is optional since the final phyloseq objects necessary for the analysis are provided in this repository.

  1. Install dependencies (~10 min). In case of version incompatibility errors, please set the conda channel_priority to flexible.
  2. The raw data are publicly available at the European Nucleotide Archive (ENA) under accession number PRJEB93790. Please download the fastq.gz files to the data/reads/raw directory and follow the proposed directory structure: sample_name/sample_name_R*.fastq.gz. The metadata file should be saved as data/metadata.tsv (tab-separated format).
  3. Download the SILVA database version 138.2 to the data/database directory. Please make sure to download both files: silva_nr99_v138.2_toGenus_trainset.fa.gz and silva_v138.2_assignSpecies.fa.gz.
  4. (optional) In case of custom paths to the raw files and database, please modify the corresponding parameters in the code/processing/config.yaml file.
  5. Run the Snakemodule:
mkdir -p code/processing/logs
cd code/processing/logs
conda activate snakemake
snakemake --snakefile ../Snakemodule --use-conda --conda-frontend conda

Suggested parameters for running the Snakemodule using multiple sbatch jobs: --cluster "sbatch --time=47:00:00 --nodes=1 --ntasks=20 --mem=200GB" --jobs 10.
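
The directory structure requested in step 2 (sample_name/sample_name_R*.fastq.gz) can be produced from a flat download folder with a short helper. The sketch below is illustrative only: the function name `arrange_reads` is hypothetical, and it assumes the downloaded files are already named `<sample_name>_R1.fastq.gz` and `<sample_name>_R2.fastq.gz`. ENA downloads may use a different naming scheme, in which case the pattern needs adjusting.

```shell
#!/usr/bin/env bash
# Move flat paired-end files into <dest>/<sample>/<sample>_R{1,2}.fastq.gz.
# ASSUMPTION: input files are named <sample>_R1.fastq.gz / <sample>_R2.fastq.gz;
# change the glob and the suffix pattern if your files are named differently.
arrange_reads() {
    # $1: directory with the flat download; $2: target directory (e.g. data/reads/raw)
    local src="$1" dest="$2" f base sample
    for f in "$src"/*_R[12].fastq.gz; do
        [ -e "$f" ] || continue            # skip when the glob matched nothing
        base="$(basename "$f")"
        sample="${base%_R[12].fastq.gz}"   # strip the read suffix to get the sample name
        mkdir -p "$dest/$sample"
        mv "$f" "$dest/$sample/$base"
    done
}
```

For example, `arrange_reads downloads data/reads/raw` would move `downloads/S1_R1.fastq.gz` to `data/reads/raw/S1/S1_R1.fastq.gz`; files that do not match the pattern are left in place.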

Pipeline files

code/processing
|
|- Snakemodule
|- conda_env.*.yaml                      # conda environment libraries
|- config.yaml                           # Snakemodule paths and settings
+- params.dada2.filter_and_trim.yaml     # dada2 filter_and_trim settings

Snakemodule Steps

  1. FastQC/MultiQC quality assessment
    • 1.quality_control.smk
  2. Trimming and filtering
    • 2.filtering_trimming.smk
  3. FastQC/MultiQC quality reassessment
    • 1.quality_control.smk
  4. ASV identification using DADA2 pipeline (official tutorial)
    • 3.dada2.smk
  5. ASV taxonomy annotation
    • 3.dada2.smk
  6. Generation of the phyloseq object.
    • 3.dada2.smk
    • outputs:
      • unfiltered phyloseq (all ASVs)
      • filtered phyloseq (only microbial ASVs)
      • filtered phyloseq objects agglomerated at genus and family levels

Analysis

Dependencies

  1. Create a new conda environment and install the dependencies (~30 min):
conda create -n anthelminthic_treatment
conda env update -n anthelminthic_treatment --file code/analysis/conda_env.R.yaml
  2. Install additional R libraries:
conda activate anthelminthic_treatment
R
devtools::install_github(repo = "malucalle/selbal", ref="9f7ff2b")
devtools::install_github(repo = "gauravsk/ranacapa", ref="58c0cab")
devtools::install_github(repo = "gmteunisse/fantaxtic", ref="b822d7f")

Files

All analysis code is implemented in R and stored in the code/analysis directory. Each file is provided in two formats: .ipynb and .Rmd.

  1. Dataset overview
    • Figure 1. Data Overview. (Chi-squared test of demographic characteristics, sample counts).
    • Figure 3. Microbial relative abundance barplots.
  2. Microbial diversity
    • Figure 2. Microbial diversity. (baseline vs follow-up groups).
    • Figure 2. Microbial diversity. Baseline and follow-up groups stratified by baseline STH Status. (baseline vs follow-up groups).
    • Supplementary Figure 2. Microbial diversity. Accounting for STH status.
  3. Dimensionality reduction
    • Figure 2. PCoA based on Bray-Curtis distance. Baseline and follow-up groups.
  4. Genera-vanishment analysis
    • Figure 3. ASV vanishment analysis
  5. Beta-diversity
    • Figure 4. ALDEx2 analysis. Wilcoxon test.
    • Supplementary Figure 4. ALDEx2 analysis. Wilcoxon with covariates.
