Microbiome Analysis Pipeline (LCPM)

This repository contains a pipeline for analyzing microbiome data from the LCPM cohort. It covers data preprocessing, quality control, taxonomic profiling, statistical analysis, and visualization. The sections below describe how to run each step of the analysis.

1. Downloading fastq files and grouping them by sequencing batches

The first step is to download the fastq files from the repository. Make sure you have the necessary permissions and access to the metadata. Once you have the fastq files, group them by sequencing batch so that each batch can be managed and processed separately:

Create a directory structure to organize your data, such as data/raw_data/ for storing the raw fastq files.
Place the fastq files in the corresponding sequencing batch directories within the data/raw_data/ directory.
Make sure to appropriately name the directories to reflect the sequencing batches and include relevant metadata if available.
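
The grouping described above can be sketched as follows. This is a minimal Python illustration, not part of the repository's R scripts. The function name `group_fastq_by_batch`, the sample-to-batch mapping `batch_of`, and the assumption that the sample ID is the part of the file name before the first underscore are all hypothetical; adapt them to your own naming scheme and metadata.

```python
import shutil
from pathlib import Path


def group_fastq_by_batch(fastq_dir, batch_of, raw_root="data/raw_data"):
    """Move each fastq file into raw_root/<batch>/ and return the new paths.

    batch_of maps a sample ID to the name of its sequencing batch
    (hypothetical structure, e.g. built from the sample metadata).
    """
    raw_root = Path(raw_root)
    moved = []
    # "*.fastq*" matches both plain and gzip-compressed fastq files.
    for fastq in sorted(Path(fastq_dir).glob("*.fastq*")):
        # Assumed naming scheme: sample ID is the text before the first "_".
        sample_id = fastq.name.split("_")[0]
        batch = batch_of.get(sample_id)
        if batch is None:
            # Leave files without batch information in place for manual review.
            continue
        target_dir = raw_root / batch
        target_dir.mkdir(parents=True, exist_ok=True)
        moved.append(Path(shutil.move(str(fastq), str(target_dir / fastq.name))))
    return moved
```

For example, `group_fastq_by_batch("downloads", {"S1": "batch_01", "S2": "batch_02"})` would move `downloads/S1_R1.fastq.gz` to `data/raw_data/batch_01/S1_R1.fastq.gz` and leave any file whose sample ID is not in the mapping where it is.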

2. Initial analysis using dada2 to obtain ASVs (controlling for sequencing batch)

After the fastq files have been grouped into sequencing batches, run the dada2 package on each batch separately to obtain Amplicon Sequence Variants (ASVs). Follow these steps:

Install the required dependencies, including R and the dada2 package.
Create a script, such as scripts/dada2_analysis.R, and load the necessary libraries.
Execute the dada2_analysis.R script.
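
Because ASVs are inferred per sequencing batch, the per-batch tables have to be combined before downstream analysis. In R, dada2 provides `mergeSequenceTables` for this purpose. The Python sketch below only illustrates the merging logic and is not the repository's code; the function name `merge_batch_tables` and the nested-dictionary layout are assumptions made for the example.

```python
def merge_batch_tables(batch_tables):
    """Combine per-batch ASV count tables into a single table.

    batch_tables: {batch_name: {sample_id: {asv_sequence: count}}}
    (hypothetical layout chosen for this illustration).
    Returns the sorted list of all ASVs and a {sample_id: {asv: count}} table.
    """
    # Union of every ASV sequence observed in any batch.
    all_asvs = sorted(
        {asv
         for table in batch_tables.values()
         for counts in table.values()
         for asv in counts}
    )
    merged = {}
    for batch, table in batch_tables.items():
        for sample, counts in table.items():
            if sample in merged:
                raise ValueError(f"sample {sample!r} appears in more than one batch")
            # ASVs not observed in this sample get a count of zero.
            merged[sample] = {asv: counts.get(asv, 0) for asv in all_asvs}
    return all_asvs, merged
```

Keying the merged table by the ASV sequence itself is what makes results from different batches comparable: the same sequence found in two runs ends up in the same column.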

3. Filtering unclassified and non-bacterial ASVs

Once you have obtained the ASVs for each sequencing batch, it is important to filter out unclassified and non-bacterial ASVs to focus on the microbial taxa of interest. Follow these steps:

Create a script, such as scripts/ASV_filtering.R, and load the necessary libraries.
Read the concatenated ASV file from the data/processed_data/ directory.
Implement filtering criteria to remove unclassified and non-bacterial ASVs based on taxonomic annotations.
Execute the ASV_filtering.R script.
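
The filtering criteria are not spelled out here, so the following Python sketch shows one plausible rule set, labeled as an assumption: keep an ASV only if its kingdom is Bacteria, it is classified at least to phylum level, and it is not annotated as chloroplast or mitochondria. The rank names, the organelle labels, and the function names are illustrative and depend on the taxonomic reference database you use.

```python
def is_bacterial(taxonomy, min_rank="Phylum"):
    """Return True if a taxonomy record passes the (assumed) filtering rules."""
    if taxonomy.get("Kingdom") != "Bacteria":
        return False  # non-bacterial or unassigned kingdom
    if not taxonomy.get(min_rank):
        return False  # unclassified at the required rank (None or empty)
    # Assumption: organelle sequences are labeled this way in the annotation.
    if taxonomy.get("Order") == "Chloroplast":
        return False
    if taxonomy.get("Family") == "Mitochondria":
        return False
    return True


def filter_asvs(counts, taxonomy):
    """Drop ASVs that fail is_bacterial from a {sample: {asv: count}} table."""
    keep = {asv for asv, tax in taxonomy.items() if is_bacterial(tax)}
    return {
        sample: {asv: n for asv, n in row.items() if asv in keep}
        for sample, row in counts.items()
    }
```

ASVs that are missing from the taxonomy table are removed as well, since they cannot be shown to be bacterial.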

4. Exploratory and quality control analysis

After filtering the ASVs, it is essential to perform exploratory and quality control analyses to gain insights into the dataset. Follow these steps:

Create a script, such as scripts/exploratory_analysis.R, and load the necessary libraries.
Read the filtered ASV file from the data/processed_data/ directory.
Execute the exploratory_analysis.R script.
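
Which metrics the exploratory script computes is not stated here. Two checks that are commonly part of this step are the sequencing depth per sample and an alpha-diversity index such as Shannon diversity. The Python sketch below illustrates both under that assumption; the function names are hypothetical and the input is one sample's `{asv: count}` row.

```python
import math


def read_depth(row):
    """Total number of reads assigned to ASVs in one sample."""
    return sum(row.values())


def shannon_diversity(row):
    """Shannon index H = -sum(p_i * ln(p_i)) over ASVs with non-zero counts."""
    total = sum(row.values())
    if total == 0:
        return 0.0  # an empty sample has no diversity to measure
    return -sum(
        (n / total) * math.log(n / total)
        for n in row.values()
        if n > 0
    )
```

Samples with very low read depth are usually worth inspecting before any statistical analysis, because their diversity estimates are unreliable.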

5. Quantitative Microbiome Profiling (QMP) at ASV level

Quantitative Microbiome Profiling (QMP) at the ASV level expresses each ASV as an absolute abundance rather than as a relative proportion of the sequenced reads. Follow these steps to perform QMP:

Create a script, such as scripts/rmp_to_qmq_ASV.R, and load the necessary libraries.
Read the filtered ASV file from the data/processed_data/ directory.
Execute the rmp_to_qmq_ASV.R script.
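
The core idea of converting relative to quantitative profiles can be sketched in Python as below. This is a simplified illustration, not the repository's method: it assumes a per-sample microbial load measurement is available (the `microbial_load` mapping is hypothetical) and simply scales each ASV's proportion by that load. The actual script may include further steps, such as rarefying to an even sampling depth, that are omitted here.

```python
def relative_to_quantitative(counts, microbial_load):
    """Scale ASV proportions by each sample's total microbial load.

    counts: {sample_id: {asv: read_count}}
    microbial_load: {sample_id: total cells per unit of sample}
    (assumed to come from an independent measurement).
    """
    qmp = {}
    for sample, row in counts.items():
        depth = sum(row.values())
        if depth == 0:
            raise ValueError(f"sample {sample!r} has no reads")
        load = microbial_load[sample]
        # proportion of reads * total load = estimated absolute abundance
        qmp[sample] = {asv: n / depth * load for asv, n in row.items()}
    return qmp
```

With this scaling, two samples in which an ASV makes up the same share of reads can still differ in that ASV's absolute abundance if their total loads differ.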

6. Microbiota covariates identification

Identification of microbiota covariates helps in understanding the factors influencing microbial community composition. Follow these steps to identify microbiota covariates:

Create a script, such as scripts/covariate_identification.R, and load the necessary libraries.
Read the filtered ASV files from the data/processed_data/ directory.
Prepare the necessary metadata, such as sample characteristics, clinical variables, etc.
Execute the covariate_identification.R script.

7. Differential abundance analysis

To identify taxa showing differential abundance between different conditions or groups, follow these steps:

Create a script, such as scripts/differential_abundance.R, and load the necessary libraries.
Read the filtered ASV files from the data/processed_data/ directory.
Execute the differential_abundance.R script.

8. Taxa abundance associations

Investigating associations between taxa abundance and other variables can provide valuable insights. Follow these steps to analyze taxa abundance associations:

Create a script, such as scripts/abundance_associations.R, and load the necessary libraries.
Read the filtered ASV files from the data/processed_data/ directory.
Prepare the necessary metadata, including variables of interest for association analysis.
Execute the abundance_associations.R script.

9. Linear model analysis

Linear model analysis explores the relationships between multiple covariates and microbial abundance. Follow these steps to perform linear model analysis:

Create a script, such as scripts/linear_models.R, and load the necessary libraries.
Read the filtered ASV files from the data/processed_data/ directory.
Prepare the necessary metadata, including multiple covariates of interest.
Execute the linear_models.R script.

10. Enterotyping

Enterotyping is a method to categorize individuals based on their gut microbiota composition. Follow these steps to perform enterotyping analysis:

Create a script, such as scripts/enterotyping.R, and load the necessary libraries.
Read the filtered ASV files from the data/processed_data/ directory.
Execute the enterotyping.R script.

Repository: raeslab/QMP-Microbiome-CRC-confounders
