Comparative Genomics Exercise 6: Taxonomic profile of metagenomic samples
Metagenomic samples are produced by sequencing all the DNA present in a sample at once, and they usually contain thousands of sequences from different microbial species. The usual first step in characterizing a metagenomic sample is to describe which species are present and at what abundance (i.e. to build a taxonomic profile of the sample).
mOTUs is a widely used program for measuring the abundance of different microbial species in metagenomic samples. Its developers have built a database of marker genes for more than 7700 microbial species; the software maps the reads from each sample against this database and counts the reads matching every marker gene from each species. The mOTUs output is a tab-separated file with the name of each screened species and its abundance in the sample. mOTUs can report either the relative abundance (the abundance of each species normalized by the total abundance in the sample) or the absolute abundance (the number of reads mapped to each reference species; use the -c flag when calling mOTUs). You can also change the taxonomic level of the screening with the -k flag (e.g. -k phylum measures only the number of reads assigned to each phylum, not to each species).
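To make the difference between the two abundance types concrete, the sketch below converts a table of counts into relative abundances with awk. It only illustrates the normalization step and is not a description of how mOTUs computes its profiles internally. The function name relative_abundance and the input layout (taxon, tab, count, with comment lines starting with #) are assumptions made for this example.

```shell
# Hypothetical helper: turn a two-column count table (taxon<TAB>count)
# into relative abundances. Lines starting with '#' are skipped.
relative_abundance() {
    awk -F'\t' '
        /^#/ { next }                      # ignore header/comment lines
        NR == FNR { total += $2; next }    # first pass over the file: sum the counts
        total > 0 { printf "%s\t%.4f\n", $1, $2 / total }  # second pass: normalize
    ' "$1" "$1"                            # the file is read twice on purpose
}
```

Calling `relative_abundance counts.tsv`, where counts.tsv is a hypothetical count table, would print one line per taxon with its share of the total.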
Taxonomic profiles are routinely used to compare samples of different origins and to find species enriched under particular conditions. For instance, they have been widely used to identify taxa enriched in samples from colorectal cancer (CRC) patients compared with samples from control patients. We will run mOTUs on three human gut samples: two from control patients and one from a CRC patient.
$ motus profile -s /home/compgenomics/metagenomics/data/ERR688359.fastq.gz -o CTR.motus -t 20
$ motus profile -s /home/compgenomics/metagenomics/data/ERR688435.fastq.gz -o CRC.motus -t 20
$ motus profile -s /home/compgenomics/metagenomics/data/ERR688360.fastq.gz -o CTR_1.motus -t 20
We can learn many things about our samples from the mOTUs output. For instance, we can measure the alpha diversity (the microbial diversity within a sample). There are several alpha diversity measures. The simplest one is the total number of species present in the sample, which we can count directly from the mOTUs output.
$ perl -F"\t" -lane 'print if $F[1]>0' CTR.motus | wc -l
- How many species do you detect in each sample?
- Which one is the most abundant?
- Can we compare taxonomic profiles from samples directly from the mOTUs result?
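Counting species is only one alpha diversity measure. Another common one is the Shannon index, H = -sum(p_i * ln(p_i)), where p_i is the proportion of species i in the sample; unlike a plain species count, it also reflects how evenly the abundances are distributed. The sketch below computes it with awk, assuming a profile with the taxon name in column 1, the abundance in column 2, and header lines starting with #. The function name shannon_index is hypothetical, and if the profile contains a catch-all row for unassigned reads you may want to remove it first.

```shell
# Hypothetical helper: Shannon index H = -sum(p_i * ln(p_i)) over all taxa
# with abundance > 0. Input: taxon<TAB>abundance; '#' lines are skipped.
shannon_index() {
    awk -F'\t' '
        /^#/ { next }                                   # ignore header/comment lines
        ($2 + 0) > 0 { n++; a[n] = $2; total += $2 }    # keep detected taxa only
        END {
            h = 0
            for (i = 1; i <= n; i++) {
                p = a[i] / total                        # proportion of taxon i
                h -= p * log(p)                         # awk log() is the natural log
            }
            printf "%.4f\n", h
        }
    ' "$1"
}
```

Running `shannon_index CTR.motus` would print the index for the first control sample, provided the profile follows the layout assumed above.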