Metagenome assembly reconstructs individual genomes from complex microbial communities. Long-read sequencing (ONT, PacBio) dramatically improves contiguity and enables recovery of complete genomes.
# Assembly
conda install -c bioconda flye spades
# Binning
conda install -c bioconda metabat2 semibin checkm2
# Taxonomy
conda install -c bioconda gtdbtk
# Utilities
conda install -c bioconda minimap2 samtools seqkit

Tell your AI agent what you want to do:
- "Assemble this ONT metagenome with Flye"
- "Run metaSPAdes on my Illumina metagenome reads"
- "Create a hybrid assembly from short and long reads"
- "Bin my metagenome assembly into MAGs"
- "Run MetaBAT2 binning on my assembled contigs"
- "Use SemiBin2 for deep learning-based binning"
- "Find complete circular genomes in my assembly"
- "Assess MAG quality with CheckM2"
- "Classify my MAGs with GTDB-Tk"

Typical workflow:
- Select the appropriate assembler based on input data type
- Run metagenome assembly with optimized parameters
- Map reads back to assembly for coverage calculation
- Bin contigs into putative genomes (MAGs)
- Assess MAG quality with CheckM2
- Assign taxonomy with GTDB-Tk
- Report quality-filtered MAG statistics
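
The steps above can be sketched as a single shell function for the ONT case. This is a minimal illustration, not a tuned pipeline: the file names, the output directory, the thread count, and the `run` and `mag_workflow` helpers are all hypothetical, and `--nano-raw` assumes standard (not high-accuracy) ONT reads. By default the script only prints the commands; set `DRY_RUN=0` to execute them. Depending on the installed GTDB-Tk release, `classify_wf` may additionally require either `--skip_ani_screen` or `--mash_db`, so check `gtdbtk classify_wf --help` first.

```shell
#!/usr/bin/env bash
# Sketch of a long-read MAG workflow. All paths and values are placeholders.
set -eu

READS="${READS:-reads.fastq.gz}"   # hypothetical ONT read file
OUT="${OUT:-mag_out}"              # hypothetical output directory
THREADS="${THREADS:-16}"
DRY_RUN="${DRY_RUN:-1}"            # 1 = only print commands, 0 = also execute them

# Print a command, then execute it unless DRY_RUN=1.
run() {
  echo "+ $*"
  if [ "$DRY_RUN" != "1" ]; then "$@"; fi
}

mag_workflow() {
  local asm="$OUT/flye/assembly.fasta"

  # 1. Assemble with Flye in metagenome mode
  run mkdir -p "$OUT"
  run flye --nano-raw "$READS" --meta --out-dir "$OUT/flye" --threads "$THREADS"

  # 2. Map reads back to the assembly to obtain per-contig coverage
  #    (paths containing single quotes would break this quoting)
  run bash -o pipefail -c "minimap2 -ax map-ont -t $THREADS '$asm' '$READS' | samtools sort -@ $THREADS -o '$OUT/mapped.bam' -"
  run samtools index "$OUT/mapped.bam"

  # 3. Bin contigs with MetaBAT2 (bins are written as $OUT/bins/bin.*.fa)
  run mkdir -p "$OUT/bins"
  run jgi_summarize_bam_contig_depths --outputDepth "$OUT/depth.txt" "$OUT/mapped.bam"
  run metabat2 -i "$asm" -a "$OUT/depth.txt" -o "$OUT/bins/bin" -t "$THREADS"

  # 4. Estimate completeness and contamination with CheckM2
  run checkm2 predict --input "$OUT/bins" --output-directory "$OUT/checkm2" -x fa --threads "$THREADS"

  # 5. Assign taxonomy with GTDB-Tk
  run gtdbtk classify_wf --genome_dir "$OUT/bins" --out_dir "$OUT/gtdbtk" --extension fa --cpus "$THREADS"
}

mag_workflow
```

Running the script as-is prints the nine commands without touching any data, which makes it easy to review paths and flags before a real run with `DRY_RUN=0 READS=my_reads.fastq.gz bash workflow.sh` (the script name is arbitrary). For Illumina-only input, the assembly and mapping steps would be swapped for metaSPAdes and a short-read mapping preset.
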

Notes:
- metaFlye (Flye in --meta mode) is recommended for long reads; metaSPAdes is for Illumina-only data
- Long reads often recover complete circular genomes directly
- Multiple binning tools (MetaBAT2 + SemiBin2) can improve recovery
- High-quality MAGs: >90% complete, <5% contamination
- GUNC can detect chimeric MAGs missed by CheckM
- Consider co-assembly of related samples to improve binning
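
Two of the notes above lend themselves to a quick check on the command line. The sketch below assumes that the CheckM2 report is a tab-separated file with a header row containing `Name`, `Completeness`, and `Contamination` columns, and that Flye's `assembly_info.txt` lists contig name, length, coverage, and a `Y`/`N` circularity flag in its first four columns. The function names and the 1 Mbp minimum length (meant to skip small circular elements such as plasmids) are illustrative choices, not part of either tool.

```shell
#!/usr/bin/env bash
set -eu

# Keep only MAGs above MIN_COMP % completeness and below MAX_CONT % contamination.
# $1: CheckM2 quality report (TSV with header row). Columns are located by name.
filter_hq_mags() {
  awk -F'\t' -v min_comp="${MIN_COMP:-90}" -v max_cont="${MAX_CONT:-5}" '
    NR == 1 {
      for (i = 1; i <= NF; i++) col[$i] = i   # map header names to column indices
      c = col["Completeness"]; k = col["Contamination"]
      print; next
    }
    $c + 0 > min_comp + 0 && $k + 0 < max_cont + 0
  ' "$1"
}

# List circular contigs of at least MIN_LEN bp as "name<TAB>length".
# $1: Flye assembly_info.txt (assumed columns: name, length, coverage, circular flag, ...).
list_circular_contigs() {
  awk -F'\t' -v min_len="${MIN_LEN:-1000000}" '
    NR > 1 && $4 == "Y" && $2 + 0 >= min_len + 0 { print $1 "\t" $2 }
  ' "$1"
}
```

With the paths from a typical run, usage would look like `filter_hq_mags checkm2_out/quality_report.tsv > hq_mags.tsv` and `list_circular_contigs flye_out/assembly_info.txt`. A circular contig of genome size is only a candidate complete genome, so it is still worth passing such contigs through CheckM2 and GTDB-Tk like any other bin.
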