If you use results from this tool, please cite:
Coelho, L.P., Alves, R., del Río, Á.R. et al. Towards the biogeography of prokaryotic genes. Nature 601, 252–256 (2022). [DOI: 10.1038/s41586-021-04233-4](https://doi.org/10.1038/s41586-021-04233-4)
Command line tool to query the Global Microbial Gene Catalog (GMGC).
GMGC-mapper runs on Python 3.6-3.10 and requires prodigal to be available for genome mode.
The easiest way to install GMGC-mapper is through bioconda, which will ensure
all dependencies (including prodigal) are installed automatically:

    conda install -c bioconda gmgc-mapper

Alternatively, GMGC-mapper is available from PyPI, so it can be installed through pip:

    pip install GMGC-mapper

Note that this does not install prodigal (which is necessary for the genome-based workflow).
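
Since a pip installation does not bring in prodigal, it can be useful to confirm that it is reachable before using genome mode. The following is a minimal sketch using only the Python standard library; the helper name `has_executable` is hypothetical and is not part of GMGC-mapper.

```python
import shutil


def has_executable(name: str) -> bool:
    """Return True if `name` resolves to an executable on the current PATH."""
    return shutil.which(name) is not None


# Hypothetical pre-flight check; GMGC-mapper may report this differently.
if not has_executable("prodigal"):
    print("prodigal was not found on PATH; genome mode (-i) needs it")
```

If the check fails, installing through bioconda (as described above) is the simplest way to get prodigal alongside GMGC-mapper.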
Finally, especially if you are retrieving the cutting-edge version from GitHub, you can install with the standard:

    python setup.py install

- Input is a genome sequence:

      gmgc-mapper -i input.fasta -o output

- Input is DNA/protein gene sequences:

      gmgc-mapper --nt-genes genes.fna --aa-genes genes.faa -o output

  The nucleotide input is optional (but should be used if available so that the quality of the hits can be refined):

      gmgc-mapper --aa-genes genes.faa -o output

If your input is a metagenome, you can use NGLess for assembly and gene prediction. For more details, read the docs.
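
When GMGC-mapper is called from a script, the invocations above can be assembled programmatically. The sketch below builds the argument list for each input mode using only the flags shown in this README. The function name `build_gmgc_command` is hypothetical, and two behaviours are assumptions rather than documented facts: that genome input and gene input are not combined in one run, and that protein sequences are required whenever gene input is used.

```python
from typing import List, Optional


def build_gmgc_command(
    output: str,
    genome: Optional[str] = None,
    nt_genes: Optional[str] = None,
    aa_genes: Optional[str] = None,
) -> List[str]:
    """Build a gmgc-mapper argument list for genome mode or gene mode."""
    if genome is not None:
        # Assumption: genome mode is not mixed with gene files.
        if nt_genes is not None or aa_genes is not None:
            raise ValueError("use either a genome or gene files, not both")
        return ["gmgc-mapper", "-i", genome, "-o", output]

    # Assumption: gene mode needs protein sequences; nucleotides are optional.
    if aa_genes is None:
        raise ValueError("provide a genome or a protein gene file")

    cmd = ["gmgc-mapper"]
    if nt_genes is not None:
        cmd += ["--nt-genes", nt_genes]
    cmd += ["--aa-genes", aa_genes, "-o", output]
    return cmd
```

The resulting list can be passed to `subprocess.run(cmd, check=True)`, which avoids shell quoting problems when file names contain spaces.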
The output folder will contain:
- Outputs of gene prediction (prodigal).
- Complete data table, listing all the hits in GMGC, per gene.
- Complete table, listing all the genome bins (MAGs) that are found in the results.
- Human readable summary.
For more details, read the docs. A description of the outputs is also written to the output folder for convenience.
- -i/--input: path to the input genome file (FASTA, possibly .gz/.bz2/.xz compressed).
- -o/--output: output directory (will be created if non-existent).
- --nt-genes: path to the input DNA gene file (FASTA, possibly .gz/.bz2/.xz compressed).
- --aa-genes: path to the input protein gene file (FASTA, possibly .gz/.bz2/.xz compressed).
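
Because the inputs may be plain or compressed FASTA, a quick sanity check before submitting a file is to count its sequences. The sketch below is not how GMGC-mapper itself reads files; it is a standalone pre-flight check that assumes the compression can be inferred from the file extension. The names `open_fasta` and `count_sequences` are hypothetical.

```python
import bz2
import gzip
import lzma

# Map the extensions named in the options to standard-library openers.
_OPENERS = {".gz": gzip.open, ".bz2": bz2.open, ".xz": lzma.open}


def open_fasta(path: str):
    """Open a FASTA file in text mode, decompressing based on its extension."""
    for suffix, opener in _OPENERS.items():
        if path.endswith(suffix):
            return opener(path, "rt")
    return open(path, "rt")


def count_sequences(path: str) -> int:
    """Count FASTA records by counting header lines (those starting with '>')."""
    with open_fasta(path) as handle:
        return sum(1 for line in handle if line.startswith(">"))
```

For example, `count_sequences("genes.faa.gz")` returns the number of protein records in a gzip-compressed file, which can be compared against the nucleotide file before running the gene-based workflow.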