Average Nucleotide Identity Analysis for Diagnosis

M. Asaduzzaman Prodhan^*

DPIRD Diagnostics and Laboratory Services, Department of Primary Industries and Regional Development

3 Baron-Hay Court, South Perth, WA 6151, Australia

^*Correspondence: Asad.Prodhan@dpird.wa.gov.au

Average Nucleotide Identity (ANI) analysis calculates the percentage of nucleotide identity between each pair of the supplied nucleotide sequences and reports the values as a square matrix. This matrix allows pairwise comparison of the sequences and helps determine how similar they are.

ANI methods

There are several methods to calculate the ANI:

ANIb (based on the BLAST algorithm)
ANIm (based on the MUMmer algorithm)
TETRA (based on tetranucleotide signature occurrences)

ANI tools

There are several tools available for ANI analysis (Figueras et al., 2014). For example:

JSpecies (http://www.imedea.uib.es/jspecies) (multiple genome analysis)
Gegenees (http://www.gegenees.org/documentation.html) (Due to changes in NCBI BLAST, Gegenees may not function with BLAST versions 2.10 and later; BLAST version 2.9 should work.)
EzGenome (http://www.ezbiocloud.net/ezgenome/ani) [online]
ANI calculator (http://enve-omics.ce.gatech.edu/ani/index) [online] (only two genomes per analysis)
Python3 package (pyani) (https://github.com/widdowquinn/pyani, https://pyani.readthedocs.io/en/latest/run_anim.html), based on Richter and Rosselló-Móra (2009). By default, pyani uses the MUMmer-based method (ANIm).

How to run pyani

If you are working on an HPC cluster, load the required version of Python:
```
module load cray-python/3.10.10
```
Create a conda environment with compatible versions of Python, matplotlib and pyani:
```
conda create -n pyani_env python=3.10 "matplotlib<=3.7" "pyani>=0.2.12" -c bioconda
```
Activate the pyani environment:
```
conda activate pyani_env
```
Alternatively, you can use my conda environment for pyani.
Download the environment file, pyani_env.yml, from this repository.
Then create the environment from the file as follows:
```
conda env create -f pyani_env.yml 
```
Check that pyani has been installed by running the following command:
```
average_nucleotide_identity.py --help
```

The above command will show the flags/options of the pyani program

Install dos2unix, which converts text files from Windows (CRLF) to Unix (LF) line endings:
```
conda install conda-forge::dos2unix
```
Check that dos2unix has been installed by running the following command:
```
dos2unix
```
Make two metadata files and name them ‘classes.txt’ (Fig. 1) and ‘labels.txt’ (Fig. 2).

Figure 1. Classes

Figure 2. Labels

Note that, in both files, the first column holds the names of the nucleotide sequences and the second column holds the label (or class) assigned to each sequence.
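When there are many sequences, the two metadata files can be drafted from the sequence file names instead of being typed by hand. The following is a minimal sketch, not part of the original workflow. It assumes that the sequence files end in `.fasta`, that the columns are tab-separated, and that the sequence name is the file name without its extension. The directory `genomes_demo`, the two placeholder sequence files and the class value `unassigned` are hypothetical and exist only for illustration.

```shell
#!/bin/sh
# Sketch: draft labels.txt and classes.txt from the sequence files in a directory.
# Assumptions (not from the original): files end in .fasta, columns are
# tab-separated, and column 1 is the file name without its extension.
set -eu

seq_dir="genomes_demo"            # hypothetical demo directory
mkdir -p "$seq_dir"
# Two empty placeholder files stand in for real nucleotide sequences.
: > "$seq_dir/isolate_A.fasta"
: > "$seq_dir/isolate_B.fasta"

# Start both metadata files empty, then add one line per sequence.
: > "$seq_dir/labels.txt"
: > "$seq_dir/classes.txt"
for fasta in "$seq_dir"/*.fasta; do
    stem=$(basename "$fasta" .fasta)
    # Column 1: sequence name; column 2: label (here simply the name again).
    printf '%s\t%s\n' "$stem" "$stem" >> "$seq_dir/labels.txt"
    # Column 1: sequence name; column 2: class (placeholder, edit by hand).
    printf '%s\t%s\n' "$stem" "unassigned" >> "$seq_dir/classes.txt"
done

cat "$seq_dir/labels.txt"
```

The generated files are only a starting point: replace the second column with the labels and classes you actually want to appear in the results.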

Make a directory and name it, for example, ‘ANI’.
Within the ‘ANI’ directory, make another directory and name it, for example, ‘genomes’.
Keep all the nucleotide sequences, ‘classes.txt’ and ‘labels.txt’ in the ‘genomes’ directory.
Check the line terminators of the ‘classes.txt’ and ‘labels.txt’ files as follows (run from within the ‘genomes’ directory):

file *.txt

If ‘classes.txt’ or ‘labels.txt’ has CRLF (Windows) line terminators, convert the files into Unix format as follows:

dos2unix *.txt
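If the `file` utility is not available, the same check can be approximated by searching each file for a carriage-return character. The sketch below is one possible way to do that with `grep`. The directory `crlf_demo` and the contents of the two demo files are made up for illustration, and the helper name `has_crlf` is hypothetical.

```shell
#!/bin/sh
# Sketch: report which metadata files contain Windows (CRLF) line endings.
set -eu

cr=$(printf '\r')                 # a literal carriage-return character

has_crlf() {
    # Succeeds (exit status 0) if the file contains a carriage return.
    grep -q "$cr" "$1"
}

# Demo files with made-up content: one Unix-style, one Windows-style.
demo_dir="crlf_demo"
mkdir -p "$demo_dir"
printf 'isolate_A\tlabel_A\n'   > "$demo_dir/labels.txt"
printf 'isolate_A\tclass_1\r\n' > "$demo_dir/classes.txt"

for f in "$demo_dir"/*.txt; do
    if has_crlf "$f"; then
        echo "$f: CRLF found, convert it with dos2unix"
    else
        echo "$f: Unix line endings"
    fi
done
```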

Run the following command from the ‘ANI’ directory

average_nucleotide_identity.py -i genomes -o output_ANI --labels genomes/labels.txt --classes genomes/classes.txt -g --gmethod seaborn --gformat pdf,png -v -l ba_ANI.log

Note that you must not create the output directory beforehand; otherwise, the command will exit with an ‘overwriting’ error.
Command reference: widdowquinn/pyani#56
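Because an existing output directory stops the run, it can help to check for it before starting. The following wrapper is a sketch under stated assumptions: it is run from the ‘ANI’ directory, it reuses the directory names and the command given above, and the helper name `outdir_is_free` is hypothetical. It only prints a message when a precondition fails and leaves any clean-up to you.

```shell
#!/bin/sh
# Sketch: check the preconditions before starting the pyani run.
set -eu

in_dir="genomes"
out_dir="output_ANI"

outdir_is_free() {
    # Succeeds only when nothing exists at the given path.
    [ ! -e "$1" ]
}

if ! outdir_is_free "$out_dir"; then
    echo "Not running: '$out_dir' already exists; remove or rename it first." >&2
elif ! command -v average_nucleotide_identity.py >/dev/null 2>&1; then
    echo "Not running: pyani is not on PATH; activate the conda environment first." >&2
elif [ ! -d "$in_dir" ]; then
    echo "Not running: input directory '$in_dir' was not found." >&2
else
    # Same command as in the text above.
    average_nucleotide_identity.py -i "$in_dir" -o "$out_dir" \
        --labels "$in_dir/labels.txt" --classes "$in_dir/classes.txt" \
        -g --gmethod seaborn --gformat pdf,png -v -l ba_ANI.log
fi
```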

Results

The final output of the ANI analysis looks like this (Fig. 3):

Figure 3. Results
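Besides the plot, the identity values can be inspected as text. The sketch below lists the sequence pairs whose identity meets a chosen cut-off. Several things here are assumptions rather than facts from this document: that the output directory contains a tab-separated matrix (for example, a file named like `ANIm_percentage_identity.tab`), that its first row holds the sequence names, that the values are fractions between 0 and 1, and that the matrix is symmetric so only the values above the diagonal need to be read. The toy matrix, its values, the demo file name, the helper `list_pairs` and the cut-off of 0.95 are all made up for illustration.

```shell
#!/bin/sh
# Sketch: list sequence pairs whose identity meets a chosen cut-off.
set -eu

matrix="ANIm_percentage_identity_demo.tab"   # hypothetical demo file name
threshold="0.95"                             # hypothetical cut-off (fraction)

# Toy matrix with made-up values, in the assumed layout:
# a header row of names, then one row per sequence.
printf '\tA\tB\tC\n'          >  "$matrix"
printf 'A\t1.0\t0.97\t0.85\n' >> "$matrix"
printf 'B\t0.97\t1.0\t0.84\n' >> "$matrix"
printf 'C\t0.85\t0.84\t1.0\n' >> "$matrix"

list_pairs() {
    # $1 = matrix file, $2 = cut-off.
    # Prints "row<TAB>column<TAB>value" for each pair above the diagonal
    # whose value is greater than or equal to the cut-off.
    awk -F '\t' -v t="$2" '
        NR == 1 { for (i = 2; i <= NF; i++) name[i] = $i; next }
        {
            # The diagonal of data row NR sits in field NR, so start at NR + 1.
            for (i = NR + 1; i <= NF; i++)
                if ($i + 0 >= t + 0) printf "%s\t%s\t%s\n", $1, name[i], $i
        }
    ' "$1"
}

list_pairs "$matrix" "$threshold"
```

To try it on real results, point `matrix` at the corresponding file in your output directory after confirming that its layout matches the assumptions above.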

References

Figueras, M.J., Beaz-Hidalgo, R., Hossain, M.J., Liles, M.R., 2014. Taxonomic Affiliation of New Genomes Should Be Verified Using Average Nucleotide Identity and Multilocus Phylogenetic Analysis. Genome Announc 2, e00927-14. https://doi.org/10.1128/genomeA.00927-14

Richter, M., Rosselló-Móra, R., 2009. Shifting the genomic gold standard for the prokaryotic species definition. Proc Natl Acad Sci USA 106, 19126–19131. https://doi.org/10.1073/pnas.0906412106
