| 1 | +--- |
| 2 | +title: 'nomenclator: a Python package for the automated generation of Latin binomials for Bacterial and Archaeal genera' |
| 3 | +tags: |
| 4 | + - Python |
| 5 | + - microbiology |
| 6 | + - taxonomy |
| 7 | + - nomenclature |
| 8 | + - bioinformatics |
| 9 | +authors: |
| 10 | + - name: Andrea Telatin |
| 11 | + orcid: 0000-0001-7619-281X |
| 12 | + affiliation: 1 |
| 13 | +affiliations: |
| 14 | + - name: Quadram Institute Bioscience, Norwich, UK |
| 15 | + index: 1 |
| 16 | +date: 28 October 2025 |
| 17 | +bibliography: paper.bib |
| 18 | +--- |
| 19 | + |
| 20 | +# Summary |
| 21 | + |
| 22 | +`nomenclator` is a Python package that automates the creation of linguistically valid Latin binomials for bacterial and archaeal taxa, based on the "Great Automated Nomenclator" script [@Pallen2021]. |
| 23 | +Bacterial nomenclature requires Latin or Latinized names that conform to the rules of the International Code of Nomenclature of Prokaryotes (ICNP) and Latin grammar [@Parker2019; @Oren2019].
| 24 | +The tool reads two Excel files containing curated lists of Latin and Greek roots
| 25 | +and combinatorially assembles those roots into genus and species names.
| 26 | + |
| 27 | + |
| 28 | + |
| 29 | +# Statement of Need |
| 30 | + |
| 31 | +The exponential growth in microbial species discovery through culturomics, genomics, and metagenomics has created an urgent need for millions of new taxonomic names, far exceeding the capacity of manual expert-driven nomenclature.
| 32 | +`nomenclator` addresses this bottleneck by providing pre-generated, grammatically correct names that can be used "off the shelf" as needed. |
| 33 | + |
| 34 | +Creating valid taxonomic names is challenging because it requires: |
| 35 | + |
| 36 | +1. **Classical language expertise**: Names must follow Latin grammar rules with proper gender agreement and declension |
| 37 | +2. **ICNP compliance**: The nomenclature code contains 65 rules and numerous recommendations |
| 38 | +3. **Manual quality control**: Each name requires expert review, creating a significant bottleneck |
| 39 | + |
| 40 | + |
| 41 | +# Implementation and Features |
| 42 | + |
| 43 | +## Installation |
| 44 | + |
| 45 | +The package can be installed via `pip`: |
| 46 | + |
| 47 | +```bash |
| 48 | +pip install nomenclator |
| 49 | +``` |
| 50 | + |
| 51 | +## Tools |
| 52 | + |
| 53 | +`nomenclator` is implemented in pure Python (3.8+) with minimal dependencies.
| 54 | +The package exports these CLI tools: |
| 55 | + |
| 56 | +- `gan-genus`: Generates genus names based on user-defined parameters (number of names, roots to use, etc.) |
| 57 | +- `gan-init`: Initializes a project directory with necessary files and templates |
| 58 | +- `xls2tsv`: Converts Excel files with taxonomic data into TSV format for further processing |
| 59 | + |
| 60 | +## Example input and output |
| 61 | + |
| 62 | +An example of the input Excel file structure is shown below: |
| 63 | + |
| 64 | +| Language | Gender | Part | Word | Root | Definition | Explanation | |
| 65 | +|----------|--------|------|-------------|-----------|--------------------------------------|-------------| |
| 66 | +| L. | masc. | n. | admissarius | admissari | a stallion used for breeding | horses | |
| 67 | +| Gr. | masc. | n. | Arion | ariono | a mythical horse that could speak | horses | |
| 68 | +| Gr. | masc. | n. | Balios | Balio | a mythical horse | horses | |
| 69 | +| L. | masc. | n. | caballus | caballi | a horse | horses | |
| 70 | + |
| 71 | +The output can be saved as an HTML or PDF file. For example:
| 72 | + |
| 73 | +* **Admissaristercoricola** - Etymology: *L. masc. n. admissarius*, a stallion used for breeding; *L. neut. n. stercus*, excrement; *N.L. masc./fem. n. cola*, an inhabitant; `Admissaristercoricola`: a microbe of the faeces of horses. |
| 74 | +* **Admissaristercoradaptatus** - Etymology: *L. masc. n. admissarius*, a stallion used for breeding; *L. neut. n. stercus*, excrement; *L. masc. n. adaptatus*, something adapted; `Admissaristercoradaptatus`: a microbe of the faeces of horses. |
| 75 | +* **Admissaristercorihabitans** - Etymology: *L. masc. n. admissarius*, a stallion used for breeding; *L. neut. n. stercus*, excrement; *L. masc. n. habitans*, an inhabitant; `Admissaristercorihabitans`: a microbe of the faeces of horses. |
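
The combinatorial assembly illustrated by the example above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the package's actual implementation: the `Root` class, the `join_stems` and `assemble_names` functions, the combining form `stercor`, and the connecting-vowel rule (insert an `i` between two consonants) are all hypothetical and were inferred only from the three example names.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Root:
    """One row of the roots table (hypothetical structure for this sketch)."""
    language: str    # e.g. "L." or "Gr."
    gender: str      # e.g. "masc."
    part: str        # e.g. "n."
    word: str        # dictionary form, e.g. "admissarius"
    root: str        # combining form, e.g. "admissari"
    definition: str


def join_stems(left: str, right: str) -> str:
    """Join two components, inserting a connecting 'i' between consonants.

    Assumption: this simplified rule is inferred from the example names
    and is not taken from the package's source code.
    """
    vowels = "aeiou"
    if left[-1] in vowels or right[0] in vowels:
        return left + right
    return left + "i" + right


def assemble_names(first_roots, middle, suffixes):
    """Combine every first root with every suffix around a fixed middle root."""
    results = []
    for first, suffix in product(first_roots, suffixes):
        name = first.root.lower()
        for component in (middle, suffix):
            name = join_stems(name, component.root.lower())
        etymology = "; ".join(
            f"{r.language} {r.gender} {r.part} {r.word}, {r.definition}"
            for r in (first, middle, suffix)
        )
        results.append((name.capitalize(), etymology))
    return results


# Rows taken from the example table and example output above.
admissarius = Root("L.", "masc.", "n.", "admissarius", "admissari",
                   "a stallion used for breeding")
caballus = Root("L.", "masc.", "n.", "caballus", "caballi", "a horse")
# Assumed combining form "stercor" (inferred from the example names).
stercus = Root("L.", "neut.", "n.", "stercus", "stercor", "excrement")
cola = Root("N.L.", "masc./fem.", "n.", "cola", "cola", "an inhabitant")
adaptatus = Root("L.", "masc.", "n.", "adaptatus", "adaptatus",
                 "something adapted")
habitans = Root("L.", "masc.", "n.", "habitans", "habitans", "an inhabitant")

generated = assemble_names([admissarius, caballus], stercus,
                           [cola, adaptatus, habitans])
for name, etymology in generated:
    print(f"{name} - Etymology: {etymology}")
```

With two first roots and three suffixes the sketch yields six candidate names, the first three of which match the example entries listed above. Because the number of candidates is the product of the list sizes, modest root lists are enough to produce very large numbers of names.
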
| 76 | +# Acknowledgments |
| 77 | + |
| 78 | +This software originated from research conducted with Mark J. Pallen and Aharon Oren and published in *Trends in Microbiology* [@Pallen2021], which demonstrated the concept of mass nomenclature generation for prokaryotic taxonomy.
| 79 | + |
| 80 | +# Funding |
| 81 | + |
| 82 | +The author gratefully acknowledges the support of the Biotechnology and Biological Sciences Research Council (BBSRC); this research was funded
| 83 | +by the BBSRC Core Capability Grant BB/CCG2260/1 |
| 84 | +and by the BBSRC Institute Strategic Programme Microbes and Food Safety |
| 85 | +BB/X011011/1 and its constituent project |
| 86 | +BBS/E/QU/230002C. |
| 87 | + |
| 88 | +# References |