Open
Labels
bug (Something isn't working)
Description
Description of the bug
Hi,
Related to #921 but much simpler. Here, the specificEpithet, infraspecificEpithet, and (for non-BIN matches) scientificName fields in the SBDI export from annotation against COIDB become misaligned when genus ends with an underscore followed by one or more 'X's.
Example:
genus: Malacostraca_XXX → specificEpithet: Malacostraca, infraspecificEpithet: XXXX
I can fix this with the code below, but I assume it should be quite straightforward to handle robustly in ampliseq parsing as well.
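# Flag BOLD BINs
annotation[, isBIN := grepl("^BOLD:[A-Z0-9]+$", scientificName)]
# Fix mis-split names for genera ending in "_X", "_XX", ...
# Note: within a single `:=` call the right-hand sides see the old column values,
# so the literal "X" is appended here rather than the specificEpithet column.
annotation[grepl("_[X]+$", genus), `:=`(
  specificEpithet = "X",
  infraspecificEpithet = "",
  scientificName = ifelse(!isBIN, paste0(genus, "X"), scientificName)
)]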
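For anyone who wants to reproduce this, here is a minimal sketch of the workaround on a toy table. The rows are made up for illustration, and only the column names come from the snippet above. It assumes the intended scientificName for a placeholder genus is the genus string with one more "X" appended. Because data.table evaluates every right-hand side inside one `:=` call against the columns as they were before the update, the sketch appends the literal "X" instead of reading the specificEpithet column.

```r
library(data.table)

# Toy rows: hypothetical values, not taken from a real ampliseq run
annotation <- data.table(
  genus                = c("Malacostraca_XXX", "Malacostraca_XXX", "Homo"),
  specificEpithet      = c("Malacostraca", "Malacostraca", "sapiens"),
  infraspecificEpithet = c("XXXX", "XXXX", ""),
  scientificName       = c("Malacostraca XXXX", "BOLD:AAA1234", "Homo sapiens")
)

# Flag BOLD BINs so their scientificName is left untouched
annotation[, isBIN := grepl("^BOLD:[A-Z0-9]+$", scientificName)]

# Repair only rows whose genus ends in "_X", "_XX", ...
annotation[grepl("_X+$", genus), `:=`(
  specificEpithet      = "X",
  infraspecificEpithet = "",
  scientificName       = ifelse(isBIN, scientificName, paste0(genus, "X"))
)]

print(annotation)
```

Rows with an ordinary genus are not matched by the pattern and stay as they were, and BIN matches keep their BOLD identifier as scientificName.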
Regards,
Maria