# dwc:Taxon #
## What is a Taxon? ##
A taxon is a unit of biodiversity (Hyam and Kennedy, 2006), e.g. a species, genus, or family. However, taxonomists consider a particular taxon to be defined not only by a name, but also by an understanding of which organisms should be included in the taxon under that name. Taxa can be referenced in varying degrees of detail:
1. **Nominal Taxon Concept** (a.k.a. "nominal taxon" and "nominal concept") - reference is made to a taxon name, but no information is available about how the user intends that name to be applied. Such instances could be referenced by "placeholder" Taxon instances that contain no nameAccordingTo and no other metadata besides the scientificName. [1, 2]
1. **Taxon Name Usage (TNU)** (synonym: **assertion**, defined in Pyle, 2004, p. 20) - reference is made to a taxon name with some indication of what the author intends the name to mean. This may or may not include published details about how the author intends to define the taxon. [3, 4]
1. **Taxon Concept** - reference is made to a taxon name along with a publication which explains how the author intends the name to be applied (Kennedy et al., 2005, p. 81). The name used is followed by "_sec._" (_secundum_) or "_sensu_", then a citation for the reference which explains the author's intention. A taxon concept may be considered a more rigorously defined TNU [5, 6]: a "defined" taxon concept rather than an "implied" taxon concept (i.e. TNU). [7]
A taxon concept can be defined by a **circumscription**, which refers to a collection of specimens that represent the concept. Alternatively, a taxon concept can be defined by describing the characters that define the concept (Kennedy et al., 2005, p. 84).
**Note:** A reference defining a taxon concept is not necessarily synonymous with the reference which published the name (although it could be if the intended concept was also expressed by the author when the name was published). Thus a text representation identifying a taxon concept could be:
genus + specificEpithet + scientificNameAuthorship + "_sec._" (or "_sensu_") + conceptAuthor + publicationDate
for example:
_Aus bus_ L. 1758 _sec._ Archer 1965
where the species name was published by Linnaeus in 1758 and the taxon concept describing _Aus bus_ was published by Archer in 1965. The Darwin Core term nameAccordingTo is for the concept publication, whereas the namePublishedIn term is for the name publication.
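The text pattern above can be sketched in a few lines of Python. This is only an illustration: the function name `concept_label` and its parameter names are hypothetical and are not Darwin Core or TCS terms, although they are modelled on the components listed in the pattern.

```python
def concept_label(genus, specific_epithet, authorship,
                  concept_author, concept_year, qualifier="sec."):
    """Assemble a plain-text label for a taxon concept.

    The first three arguments describe the name (cf. dwc:genus,
    dwc:specificEpithet, dwc:scientificNameAuthorship); the next two
    describe the publication that defines the concept.
    """
    # Only the two qualifiers mentioned in the text are accepted.
    if qualifier not in ("sec.", "sensu"):
        raise ValueError("qualifier must be 'sec.' or 'sensu'")
    return (f"{genus} {specific_epithet} {authorship} "
            f"{qualifier} {concept_author} {concept_year}")


label = concept_label("Aus", "bus", "L. 1758", "Archer", 1965)
print(label)  # Aus bus L. 1758 sec. Archer 1965
```

Italic markup for the name and the qualifier is deliberately left out, since that is a presentation concern rather than part of the identifying text.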
**Note:** If no nameAccordingTo or nameAccordingToID is given explicitly for a Taxon record, the nominal concept as defined by TCS should be assumed, so the Taxon could be a taxon concept **OR** a nominal concept. This goes against the increased clarity we hope for from the Semantic Web. One possible solution would be a literal value for nameAccordingTo which stated "nominal concept" or some other controlled value. This should be the subject of further discussion.
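A minimal sketch of the rule in the note, assuming a Taxon record is held as a Python dictionary keyed by Darwin Core term names. The function `assumed_reference` and the constant `NOMINAL_LITERAL` are hypothetical, and the literal "nominal concept" is only the value proposed above, not a ratified controlled value.

```python
# Proposed (not ratified) controlled value for nameAccordingTo.
NOMINAL_LITERAL = "nominal concept"


def assumed_reference(record):
    """Say how a Taxon record should be read under the rule in the note."""
    according_to = (record.get("nameAccordingTo") or "").strip()
    according_to_id = (record.get("nameAccordingToID") or "").strip()
    # An identifier for the "according to" reference counts as explicit.
    if according_to_id:
        return "TNU or taxon concept"
    # Nothing given, or the proposed literal given: assume a nominal concept.
    if not according_to or according_to.lower() == NOMINAL_LITERAL:
        return "nominal concept"
    return "TNU or taxon concept"


print(assumed_reference({"scientificName": "Aus bus L. 1758"}))
print(assumed_reference({"scientificName": "Aus bus L. 1758",
                         "nameAccordingTo": "Archer 1965"}))
```

The sketch cannot tell a deliberately nominal record from one whose provider simply omitted the reference, which is exactly the ambiguity the note describes.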
1 http://lists.tdwg.org/pipermail/tdwg-content/2010-October/001703.html
2 http://lists.tdwg.org/pipermail/tdwg-content/2010-October/001814.html
3 http://lists.tdwg.org/pipermail/tdwg-content/2009-November/000268.html
4 http://lists.tdwg.org/pipermail/tdwg-content/2010-November/001831.html
5 http://lists.tdwg.org/pipermail/tdwg-content/2010-November/001820.html
6 http://lists.tdwg.org/pipermail/tdwg-content/2010-November/001835.html
7 http://lists.tdwg.org/pipermail/tdwg-content/2010-November/001840.html
Note: there are numerous threads on the tdwg-content mailing list about the subject of taxa and taxon concepts. I do not have the stamina to summarize them all here, but some entry points to several threads are:
http://lists.tdwg.org/pipermail/tdwg-content/2010-June/000188.html
http://lists.tdwg.org/pipermail/tdwg-content/2010-October/001585.html
http://lists.tdwg.org/pipermail/tdwg-content/2010-November/001819.html
## Equivalence of "Taxon" and "TaxonConcept" in the TDWG Ontology and the Darwin Core standard ##
With the Taxon class we followed our usual practice of importing a class from Darwin Core (DwC) to serve as the corresponding darwin-sw class. Unfortunately, there is some lack of clarity about what exactly constitutes an instance of Taxon in the DwC standard. The terms listed under Taxon in DwC are a mixture of terms which could apply to both taxa and names. The meaning of Taxon is clearer in the TDWG Ontology (see below). In particular, Taxon and TaxonConcept are defined to be equivalent classes according to their definitions under the URI http://rs.tdwg.org/ontology/voc/TaxonConcept.owl , the source of which can be viewed at http://code.google.com/p/tdwg-ontology/source/browse/trunk/ontology/voc/TaxonConcept.owl . The Taxon component of the TDWG Ontology seems to be the part of the Ontology that is best supported by other standards (e.g. TCS; see below), the part to which the most effort seems to have been applied, and the only part that has (to our knowledge) actually been implemented by anyone (see examples below).
We have taken the position that the less well-defined Darwin Core Taxon class is equivalent to Taxon and TaxonConcept in the TDWG Ontology. We justify this action on the following basis. Some of the architects of the DwC standard have noted that the term dwc:taxonID is intended to represent a "nameUsageID" [1], and thus instances of TNUs could be described using the vocabulary of the dwc:Taxon class. Since taxon concepts can be considered a more narrowly defined subset of TNUs, terms in the dwc:Taxon class could by extension apply to them as well. We note that because terms in the Darwin Core standard do not have strictly defined domains, and because the DwC standard does not explicitly define a class for names or sensu/secundum references, some terms listed under the Taxon class may apply specifically to names or sensu/secundum references rather than to TNUs/taxon concepts.
1 http://lists.tdwg.org/pipermail/tdwg-content/2010-October/001814.html
## TCS model ##
The TDWG Taxonomic Concept Transfer Schema (TCS; http://www.tdwg.org/standards/117/) is an XML schema which provides a means for transferring information about the definitions of taxon concepts. It is a ratified TDWG standard. It is **not** a schema for transfer of metadata representing instances of taxon concepts.
* See especially the document: UserGuidev\_1.3.pdf.
* The XML schema is at: TcsSchema.
## The TDWG Ontology ##
Note: the qualified namespace **tc:** is
http://rs.tdwg.org/ontology/voc/TaxonConcept#
which can be viewed at
http://code.google.com/p/tdwg-ontology/source/browse/trunk/ontology/voc/TaxonConcept.owl
The qualified namespace **tn:** is
http://rs.tdwg.org/ontology/voc/TaxonName#
which can be viewed at
http://code.google.com/p/tdwg-ontology/source/browse/trunk/ontology/voc/TaxonName.rdf
The qualified namespace **tcom:** is
http://rs.tdwg.org/ontology/voc/Common#
which can be viewed at
http://code.google.com/p/tdwg-ontology/source/browse/trunk/ontology/voc/Common.rdf
**What is the TDWG Ontology?**
The historical background of the TDWG Ontology can be constructed by reading the TDWG Technical Architecture Group (TAG) "Technical Roadmaps" published from 2006 to 2008.
http://www.tdwg.org/uploads/media/TAG_Roadmap_01.doc (2006)
http://www.tdwg.org/fileadmin/subgroups/tag/TAG_Roadmap_2007_final.pdf (2007)
http://www.tdwg.org/fileadmin/subgroups/tag/TAG_Roadmap_2008.pdf (2008)
They do not seem to have been published since then. The TDWG Ontology was intended to be a technology-independent mechanism for typing objects in the biodiversity domain. It was written in Web Ontology Language (OWL), in the form of RDF. The TDWG Ontology describes many kinds of resources and was intended at the time to facilitate LSID resolution. The structure of the Ontology is significantly more complex than Darwin Core and includes types of resources beyond those represented in the DwC classes. Development of the Ontology seems to have been abandoned in about 2008, at around the same time that efforts to adopt LSIDs also stalled. Lack of sustained support and funding seems to have been a factor in its abandonment (http://www.hyam.net/blog/archives/643#more-643).
Although the TDWG Ontology is the only vocabulary that is specifically mentioned in the TDWG GUID Applicability Statement (now a TDWG standard) as an appropriate typing mechanism (http://www.tdwg.org/standards/150 Recommendation 11), it has not been widely used for typing in RDF. Cases where it is actually being referenced in RDF seem to be descriptions of taxa (i.e. taxon concepts), taxon names, and "according to" references, with the latter two being components of taxa. To some extent, the structure of the TDWG Ontology follows the TCS model. However, since the TCS model is an XML schema and not defined in RDF, the TDWG Ontology makes reference to it through the property tcom:tcsEquivalence, whose object is a string literal that references part of TCS. RDF examples of taxa described by the TDWG Ontology:
http://biodiversity.org.au/apni.taxon/118883.rdf
provided by Paul Murray in http://lists.tdwg.org/pipermail/tdwg-content/2011-January/002204.html
http://taxon.luomus.fi/rdf.php?lsid=urn:lsid:luomus.fi:taxonconcept:95a2abc0-5e47-4f4d-b161-4880ae982dcf:1
provided by Hannu Saarenmaa (personal communication). For comparison purposes, see also:
http://lod.taxonconcept.org/ses/v6n7p
provided by Pete DeVries. Note: rdf:type and other properties are not defined in terms of the TDWG Ontology. Also, I can't find any "_sec._" (accordingTo) type properties. See http://lists.tdwg.org/pipermail/tdwg-content/2011-May/002405.html , http://lists.tdwg.org/pipermail/tdwg-content/2011-May/002390.html and subsequent posts for an explanation by Pete of what he intends txn:SpeciesConcept to mean.
## Strategy for darwin-sw suggested by the TCS model ##
The darwin-sw ontology does not define any properties for the Taxon class except for dsw:taxonOfId which links a Taxon to an Identification which refers to it. This was a conscious decision which recognizes:
1. that the realm of taxon, name, and sensu is a complex area that probably requires additional discussion and consensus for full semantic web implementation
1. that a significant amount of work has already been done to describe the ontological relationships among resources in that realm
1. and that we (Steve Baskauf and Cam Webb) aren't experts in the area of taxonomy informatics.
Nevertheless, the Taxon realm is an important one which must be described in order for darwin-sw to have any usefulness. The following is a suggestion of how existing ideas about representing taxa, names, and sensu resources could be utilized as a part of darwin-sw.
The description of a TaxonConcept on p. 89 of Kennedy et al., 2005 provides ideas on how the Taxon class could be represented in darwin-sw. In each of the two required components of a taxon concept, Name and AccordingTo, the TCS model contains a simple string representation ("NameSimple" and "AccordingToSimple") as well as an optional NameDetailed and AccordingToDetailed which are references to other resources that describe the Name and AccordingTo resources more thoroughly. In our model, the "simple" representations could be made using string literal properties that are already terms in Darwin Core (in keeping with the general strategy outlined for darwin-sw). There are also specific predicates within tc: for string representations of the "simple" properties: tc:nameString and tc:accordingToString although these don't have the capability for "atomizing" the parts (vs. Darwin Core which could express the name as dwc:genus, dwc:specificEpithet, and dwc:scientificNameAuthorship).
The "detailed" representations can be URI references to resources that are not defined by darwin-sw, but rather by the functioning parts of the TDWG ontology (i.e. the tc: namespace). In particular, because darwin-sw declares the dwc:Taxon class to be equivalent to tc:Taxon (which is itself equivalent to tc:TaxonConcept), the tc:hasName and tc:accordingTo properties of the tc:TaxonConcept in the TDWG ontology are therefore properties of the darwin-sw Taxon class. This action gets darwin-sw out of the business of trying to define relationships that have already been defined elsewhere and are functional. It would also avoid embroiling our effort to represent Darwin Core terms in RDF in the endless debate about the minutiae of things taxonomic. Essentially, darwin-sw is "connected" to the taxonomy components of the TDWG Ontology.
Although the design of darwin-sw is intended to encourage and facilitate the use of object properties and URIs to refer to resources (as opposed to string literal references) which might be widely used by the Linked Data community, as a practical matter, consensus resolvable URIs for taxa, names, and sensu/secundum references are not currently (as of Jan 2011) available. [1, 2] Hopefully, consensus URIs for taxa will be available when the Global Name Usage Bank (GNUB) [3] becomes functional. The only large, currently functional [4] source of resolvable URIs for both plants and animals that produces RDF is uBio [5], which uses HTTP proxied LSIDs. These URIs could be used as the objects of the tc:hasName property of a Taxon instance. Hopefully the Global Names Architecture (GNA) [6] will eventually provide resolvable URIs. Currently uBio is the largest source of names to the GNA [7], so all names currently assigned identifiers by uBio should eventually turn up in the GNA.
A system for providing sensu/secundum reference URIs is not well developed. In some cases, literature references have been assigned DOIs, which can be used in LOD as HTTP proxied GUIDs. [8] However, much of the older literature does not have assigned DOIs, and the URIs needed to refer to sections of articles often do not exist. [9] Older literature is being databased by the Biodiversity Heritage Library (BHL) [10], which provides stable URIs [11] (although these apparently cannot resolve to RDF representations), but much literature is still not scanned and databased. So for the immediate future, many sensu/sec references will have to be given only in text form, as literal objects of tc:accordingToString.
1 http://lists.tdwg.org/pipermail/tdwg-content/2011-January/002202.html and subsequent posts in that thread
2 http://lists.tdwg.org/pipermail/tdwg-content/2011-January/002231.html and subsequent posts in that thread
3 http://www.globalnames.org/GNUB
4 http://lists.tdwg.org/pipermail/tdwg-content/2011-January/002231.html
5 http://www.ubio.org/
6 http://www.globalnames.org/
7 http://gni.globalnames.org/data_sources
8 http://www.crossref.org/CrossTech/2011/04/content_negotiation_for_crossr.html See also: http://odontomachus.wordpress.com/2011/05/04/crossrefs-gift-of-metadata/ for commentary on this.
9 http://lists.tdwg.org/pipermail/tdwg-content/2011-January/002213.html
10 http://lists.tdwg.org/pipermail/tdwg-content/2011-January/002206.html
11 http://lists.tdwg.org/pipermail/tdwg-content/2011-January/002238.html
## Properties suitable for use in DSW (not a complete list) ##
### Object properties defined in the darwin-sw ontology ###
* **dsw:taxonOfId** range Identification
### Data properties defined outside the darwin-sw ontology ###
* **tc:nameString** string literal of scientific name plus author (see http://code.google.com/p/tdwg-ontology/source/browse/trunk/ontology/voc/TaxonConcept.owl)
* **tc:accordingToString** string literal describing the publication that makes the assertion (see http://code.google.com/p/tdwg-ontology/source/browse/trunk/ontology/voc/TaxonConcept.owl)
### Object properties defined outside the darwin-sw ontology ###
* **tc:hasName** range tn:TaxonName of the TDWG ontology (but no formal range declared, see http://code.google.com/p/tdwg-ontology/source/browse/trunk/ontology/voc/TaxonConcept.owl)
* **tc:accordingTo** range a reference to the publication that makes the assertion (but no formal range declared, see http://code.google.com/p/tdwg-ontology/source/browse/trunk/ontology/voc/TaxonConcept.owl).
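
As an illustration of how the properties listed above might be combined, the following Python sketch writes a Taxon description as N-Triples lines. The string properties are always emitted, and the object properties are added only when a URI is available, following the "simple" versus "detailed" strategy described earlier. The function `describe_taxon` and all `http://example.org/` URIs are hypothetical placeholders, not real identifiers. The tc: namespace is the one given on this page; the dwc: and rdf: namespaces are the standard ones.

```python
TC = "http://rs.tdwg.org/ontology/voc/TaxonConcept#"
DWC = "http://rs.tdwg.org/dwc/terms/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def literal(text):
    """Quote a plain string as an N-Triples literal.

    Only backslashes and double quotes are escaped; control characters
    such as newlines are not handled in this sketch.
    """
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def uri(value):
    """Wrap an absolute URI in angle brackets."""
    return f"<{value}>"


def describe_taxon(taxon_uri, name_string, according_to_string,
                   name_uri=None, according_to_uri=None):
    """Return N-Triples lines describing one Taxon instance."""
    subject = uri(taxon_uri)
    triples = [
        (subject, uri(RDF_TYPE), uri(DWC + "Taxon")),
        # "Simple" string representations are always present.
        (subject, uri(TC + "nameString"), literal(name_string)),
        (subject, uri(TC + "accordingToString"), literal(according_to_string)),
    ]
    # "Detailed" representations are links, added only when a URI exists.
    if name_uri is not None:
        triples.append((subject, uri(TC + "hasName"), uri(name_uri)))
    if according_to_uri is not None:
        triples.append((subject, uri(TC + "accordingTo"), uri(according_to_uri)))
    return [f"{s} {p} {o} ." for s, p, o in triples]


lines = describe_taxon(
    "http://example.org/taxon/aus-bus-sec-archer-1965",  # hypothetical
    "Aus bus L. 1758",
    "Archer 1965",
    name_uri="http://example.org/name/aus-bus",  # hypothetical
)
print("\n".join(lines))
```

In this usage no URI is supplied for the sensu/secundum reference, so the reference appears only as the tc:accordingToString literal, which is the fallback the text anticipates for much of the older literature.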
## General References: ##
Note: this is by no means a comprehensive list. We will add to it as we discover or remember additional references.
Franz, N.M., R.K. Peet and A.S. Weakley. 2008. On the use of taxonomic concepts in support of biodiversity research and taxonomy; pp. 63-86. In Wheeler, Q.D. (Ed.): The New Taxonomy, Systematics Association Special Volume Series 74. Taylor & Francis, Boca Raton, FL http://www.bio.unc.edu/faculty/peet/pubs/cardiff.pdf doi:10.1.1.103.7001 (maybe)
Franz, N.M. and D. Thau. 2010. Biological taxonomy and ontological development: scope and limitations. Biodiversity Informatics 7:45-66. https://journals.ku.edu/index.php/jbi/article/view/3927
Franz, N.M. and J. Cardona-Duque. 2013. Description of two new species and phylogenetic reassessment of Perelleschus O'Brien & Wibmer, 1986 (Coleoptera: Curculionidae), with a complete taxonomic concept history of Perelleschus sec. Franz & Cardona-Duque, 2013. Systematics and Biodiversity 11(2):209-236. http://dx.doi.org/10.1080/14772000.2013.806371
Hyam, R. and J. Kennedy. 2006. Taxon Concept Schema - User Guide. http://www.tdwg.org/standards/117/
Kennedy, J.B., R. Kukla, and T. Paterson. 2005. Scientific names are ambiguous as identifiers for biological taxa: their context and definition are required for accurate data integration. Data Integration in the Life Sciences 2005, LNBI 3615, 80-95. http://www.springerlink.com/content/7bv5pa3falxwrrvx/
Page, R.D.M. 2006. Taxonomic names, metadata, and the semantic web. Biodiversity Informatics 3:1-15. https://journals.ku.edu/index.php/jbi/article/view/25
Pyle, R.L. 2004. Taxonomer: a relational data model for managing information relevant to taxonomic research. PhyloInformatics 1:1-54. http://systbio.org/files/phyloinformatics/1.pdf