ClassesAndTypes
In the Darwin Core standard as ratified in 2009 (see http://rs.tdwg.org/dwc/terms/ which also represents the namespace abbreviated here as dwc:), classes are simply a means to group similar terms. The class definitions in RDF do not describe any formal relationship to other terms in the standard.1 Rather, the DwC classes suggest the kinds of things that might be described by the terms that are grouped within the class.2
DwC has a type vocabulary (http://rs.tdwg.org/dwc/terms/type-vocabulary/index.htm) which provides terms that can be used to describe the nature of resources used in the biodiversity informatics community. These types can be represented by URIs in the namespace http://rs.tdwg.org/dwc/dwctype/, which may be abbreviated as the qualified namespace dwctype:. An exception is http://purl.org/dc/dcmitype/Event (dctype:Event), which is imported from the DCMI type vocabulary. To some extent, the DwC types reflect the DwC classes. Most (but not all) DwC classes have a corresponding DwC type. However, there are additional DwC types that describe resources that are not represented by classes in DwC, e.g. PreservedSpecimen. Another important distinction between the DwC classes and types is that the RDF descriptions3 of some DwC types express formal relationships with other DwC types. In particular, PreservedSpecimen is declared to be a subclass of Occurrence, and Occurrence is declared to be a subclass of Event. Thus use of the DwC type URIs in RDF may carry semantic meaning that the user may not intend. (Decision 2011-10-16_6 at http://rs.tdwg.org/dwc/terms/history/decisions/index.htm removed all subclass declarations from the DwC type vocabulary.)
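The effect of such subclass declarations can be sketched in a few lines of Python. This is only an illustration, not a real reasoner: triples are plain tuples of prefixed names, the resource http://my.org/specimen1 is hypothetical, and the two subclass statements are the ones described above (before their removal by the decision cited). The function applies the RDFS subclass rule repeatedly until nothing new is inferred.

```python
RDF_TYPE = "rdf:type"
SUBCLASS_OF = "rdfs:subClassOf"

triples = {
    # Subclass declarations as described in the text for the DwC type vocabulary.
    ("dwctype:PreservedSpecimen", SUBCLASS_OF, "dwctype:Occurrence"),
    ("dwctype:Occurrence", SUBCLASS_OF, "dctype:Event"),
    # Hypothetical resource that a provider types only as a PreservedSpecimen.
    ("http://my.org/specimen1", RDF_TYPE, "dwctype:PreservedSpecimen"),
}


def infer_types(triples):
    """Add rdf:type triples implied by rdfs:subClassOf until a fixed point is reached."""
    inferred = set(triples)
    changed = True
    while changed:
        changed = False
        for s, p, o in list(inferred):
            if p != RDF_TYPE:
                continue
            for c, p2, d in list(inferred):
                # If s has type c and c is a subclass of d, then s also has type d.
                if p2 == SUBCLASS_OF and c == o and (s, RDF_TYPE, d) not in inferred:
                    inferred.add((s, RDF_TYPE, d))
                    changed = True
    return inferred


types = sorted(
    o for s, p, o in infer_types(triples)
    if s == "http://my.org/specimen1" and p == RDF_TYPE
)
print(types)
```

Under these assumptions the specimen ends up typed as an Occurrence and as an Event as well, although the provider stated neither. This is the kind of unintended meaning the paragraph above warns about.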
The DwC term dwc:basisOfRecord has a special purpose in the standard. It is defined as "The specific nature of the data record - a subtype of the dcterms:type. Recommended best practice is to use a controlled vocabulary such as the Darwin Core Type Vocabulary (http://rs.tdwg.org/dwc/terms/type-vocabulary/index.htm)." 4 5 Despite this definition, basisOfRecord does not have a formal relationship with dcterms:type.6 Because the DwC standard is not tied to any particular representation, values of basisOfRecord are generally text strings, and in that form they are not linked semantically to the DwC type vocabulary. So although it is recommended to use the DwC type vocabulary as the controlled vocabulary of basisOfRecord, there is no formal requirement that it be used, nor does use of the string values necessarily imply the subclassing relationships expressed in the RDF definition of the DwC types.
Classes are also defined in the TDWG Ontology (http://code.google.com/p/tdwg-ontology/). There is a great deal of similarity between these classes and the DwC classes. However, there is no formal relationship between DwC and the TDWG Ontology, so it cannot be assumed that a DwC class of a certain name corresponds to a class of the same name in the TDWG Ontology. The TDWG Ontology has the additional problems that it has never been completed and that the formally defined relationships among its classes do not necessarily represent the consensus view of the community of DwC users.
1 http://lists.tdwg.org/pipermail/tdwg-content/2010-January/000225.html
2 http://rs.tdwg.org/dwc/terms/index.htm
3 http://code.google.com/p/darwincore/source/browse/trunk/rdf/dwctype.rdf
4 http://rs.tdwg.org/dwc/terms/index.htm#basisOfRecord
5 http://lists.tdwg.org/pipermail/tdwg-content/2010-November/001823.html
6 http://lists.tdwg.org/pipermail/tdwg-content/2009-October/000286.html
In RDF Schema (RDFS, namespace http://www.w3.org/2000/01/rdf-schema#, which is abbreviated rdfs:), there is a direct, formal relationship between class membership and type declaration.2 If a resource is declared to have rdf:type C, then that resource is an instance of class C.3 The reverse is also true: if a resource is described as an instance of class C, then it has type C even if there is no explicit type declaration. Consider the following examples, where dctype: abbreviates http://purl.org/dc/dcmitype/
Example 1
<rdf:Description rdf:about="http://my.org/Thing">
  <rdf:type rdf:resource="http://purl.org/dc/dcmitype/StillImage" />
  ... other properties of Thing...
</rdf:Description>
Example 2
<dctype:StillImage rdf:about="http://my.org/Thing">
  ...properties of Thing ...
</dctype:StillImage>
In Example 1, Thing is explicitly described as being of type dctype:StillImage. In Example 2, the name of the XML container element (a "typed node element"4) characterizes Thing as an instance of the class dctype:StillImage. Despite the difference in syntax, both of these examples make exactly the same assertions about Thing: Thing is an instance of the class dctype:StillImage and it has the type dctype:StillImage.
In RDF, there is no restriction that a resource must be a member of only a single class. Thus in the following example
Example 3
<dctype:StillImage rdf:about="http://my.org/Thing">
  <rdf:type rdf:resource="http://xmlns.com/foaf/0.1/Image" />
  ...other properties of Thing ...
</dctype:StillImage>
(where the namespace http://xmlns.com/foaf/0.1/ is abbreviated foaf:) Thing is an instance of both the foaf:Image and dctype:StillImage classes and has two declared types, foaf:Image and dctype:StillImage.
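The equivalence of the two syntaxes can be checked with a rough Python sketch that extracts only the rdf:type statements from a fragment of RDF/XML. It is deliberately minimal and is not a conforming RDF/XML parser: it uses the standard library XML parser, handles only top-level node elements, and the function name type_triples and the WRAPPER string are introduced here for illustration. The fragments are wrapped in an rdf:RDF element with namespace declarations and use the qualified attribute names rdf:about and rdf:resource so that the XML parser accepts them.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"


def type_triples(rdf_xml):
    """Return the rdf:type triples asserted by the top-level node elements.

    Simplified sketch: only rdf:about subjects, typed node elements and
    rdf:type property elements carrying rdf:resource are handled.
    """
    triples = set()
    root = ET.fromstring(rdf_xml)
    for node in root:
        subject = node.get(f"{{{RDF}}}about")
        # Typed node element: the element name itself names the class.
        if node.tag != f"{{{RDF}}}Description":
            namespace, local = node.tag[1:].split("}")
            triples.add((subject, RDF + "type", namespace + local))
        # Explicit rdf:type property elements.
        for child in node.findall(f"{{{RDF}}}type"):
            triples.add((subject, RDF + "type", child.get(f"{{{RDF}}}resource")))
    return triples


WRAPPER = ('<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" '
           'xmlns:dctype="http://purl.org/dc/dcmitype/">{}</rdf:RDF>')

example_1 = WRAPPER.format(
    '<rdf:Description rdf:about="http://my.org/Thing">'
    '<rdf:type rdf:resource="http://purl.org/dc/dcmitype/StillImage"/>'
    '</rdf:Description>')
example_2 = WRAPPER.format('<dctype:StillImage rdf:about="http://my.org/Thing"/>')
example_3 = WRAPPER.format(
    '<dctype:StillImage rdf:about="http://my.org/Thing">'
    '<rdf:type rdf:resource="http://xmlns.com/foaf/0.1/Image"/>'
    '</dctype:StillImage>')

print(type_triples(example_1) == type_triples(example_2))
print(len(type_triples(example_3)))
```

With these inputs, Examples 1 and 2 yield the same single rdf:type statement, while Example 3 yields two, one from the element name and one from the explicit rdf:type property.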
1 For more discussion on this topic, see http://lists.tdwg.org/pipermail/tdwg-content/2010-November/001945.html
2 http://www.w3.org/TR/rdf-schema/#ch_type
3 http://lists.tdwg.org/pipermail/tdwg-content/2010-October/001817.html
4 http://www.w3.org/TR/2004/REC-rdf-primer-20040210/#example13
In RDFS, the assignment of domains and ranges to a property has an effect which may not be intuitive.1 2 It might seem logical that if a property P is declared to have rdfs:domain C, then P could only be applied to resources that are instances of class C. This is NOT the case. Rather, if P has rdfs:domain C, then applying P to a subject resource asserts that the subject resource is an instance of C.3
For example, the property foaf:depicts has domain foaf:Image 4. Thus if you make the statement
http://my.org/Thing foaf:depicts http://my.org/fred_smith
then in addition to asserting that Thing is a depiction of Fred Smith, you are also declaring Thing to be a foaf:Image, even if you make no explicit type declaration in your RDF. For example
Example 4
<dctype:StillImage rdf:about="http://my.org/Thing">
  <foaf:depicts rdf:resource="http://my.org/fred_smith" />
  ... other properties of Thing...
</dctype:StillImage>
declares Thing to be of both type dctype:StillImage and foaf:Image just as was the case in Example 3.
The same kind of inference happens with range declarations. The term foaf:depiction (which is the inverse property of foaf:depicts)5 has the rdfs:range of foaf:Image. Thus any object of the property foaf:depiction is implicitly declared to be a foaf:Image. In the example
Example 5
<rdf:Description rdf:about="http://my.org/fred_smith">
  <foaf:depiction rdf:resource="http://my.org/Thing" />
  ... other properties of fred_smith...
</rdf:Description>
not only does the Description say that Fred Smith has the depiction Thing, but the range specification of foaf:depiction also means that Thing is being declared to be a foaf:Image .
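The domain and range inferences in Examples 4 and 5 can be reproduced with a small Python sketch. It is an illustration under simplifying assumptions, not a full RDFS reasoner: triples are tuples of prefixed names, and the DOMAINS and RANGES tables contain only the two FOAF declarations mentioned in the text. The function implied_types is a name made up for this sketch.

```python
RDF_TYPE = "rdf:type"

# Domain and range declarations taken from the FOAF examples in the text.
DOMAINS = {"foaf:depicts": "foaf:Image"}
RANGES = {"foaf:depiction": "foaf:Image"}


def implied_types(triples):
    """Return the rdf:type triples entailed by rdfs:domain and rdfs:range."""
    implied = set()
    for s, p, o in triples:
        if p in DOMAINS:
            # Using p types the SUBJECT as an instance of the domain class.
            implied.add((s, RDF_TYPE, DOMAINS[p]))
        if p in RANGES:
            # Using p types the OBJECT as an instance of the range class.
            implied.add((o, RDF_TYPE, RANGES[p]))
    return implied


# Example 4: Thing is explicitly a dctype:StillImage and depicts Fred Smith.
example_4 = {
    ("http://my.org/Thing", RDF_TYPE, "dctype:StillImage"),
    ("http://my.org/Thing", "foaf:depicts", "http://my.org/fred_smith"),
}
# Example 5: Fred Smith has the depiction Thing.
example_5 = {
    ("http://my.org/fred_smith", "foaf:depiction", "http://my.org/Thing"),
}

print(implied_types(example_4))
print(implied_types(example_5))
```

In both cases the only implied statement is that http://my.org/Thing has type foaf:Image. Nothing is rejected: a domain or range declaration adds type assertions rather than restricting where a property may be used.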
Because the Darwin Core standard was intended for the broadest possible use, a conscious decision was made to avoid assigning domains and ranges to DwC terms.6 This decision provided considerable flexibility in using terms that may actually apply to several DwC classes and prevented problems when it was not entirely clear to which class a term actually belonged.
1 http://lists.tdwg.org/pipermail/tdwg-content/2010-October/001753.html
2 http://lists.tdwg.org/pipermail/tdwg-content/2010-October/001766.html
3 http://www.w3.org/TR/rdf-schema/#ch_domain
4 http://xmlns.com/foaf/spec/#term_depicts
5 http://xmlns.com/foaf/spec/#term_depiction
6 http://lists.tdwg.org/pipermail/tdwg-content/2010-January/000225.html see also http://lists.tdwg.org/pipermail/tdwg-content/2009-June/000440.html http://lists.tdwg.org/pipermail/tdwg-content/2009-July/000429.html http://lists.tdwg.org/pipermail/tdwg-content/2009-July/000428.html http://lists.tdwg.org/pipermail/tdwg-content/2009-August/000388.html and related posts for historical details.
In RDF, declaring the type of a resource is generally considered best practice. Typing is also assumed in our community in the context of the use of GUIDs. The TDWG draft standard GUID applicability statement (http://www.tdwg.org/standards/150/) recommends that "Objects in the biodiversity domain that are identified by a GUID should be typed using the TDWG ontology or other well-known vocabularies..." (recommendation 11). Because the TDWG ontology is for the most part not really functional, the DwC classes are probably the most suitable means for typing resources that are adequately described by DwC. In darwin-sw, we import the DwC classes Event, Occurrence, Identification, and Taxon, as well as the class dcterms:Location, which was imported into DwC itself. We also import foaf:Person and foaf:Document, which are well known outside the biodiversity informatics community. We define the class dsw:IndividualOrganism, whose purpose was discussed extensively (as the proposed class dwc:Individual) on the tdwg-content mailing list during October-November 2010, and the class dsw:Token, whose purpose is described in the darwin-sw wiki. darwin-sw is built upon these classes, and users of the darwin-sw ontology should use these classes to type the resources they describe with it. As noted in the section below on Domains and Ranges in darwin-sw, in many cases membership in these classes will be asserted implicitly if the terms that darwin-sw uses to describe the relationships between resources are applied as properties.
Most of the classes represented in darwin-sw are described as disjoint with other classes represented within darwin-sw. Thus a reasoner would consider it an error if a resource were typed as more than one darwin-sw class. Based on the lengthy discussion on the tdwg-content email list during October-November 2010, there seems to be a consensus on the delineation of most of the DwC classes (i.e. they do not overlap), which has been summarized in the wiki. Therefore, in most cases it is not appropriate to type a resource as more than one darwin-sw class. However, there is nothing that prohibits a resource from being typed as additional classes outside of darwin-sw.
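The kind of check a reasoner performs for disjoint classes can be approximated as follows. This Python sketch rests on assumptions: the DISJOINT set treats four of the imported DwC classes as pairwise disjoint purely for illustration (the ontology itself is the authority on which pairs are actually declared disjoint), and the resource URIs and the function name disjoint_violations are hypothetical.

```python
from itertools import combinations

# ASSUMPTION: pairwise disjointness among these classes, for illustration only.
DARWIN_SW_CLASSES = ["dwc:Event", "dwc:Occurrence", "dwc:Identification", "dwc:Taxon"]
DISJOINT = {frozenset(pair) for pair in combinations(DARWIN_SW_CLASSES, 2)}


def disjoint_violations(types_by_resource):
    """List (resource, class_a, class_b) where a resource has two disjoint types."""
    violations = []
    for resource, types in sorted(types_by_resource.items()):
        for a, b in combinations(sorted(types), 2):
            if frozenset((a, b)) in DISJOINT:
                violations.append((resource, a, b))
    return violations


typed_resources = {
    # Typed as two classes assumed disjoint: flagged.
    "http://my.org/obs1": {"dwc:Occurrence", "dwc:Event"},
    # Typed with an additional class from outside darwin-sw: not flagged.
    "http://my.org/fred_smith": {"foaf:Person", "foaf:Agent"},
}

print(disjoint_violations(typed_resources))
```

Only the first resource is reported. The second carries an extra type from outside darwin-sw, which the text above explicitly allows.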
In early discussions about the creation of darwin-sw, we considered whether it was appropriate to assign domains and ranges to properties intended for use with objects described by the ontology. Since it was assumed that most of the data properties (i.e. those whose values are string literals) would be described with terms from the DwC vocabulary, we did not assign domains and ranges to those terms, although we provide guidance on each class wiki page as to some terms that we feel would be appropriately used to describe instances of that class.
However, the object properties defined in darwin-sw and used to describe the relationship between instances of one darwin-sw class and instances of another HAVE in most cases been assigned domains and ranges. One of the purposes of darwin-sw is to create a common means to describe resources using RDF that reflects the consensus of the DwC/TDWG community. It is therefore not appropriate to apply the object properties defined in darwin-sw in a way that implies some meaning other than the consensus embodied in the ontology. If a user of darwin-sw uses the object properties in a careless way that creates logical inconsistencies, that is a red flag which indicates that the metadata so described should not be aggregated with other metadata that uses darwin-sw consistently. In this sense, darwin-sw is NOT a general-purpose vocabulary in the way that the DwC standard is.
The other rationale for specifying domains and ranges is that it makes it possible to implicitly type resources that are outside of the control of the user by linking to those resources using a darwin-sw object property. For example, if a place has been assigned a URI and described in RDF, a darwin-sw user can assert that the place is a dcterms:Location by linking a dwc:Event described by the user to that place using the object property dsw:locatedAt (which has the range dcterms:Location).