Description
This is a description of an extension we implemented in order to capture geolocation information in Phenopacket Biosamples.
Our use case is non-human datasets, but the extension could just as well be used in human studies where this information is relevant.
Geolocation use case slides to be presented at the next GA4GH Phenopackets WG meeting
Problem statement
With human data, biosamples are typically taken in a clinical context, in a hospital or clinic.
With non-human datasets, on the other hand, the sample collection process involves a lot of legwork.
As an example, take a study where biologists are collecting data on a population of wild animals.
Researchers may collect samples directly from animals, or from what they leave behind (droppings, fur, etc.).
Samples may be collected over very large areas, and over long periods of time.
The location where each sample was collected can be very important to capture, as can the time of collection.
Being able to capture geolocation would also be of great interest for environmental samples such as eDNA.
Currently, Phenopacket does not support the capture of geolocation data in any way.
Proposal
Add a location_collected field to Biosample, analogous to time_of_collection but for the spatial dimension.
- Type: GeoLocation (SchemaBlocks)
  - Allows the capture of a GeoJSON "Point" object for precise lat/long coordinates
  - Allows the capture of human-readable location descriptions
- Multiplicity: 0..1
  - This field should only be used when needed by the study
  - A sample is only collected at one location
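As a rough illustration of the proposal, the sketch below builds a Biosample-like dictionary carrying a location_collected value. It assumes the location is shaped as a GeoJSON Feature whose geometry is a Point and whose properties hold a human-readable label. The helper make_location, the label property name, the sample identifier, and the coordinates are all hypothetical and chosen for illustration only; the actual GeoLocation (SchemaBlocks) definition should be consulted for the exact property names.

```python
import json


def make_location(longitude, latitude, label=None):
    """Build a GeoJSON Feature holding a Point plus an optional label.

    Hypothetical helper: the "label" property name is an assumption,
    not a confirmed part of the GeoLocation schema block.
    """
    properties = {}
    if label is not None:
        properties["label"] = label
    return {
        "type": "Feature",
        "geometry": {
            "type": "Point",
            # GeoJSON (RFC 7946) orders coordinates as [longitude, latitude].
            "coordinates": [longitude, latitude],
        },
        "properties": properties,
    }


# Minimal Biosample-like record; only the fields needed for the example.
biosample = {
    "id": "sample-001",  # hypothetical identifier
    "description": "Fur sample collected along a trail",
    # Proposed field, multiplicity 0..1: omit it when the study does not need it.
    "location_collected": make_location(-73.5673, 45.5017, label="Montreal, Canada"),
}

print(json.dumps(biosample, indent=2))
```

Because the field is optional, existing Biosample records without location_collected would remain unchanged under this sketch.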
We have already implemented this proposed schema extension in our platform, since this is a requirement for environmental and agricultural studies we need to support in the future.
We believe the proposed change is non-breaking and generic enough to benefit others in the Phenopackets community.
The GeoLocation schema block uses the widely adopted GeoJSON standard to represent coordinates, making it interoperable with most geographic information system (GIS) tools.
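Since the coordinates follow GeoJSON, a consumer could sanity-check a Point geometry before handing it to a GIS tool. The function below is a minimal sketch of such a check based on RFC 7946 (a Point has a position of two or three numbers, longitude first, then latitude, with an optional altitude). The function name is hypothetical and this is not part of Phenopackets or SchemaBlocks.

```python
def is_valid_geojson_point(geometry):
    """Return True if `geometry` looks like a valid GeoJSON Point (RFC 7946)."""
    if not isinstance(geometry, dict) or geometry.get("type") != "Point":
        return False
    coords = geometry.get("coordinates")
    # A position is [longitude, latitude] with an optional third element (altitude).
    if not isinstance(coords, (list, tuple)) or len(coords) not in (2, 3):
        return False
    # Reject non-numeric values; bool is excluded because it subclasses int.
    if not all(isinstance(c, (int, float)) and not isinstance(c, bool) for c in coords):
        return False
    longitude, latitude = coords[0], coords[1]
    return -180 <= longitude <= 180 and -90 <= latitude <= 90


print(is_valid_geojson_point({"type": "Point", "coordinates": [-73.5673, 45.5017]}))
print(is_valid_geojson_point({"type": "Point", "coordinates": [-73.5673, 95.0]}))
```

The second call fails the check because a latitude of 95 degrees is out of range, which is the usual symptom of swapped or mistyped coordinates.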
Although our use case is non-human, GeoLocation (SchemaBlocks) seems to have been made with human data in mind as well.