Biosample geolocation data capture proposal #396

@v-rocheleau

This is a description of an extension we implemented in order to capture geolocation information in Phenopacket Biosamples.
Our use case is non-human datasets, but the extension could just as well be used in human studies where this information is relevant.

Geolocation use case slides to be presented at the next GA4GH Phenopackets WG meeting

Problem statement

With human data, biosamples are typically taken in a clinical context, in a hospital or clinic.
With non-human datasets, on the other hand, the sample collection process involves a lot of legwork.

As an example, take a study where biologists are collecting data on a population of wild animals.
Researchers may collect samples directly from animals, or from what they leave behind (droppings, fur, etc.).

Samples may be collected over very large areas, and over long periods of time.
The location of collection for the samples can be very important to capture, as well as the time of collection.
Also, being able to capture geolocation would be of great interest for environmental samples like eDNA.

Currently, the Phenopackets schema does not support the capture of geolocation data in any way.

Proposal

Add a location_collected field to Biosample, analogous to time_of_collection but for the spatial dimension.

  • Type: GeoLocation (SchemaBlocks)
    • Allows the capture of a GeoJSON "Point" object for precise lat/long coordinates
    • Allows the capture of human-readable location descriptions
  • Multiplicity: 0..1
    • This field should only be used when needed by the study
    • A sample is only collected at one location
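
To make the proposal concrete, here is a minimal sketch of what a Biosample carrying location_collected might look like once serialized to JSON. It assumes the GeoLocation block is a GeoJSON Feature with a "Point" geometry and a "label" property for the human-readable description; the helper name make_geolocation, the property name, the sample id, and all values are illustrative assumptions, not part of the actual schema.

```python
import json


def make_geolocation(latitude, longitude, label=None):
    """Build a GeoJSON Feature with a Point geometry and an optional description.

    Hypothetical helper: the "label" property name is an assumption about how
    the human-readable description would be stored.
    """
    if not -90 <= latitude <= 90:
        raise ValueError("latitude must be within [-90, 90]")
    if not -180 <= longitude <= 180:
        raise ValueError("longitude must be within [-180, 180]")

    properties = {}
    if label is not None:
        properties["label"] = label

    return {
        "type": "Feature",
        "geometry": {
            "type": "Point",
            # GeoJSON (RFC 7946) orders coordinates as [longitude, latitude].
            "coordinates": [longitude, latitude],
        },
        "properties": properties,
    }


# Illustrative biosample; the id, timestamp, and coordinates are made up.
biosample = {
    "id": "sample-001",
    "time_of_collection": {"timestamp": "2023-06-15T14:30:00Z"},
    # Multiplicity 0..1: omit the key entirely when the study does not need it.
    "location_collected": make_geolocation(
        48.4284, -71.0537, label="Hypothetical riverbank collection site"
    ),
}

print(json.dumps(biosample, indent=2))
```

Keeping the field optional means existing phenopackets without location_collected remain valid, which is what makes the change non-breaking.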

We have already implemented this proposed schema extension in our platform, since this is a requirement for environmental and agricultural studies we need to support in the future.

We believe the proposed change is non-breaking and generic enough to benefit others in the Phenopackets community.

The GeoLocation schema block uses the widely adopted GeoJSON standard to represent coordinates, making it interoperable with most geographic information system (GIS) tools.
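
As a rough illustration of that interoperability, the sketch below gathers the locations of several biosamples into a GeoJSON FeatureCollection, a format that GIS tools commonly accept as input. It assumes each location_collected value is a GeoJSON Feature; the function name to_feature_collection, the biosample_id property, and the sample data are hypothetical.

```python
import json


def to_feature_collection(biosamples):
    """Collect biosample locations into a single GeoJSON FeatureCollection.

    Biosamples without a location_collected field are skipped, since the
    field is optional (multiplicity 0..1).
    """
    features = []
    for sample in biosamples:
        location = sample.get("location_collected")
        if location is None:
            continue
        # Copy the properties so the input biosample is not modified.
        properties = dict(location.get("properties") or {})
        properties["biosample_id"] = sample["id"]  # assumed property name
        features.append(
            {
                "type": "Feature",
                "geometry": location["geometry"],
                "properties": properties,
            }
        )
    return {"type": "FeatureCollection", "features": features}


# Made-up biosamples for illustration only.
samples = [
    {
        "id": "fur-01",
        "location_collected": {
            "type": "Feature",
            "geometry": {"type": "Point", "coordinates": [-71.05, 48.43]},
            "properties": {"label": "Hypothetical site A"},
        },
    },
    {"id": "blood-02"},  # no location captured for this sample
]

collection = to_feature_collection(samples)
print(json.dumps(collection, indent=2))
```

The resulting JSON could be saved with a .geojson extension and opened in a GIS tool to plot collection sites on a map.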

Although our use case is non-human, GeoLocation (SchemaBlocks) seems to have been made with human data in mind as well.
