Mapping to BioLink Model

# Mapping between BioKno and BioLink
This is a first mapping between the two models. 

## Background, Knetminer

* [Knetminer][4]
* Data infrastructure, [paper][6], [presentations][8], [article][9]
* [Data and DFW][2]

[2]: https://designingfuturewheat.org.uk/dfw-and-fair-agriculture-data-the-knetminer-experience/
[4]: https://www.biorxiv.org/content/10.1101/2020.04.02.017004v2
[6]: https://europepmc.org/article/med/30085931
[8]: https://www.slideshare.net/mbrandizi
[9]: https://knetminer.com/cases/the-power-of-standardised-and-fair-knowledge-graphs.html

## Background, BioLink Model
* [The web site](https://biolink.github.io/biolink-model/)

## General questions

### Common attributes
I've seen the `biolink:Attribute` class and its subclasses. We have a number of common data-value properties in Knetminer (see below), which we use to attach plain data values (numbers, strings) to nodes and relations, for instance: name, description, p-value, score, provenance, evidence. I couldn't find equivalents of these in BioLink. Should I look at some other part of the model?

### Qualifiers and publications attached to Association
How is this managed in models like property graphs and Neo4j? Normally, in these models a relation has two endpoints, a predicate identifier and multiple attributes like strings or numbers, but you can't point at further nodes from a relation; the only way would be URI attributes or similar. Is it managed this way? Do we have many datasets using these things?
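To make the question concrete, here is a minimal Python sketch of the "URI attributes" workaround as I understand it. The `Edge` class, the `referenced_nodes` helper, the property names `publications` and `qualifiers`, and all `ex:` identifiers are hypothetical, made up for illustration; this is not taken from BioLink or Neo4j documentation.

```python
from dataclasses import dataclass, field


@dataclass
class Edge:
    """A property-graph relation: two endpoints, a predicate, plain-valued attributes."""
    subject: str
    predicate: str
    object: str
    properties: dict = field(default_factory=dict)


# Hypothetical identifiers, for illustration only.
edge = Edge(
    subject="ex:gene1",
    predicate="biolink:related_to",
    object="ex:disease1",
    properties={
        # The relation cannot point at further nodes, so references to them
        # are stored as plain identifier strings (an assumption of this sketch).
        "publications": ["ex:pub1", "ex:pub2"],
        "qualifiers": ["ex:qualifier1"],
    },
)


def referenced_nodes(e: Edge, nodes: dict) -> list:
    """Resolve the identifier-valued attributes of an edge against a node index."""
    ids = e.properties.get("publications", []) + e.properties.get("qualifiers", [])
    # Identifiers with no matching node are skipped silently.
    return [nodes[i] for i in ids if i in nodes]
```

With this layout, following a publication from a relation needs an extra lookup by identifier, rather than a graph traversal.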

### Subclasses of Association
For example, let's take GeneToDiseaseAssociation. When generating data in BioLink format, should I always use this specific association? Or is it there for possible inference?

For instance, say I have `genep53 encodes p53`. Would it be fine to say this is an instance of Association and then, possibly, some reasoner can entail that it's also a GeneToDiseaseAssociation (from the type of genep53)? Or should I detect what genep53 is and use the appropriate Association subtype?

The former case is easier to implement. The latter is more complicated, because Knetminer data aren't always clean enough to make entailments like the above from the node type (i.e., in some datasets, some relations falling under things like GeneRegulatoryRelationship haven't been defined using very standard vocabularies, so it's hard to recognise them).
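A minimal sketch of the second option, with a fallback to the first one when the node types are not recognised. The lookup table and the function name `association_type` are hypothetical; only `GeneToDiseaseAssociation` is taken from the text above, and a real table would need one entry per Association subclass.

```python
from typing import Optional

# Hypothetical lookup: (subject type, object type) -> Association subclass.
SPECIFIC_ASSOCIATIONS = {
    ("biolink:Gene", "biolink:Disease"): "biolink:GeneToDiseaseAssociation",
}


def association_type(subject_type: Optional[str], object_type: Optional[str]) -> str:
    """Pick the specific subclass when both types are known, else the generic one."""
    return SPECIFIC_ASSOCIATIONS.get(
        (subject_type, object_type), "biolink:Association"
    )
```

For example, `association_type("biolink:Gene", "biolink:Disease")` returns the specific subclass, while `association_type(None, "biolink:Disease")` falls back to `biolink:Association`, which is what would happen for the datasets where the types can't be recognised.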

## Degree of formality for certain relations
For instance, `biolink:has_participant` has occurrent as its domain. In formal OWL ontologies, this entails that, for example, the particular P53 coming from a particular sample has participated in a particular reaction. But do you really model data this way? Knetminer has a has_participant property, but it links things like apoptosis, intended as the description of a process that can happen at some point, and the concept of the protein named P53, which could have zillions of instances. In other words, we use this property to link the corresponding continuants, not their specific instances.


## Mappings, classes
* bk:Disease = biolink:Disease
* bk:Molecule < biolink:MolecularEntity
* bk:Compound = biolink:ChemicalSubstance
* bk:Drug = biolink:Drug
* bk:Metabolite = biolink:Metabolite
* bk:MoleculeComplex = biolink:MacromolecularComplex ?
* bk:Protein = biolink:Protein
* bk:Enzyme ? (biolink:Protein + biolink:qualifiers)
* bk:TF ? (biolink:Protein + biolink:qualifiers)
* bk:Process ? intended as reaction, transport, and other BioPax processes
* bk:Reaction
* bk:Transport
* bk:Experiment ? biolink:AdministrativeEntity + biolink:qualifiers
* bk:Tissue ? biolink:BiologicalEntity + biolink:qualifiers
* bk:OntologyTerms = biolink:OntologyClass TODO: subclasses
* bk:Path = biolink:Pathway ? (TODO: subclasses)
* bk:Publication = biolink:Publication
* bk:Gene = biolink:Gene
* bk:Treatment > biolink:Treatment (our treatment is general, not just exposure to substance)
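To check the list above mechanically, the notation can be read into records. This is a minimal sketch under my own reading of the symbols: `=` means equivalent, `<` means the bk term is more specific, `>` means the bk term is more general, and a `?` after the target marks an uncertain mapping. The regular expression and the function name `parse_mapping` are hypothetical. Lines without one of the three operators (e.g., `bk:Reaction`, or the `bk:Enzyme ? (...)` form) are treated as unmapped.

```python
import re

# One list item: "* bk:Name <op> biolink:Name <optional trailing notes>".
MAPPING_RE = re.compile(r"^\*\s+(bk:\w+)\s+([=<>])\s+(biolink:\w+)(.*)$")


def parse_mapping(line: str):
    """Return a mapping record, or None when the line has no =, < or > mapping."""
    m = MAPPING_RE.match(line.strip())
    if m is None:
        return None
    source, relation, target, rest = m.groups()
    return {
        "source": source,
        "relation": relation,
        "target": target,
        # A "?" in the trailing notes flags the mapping as uncertain.
        "uncertain": "?" in rest,
    }
```

Running this over the list gives a quick count of which bk classes still lack a definite BioLink counterpart.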

## Mapping, relations (ie, object properties)
* bk:enc < biolink:has_gene_product (encodes: links a gene to the protein it expresses, or to other molecular entities, e.g., ncRNA; probably better to add a biolink:qualifiers)
* bk:en_by < biolink:produced_by (encoded by)
* bk:attributeUnit < biolink:QuantityValue (TODO: is it fine to entail an attribute is a QuantityValue too?)
* bk:asso_wi = biolink:related_to (this is associated_with)
* bk:cooc_wi < biolink:coexists_with (this is `co-occurs with` and is often used to match entities that co-occur in the same publications)
* bk:produces = biolink:produces
* bk:produced_by = biolink:produced_by
* bk:has_participant > biolink:has_participant (> because in our case the domain is more generic than occurrent)
* bk:participates_in > biolink:participates_in (same as above)
* bk:occ_in = biolink:occurs_in
* bk:has_part = biolink:has_part
* bk:part_of = biolink:part_of
* bk:publication_features < biolink:related_to (this is subproperty of dc:subject, schema:about, probably needs a new property in biolink, or to use biolink:related_to + biolink:qualifiers)
* bk:related biolink:related_to
* bk:cs_by < biolink:contributes_to (this is "consumed by"; should be a subproperty of the inverse of biolink:has_input)
* bk:consumed_by < biolink:contributes_to
* bk:consumes < biolink:has_input

## Mapping, attributes (ie, data properties)
* TODO: relevant ones are: title, description, comment, creation date, p-value, score, evidence (including evidence code), provenance. For most of them, I cannot find equivalents in biolink; we need real datasets and guidance from them.


## TODO
So far, I've gone in one direction only (bk->biolink). We need to check the other direction, to see if there are biolink entities that should be mapped in bk with additions.

