Missing OTTs result in missing images and links in the Extinct tree #103

Open
@davidebbo

Description

Here is the state of OTT matching with the current extinct tree:

Over all input files, 73.849% names were precisely matched, 4.358% matched via synonyms, and 21.792% have no match

Note also that synonym matches are not always reliable. For instance, Sphenacodon (a synapsid) is treated as a synonym of Epigonus (a fish), which seems utterly random.

This results in major issues with the tree:

  • Missing images
  • Missing links, or links that go to the wrong place when you click on a leaf

Now, thinking about solving this...

In the full tree, we start with OTT IDs, then use the taxonomy to get the NCBI, GBIF, and IRMNG IDs, then the provider ID CSV to get the EOL IDs, and finally the wiki dump to get QIDs.
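To make the dependency concrete, here is a minimal sketch of that chain. The lookup tables, their keys, and every ID in them are made up for illustration (they are not real OTT, NCBI, EOL, or Wikidata values), and `resolve_ids` is a hypothetical helper, not a function from the OneZoom code base. It also assumes the wiki dump is matched via source IDs. The point is only that every step hangs off the OTT, so a missing OTT leaves everything downstream empty.

```python
from typing import Optional

# Hypothetical stand-ins for the real data sources; all IDs are made up.
OTT_TAXONOMY = {
    # ott -> source IDs listed for that taxon in the taxonomy
    1: {"ncbi": 10, "gbif": 20, "irmng": None},
}
PROVIDER_TO_EOL = {
    # (provider, provider ID) -> EOL ID, as in the provider ID CSV
    ("ncbi", 10): 30,
}
SOURCE_TO_QID = {
    # (provider, provider ID) -> numeric QID, as derived from the wiki dump
    ("gbif", 20): 40,
}


def first_match(sources: dict, table: dict) -> Optional[int]:
    """Return the first table hit for any (provider, ID) pair, else None."""
    for provider, source_id in sources.items():
        if (provider, source_id) in table:
            return table[(provider, source_id)]
    return None


def resolve_ids(ott: Optional[int]) -> dict:
    """Follow OTT -> source IDs -> EOL ID and QID."""
    if ott is None or ott not in OTT_TAXONOMY:
        # No OTT means no source IDs, so nothing downstream can be found.
        return {"ott": ott, "eol": None, "qid": None}
    sources = OTT_TAXONOMY[ott]
    return {
        "ott": ott,
        "eol": first_match(sources, PROVIDER_TO_EOL),
        "qid": first_match(sources, SOURCE_TO_QID),
    }


matched = resolve_ids(1)       # taxon with an OTT: EOL ID and QID resolve
unmatched = resolve_ids(None)  # taxon without an OTT: nothing resolves
```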

But in the Extinct case, we start with Wikipedia and essentially get everything from there. So this complex chain is unnecessary, and it results in bad behavior because of the missing OTTs.

One solution to this is to simply move away from OTTs, and instead use QIDs as our primary identifiers. To do this in a non-disruptive way that doesn't require many core code changes, we can just pretend that QIDs are OTTs. So in the DB, we'd just put QIDs wherever OTTs are used today, without even changing the schema.

In the ordered_leaves table, this would effectively mean that the ott and wikidata columns would have the same value.
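As a rough sketch of what the Extinct TreeBuild step could do, the snippet below builds a leaf row whose `ott` and `wikidata` values are both derived from the QID. The helper names (`qid_to_int`, `make_leaf_row`), the dict-based row, and the example QID are hypothetical, and it assumes both columns hold the numeric part of the QID (the `Q` prefix stripped), which I have not checked against the actual schema.

```python
import re

# A QID is the letter Q followed by a positive integer, e.g. "Q123".
QID_PATTERN = re.compile(r"Q[1-9][0-9]*")


def qid_to_int(qid: str) -> int:
    """Strip the 'Q' prefix and return the numeric part of a QID."""
    if not QID_PATTERN.fullmatch(qid):
        raise ValueError(f"not a valid QID: {qid!r}")
    return int(qid[1:])


def make_leaf_row(name: str, qid: str) -> dict:
    """Build a hypothetical ordered_leaves row that reuses the QID as the OTT."""
    pseudo_ott = qid_to_int(qid)
    return {
        "name": name,
        "ott": pseudo_ott,       # not a real OTT: the QID in disguise
        "wikidata": pseudo_ott,  # same value, as described above
    }


# "Q123456" is a placeholder, not the real QID for this genus.
row = make_leaf_row("Sphenacodon", "Q123456")
```

Because every leaf comes from Wikipedia, every leaf has a QID, so no row would end up with an empty `ott`.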

The reason I think this will work is that OneZoom mostly uses the OTT as a unique and stable ID for taxa, but doesn't really ever do anything that expects it to be a true OTT.

The bottom line is that we can potentially make all this work with zero changes to the core OneZoom code base. We only need to change the Extinct TreeBuild logic so that it creates the database in this way.

I have not tried this, so let's discuss whether this might run into unexpected downsides.

Labels: extinct tree (Issues involving extinct species)