Get more reliable ways to detect DOI or other precise information about a paper without using full-paper GPT request #91

@markwhiting

To help us deduplicate papers and do various other things, it would be great to establish a paper's formal identifiers with high certainty and as cheaply as possible. For example, can we get a paper's DOI with high reliability, even when the paper shares its title, filename, authors, or other properties with other papers in our corpus?
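One cheap first pass, sketched here as an assumption rather than anything the project already does, is to scan already-extracted text (for example, the first page) for DOI-shaped strings before calling any model or external service. The pattern below is a commonly used heuristic for modern DOIs, and the names `DOI_PATTERN`, `normalize_doi`, and `find_doi_candidates` are hypothetical. A DOI found this way may belong to a cited work rather than to the paper itself, so the results are only candidates that still need checking.

```python
import re
from collections import Counter

# Heuristic pattern for modern DOIs: "10.", a 4-9 digit registrant code,
# a slash, then a suffix. This is an assumption, not a full DOI grammar.
DOI_PATTERN = re.compile(r"\b10\.\d{4,9}/[-._;()/:A-Z0-9]+", re.IGNORECASE)


def normalize_doi(raw):
    """Lowercase a matched DOI and drop trailing sentence punctuation."""
    return raw.strip().rstrip(".,;:").lower()


def find_doi_candidates(text):
    """Return (doi, count) pairs found in text, most frequent first."""
    counts = Counter(
        normalize_doi(match.group(0)) for match in DOI_PATTERN.finditer(text)
    )
    return counts.most_common()


if __name__ == "__main__":
    sample = (
        "Published at https://doi.org/10.1234/ABC.def-5. "
        "Cite as doi:10.1234/abc.def-5, and see 10.5555/other."
    )
    for doi, count in find_doi_candidates(sample):
        print(doi, count)
```

A candidate that appears several times, or only on the first page, is more likely to be the paper's own DOI, but that is a heuristic to validate against the corpus, not a guarantee.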

We currently make requests to services like Crossref, Altmetric, or OpenAlex for this, but even those require some estimation. So we might want a general-purpose feature that aims to find and check some basic bibliometrics more robustly.

I suspect this will need some iteration internally. For example, if the title matches, does the abstract, the DOI, or other metadata match too? How far do we go, and how many disagreements does it take to convince us that two records are different papers?
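The field-by-field check described above could be sketched roughly as follows. Everything here is an assumption for illustration: the field names, the 0.9 similarity threshold, the rule that a DOI match or mismatch is decisive, and the default of tolerating zero disagreements are placeholders to tune, and `compare_records` and `same_paper` are hypothetical helpers, not existing project functions.

```python
import re
from difflib import SequenceMatcher

# Fields compared exactly (after trimming and lowercasing); others are fuzzy.
EXACT_FIELDS = {"doi", "year"}
DEFAULT_FIELDS = ("doi", "title", "authors", "abstract", "year")


def _norm(value):
    """Lowercase and collapse punctuation and whitespace to single spaces."""
    return re.sub(r"[^a-z0-9]+", " ", str(value).lower()).strip()


def fields_agree(field, a, b, threshold=0.9):
    """Exact match for identifier-like fields, fuzzy match for free text."""
    if field in EXACT_FIELDS:
        return str(a).strip().lower() == str(b).strip().lower()
    return SequenceMatcher(None, _norm(a), _norm(b)).ratio() >= threshold


def compare_records(rec_a, rec_b, fields=DEFAULT_FIELDS, threshold=0.9):
    """Sort each field into agree, disagree, or missing (absent on a side)."""
    result = {"agree": [], "disagree": [], "missing": []}
    for field in fields:
        a, b = rec_a.get(field), rec_b.get(field)
        if a in (None, "") or b in (None, ""):
            result["missing"].append(field)
        elif fields_agree(field, a, b, threshold):
            result["agree"].append(field)
        else:
            result["disagree"].append(field)
    return result


def same_paper(rec_a, rec_b, max_disagreements=0, min_agreements=2):
    """Treat the DOI as decisive when both sides have one; otherwise count."""
    result = compare_records(rec_a, rec_b)
    if "doi" in result["agree"]:
        return True
    if "doi" in result["disagree"]:
        return False
    return (
        len(result["disagree"]) <= max_disagreements
        and len(result["agree"]) >= min_agreements
    )
```

Keeping the `missing` bucket separate from `disagree` is deliberate in this sketch: a field that one source does not report is not evidence that the papers differ, which matters when merging records from services with uneven coverage.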
