Different number of records returned #39

@Dobrokhotov1989

Hi,
I noticed a discrepancy between epmc_hits() and epmc_search(): given an identical query, they report different numbers of records. Do these functions handle the query arguments differently? epmc_hits() consistently reports fewer records than epmc_search(), although its count matches the number shown on the Europe PMC website.
In my script, I use epmc_hits() to set the limit for epmc_search(), i.e. to download all records. Because the two counts differ, I will systematically miss some records.
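
For context, a minimal sketch of that workflow looks roughly like this. `fetch_all`, `count_fn` and `search_fn` are hypothetical names used only for illustration; the sketch assumes nothing beyond the `query` and `limit` arguments, and the two functions are passed in as parameters so the logic can be checked without calling the API.

```r
# Sketch only: use a hit count as the download limit, then compare the
# number of rows actually returned with that count.
# The defaults are evaluated lazily, so europepmc is only needed when
# the defaults are actually used.
fetch_all <- function(query,
                      count_fn = europepmc::epmc_hits,
                      search_fn = europepmc::epmc_search) {
  n_hits <- count_fn(query = query)                     # count used as the limit
  records <- search_fn(query = query, limit = n_hits)   # download up to n_hits records
  list(n_hits = n_hits, n_returned = nrow(records), records = records)
}
```

If the count from epmc_hits() is lower than the count epmc_search() reports, `n_returned` is capped at `n_hits` and the remaining records are never downloaded.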

Here is a reprex:

single <- "crispr"
europepmc::epmc_hits(query = single)
#> [1] 74467
europepmc::epmc_search(query = single)
#> 75450 records found, returning 100
#> # A tibble: 100 x 28
#>    id    source pmid  pmcid doi   title authorString journalTitle journalVolume
#>    <chr> <chr>  <chr> <chr> <chr> <chr> <chr>        <chr>        <chr>        
#>  1 3413~ MED    3413~ PMC8~ 10.1~ A si~ Chen J, Sch~ MicroPubl B~ 2021         


one_letter <- "calyculin a" # could also be "mitomycin c"
europepmc::epmc_hits(query = one_letter)
#> [1] 3951
europepmc::epmc_search(query = one_letter)
#> 3971 records found, returning 100
#> # A tibble: 100 x 28
#>    id    source pmid  pmcid doi   title authorString journalTitle issue
#>    <chr> <chr>  <chr> <chr> <chr> <chr> <chr>        <chr>        <chr>
#>  1 3406~ MED    3406~ PMC8~ 10.3~ Eval~ Zastko L, R~ Int J Mol S~ 11   


two_words <- "physical activity" # could also be "cancer cells"
europepmc::epmc_hits(query = two_words)
#> [1] 1075740
europepmc::epmc_search(query = two_words)
#> 3407174 records found, returning 100
#> # A tibble: 100 x 29
#>    id    source pmid  doi   title authorString journalTitle issue journalVolume
#>    <chr> <chr>  <chr> <chr> <chr> <chr>        <chr>        <chr> <chr>        
#>  1 3338~ MED    3338~ 10.1~ Effe~ Willinger N~ J Phys Act ~ 1     18           
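
To put numbers on the gap, here is a small sketch that tabulates the counts from the output above. The data frame and its column names (`hits`, `search`, `missing`, `share_missing`) are my own labels for illustration, not anything returned by the package.

```r
# Counts copied from the reprex output above.
counts <- data.frame(
  query  = c("crispr", "calyculin a", "physical activity"),
  hits   = c(74467, 3951, 1075740),   # value returned by epmc_hits()
  search = c(75450, 3971, 3407174)    # "records found" reported by epmc_search()
)

# Records that would be lost when epmc_hits() is used as the limit.
counts$missing <- counts$search - counts$hits

# The same gap as a fraction of the count reported by epmc_search().
counts$share_missing <- counts$missing / counts$search

print(counts)
```

The gap is 983 and 20 records for the first two queries but 2331434 for "physical activity", which is more than two thirds of the count that epmc_search() reports for that query.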
