Open
Description
Hi,
I noticed a small discrepancy between epmc_hits() and epmc_search(): for an identical query, they report different numbers of records found. Do these functions handle the query arguments differently? epmc_hits() consistently reports fewer records than epmc_search(), although its count matches the one shown on the Europe PMC website.
In my script, I use the count from epmc_hits() as the limit for epmc_search(), so that all records are downloaded. Because of this difference between the two functions, I systematically lose some records.
Here is a reprex:
single <- "crispr"
europepmc::epmc_hits(query = single)
#> [1] 74467
europepmc::epmc_search(query = single)
#> 75450 records found, returning 100
#> # A tibble: 100 x 28
#> id source pmid pmcid doi title authorString journalTitle journalVolume
#> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
#> 1 3413~ MED 3413~ PMC8~ 10.1~ A si~ Chen J, Sch~ MicroPubl B~ 2021
one_letter <- "calyculin a" # could also be "mitomycin c"
europepmc::epmc_hits(query = one_letter)
#> [1] 3951
europepmc::epmc_search(query = one_letter)
#> 3971 records found, returning 100
#> # A tibble: 100 x 28
#> id source pmid pmcid doi title authorString journalTitle issue
#> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
#> 1 3406~ MED 3406~ PMC8~ 10.3~ Eval~ Zastko L, R~ Int J Mol S~ 11
two_words <- "physical activity" # could also be "cancer cells"
europepmc::epmc_hits(query = two_words)
#> [1] 1075740
europepmc::epmc_search(query = two_words)
#> 3407174 records found, returning 100
#> # A tibble: 100 x 29
#> id source pmid doi title authorString journalTitle issue journalVolume
#> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
#> 1 3338~ MED 3338~ 10.1~ Effe~ Willinger N~ J Phys Act ~ 1 18