crossref problems #99

@robjhyndman

From Kurt:

Some R Journal publications have their crossref metadata not quite
right.  The one I've recently run into is
<https://journal.r-project.org/articles/RJ-2022-020>, for which

R> x <- rcrossref::cr_cn("10.32614/RJ-2022-020", "rdf-xml")
R> vapply(xml2::xml_find_all(x, "//j.0:creator//j.3:name"), as.character, "")
[1] "<j.3:name>Allison M. Horst</j.3:name>"     
[2] "<j.3:name>Kristen B. Gorman</j.3:name>"   
[3] "<j.3:name>Alison Presmanes Hill</j.3:name>"

looks ok, but

R> vapply(xml2::xml_find_all(x, "//j.0:creator//j.3:familyName"), as.character, "")
[1] "<j.3:familyName>M. Horst</j.3:familyName>"     
[2] "<j.3:familyName>B. Gorman</j.3:familyName>"     
[3] "<j.3:familyName>Presmanes Hill</j.3:familyName>"

R> vapply(xml2::xml_find_all(x, "//j.0:creator//j.3:givenName"), as.character, "")
[1] "<j.3:givenName>Allison</j.3:givenName>"
[2] "<j.3:givenName>Kristen</j.3:givenName>"
[3] "<j.3:givenName>Alison</j.3:givenName>"

shows that for the first two authors the second given name (initial)
ends up in the family name.  Interestingly, the bibtex shown on
<https://journal.r-project.org/articles/RJ-2022-020/#citation> is ok (in
the sense that bibtex::read.bib gets it right), so something must be
wrong with the code that generates the xml for the crossref registration.
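
As a minimal sketch of one way such output could arise (this is only a guess, not the actual registration code), splitting each full name at the first space reproduces exactly the givenName/familyName pairs shown above. The helper name `split_first_space` is hypothetical.

```r
# Hypothetical helper, NOT the journal's code: split a full name at the
# FIRST space, so everything after the first word becomes the family name.
split_first_space <- function(full) {
    given  <- sub(" .*$", "", full)      # keep text before the first space
    family <- sub("^[^ ]+ ", "", full)   # drop the first word and its space
    data.frame(given = given, family = family, stringsAsFactors = FALSE)
}

# The three names returned by the j.3:name query above.
authors <- c("Allison M. Horst", "Kristen B. Gorman", "Alison Presmanes Hill")
res <- split_first_space(authors)
print(res)
```

If the real code does something like this, taking the names from the already structured bibtex author field rather than re-splitting a display name would presumably avoid the problem.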

I cannot say how often this occurs.  However, for example at

  <https://journal.r-project.org/articles/RJ-2022-005/#citation>

even the bibtex is clearly wrong.

I tried fetching the crossref metadata for all RJ-2022 papers,
using

**********************************************************************
## List all articles and keep the RJ-2022 identifiers.
x <- jsonlite::fromJSON("https://journal.r-project.org/articles/articles.json")
p <- basename(x$path)
p <- p[startsWith(p, "RJ-2022")]
## Fetch the crossref RDF/XML record for each DOI.
x <- rcrossref::cr_cn(paste0("10.32614/", p), "rdf-xml")
names(x) <- p
x <- x[order(names(x))]
## Format one Person node as "givenName @@@ familyName".
one <- function(e) {
    paste(xml2::xml_text(xml2::xml_find_first(e, "./j.3:givenName")),
          xml2::xml_text(xml2::xml_find_first(e, "./j.3:familyName")),
          sep = " @@@ ")
}
## Collect all Person nodes of each record.
fun <- function(e) lapply(xml2::xml_find_all(e, "//j.3:Person"), one)
aut <- lapply(x, fun)
lapply(aut, unlist, use.names = FALSE)
**********************************************************************

which gives the attached output, suggesting that at least for 2022 we
always have the problem with the second given name initial.
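
To scan such output without reading it by eye, a rough sketch like the following could flag entries whose family name starts with a single-letter initial. The `entries` vector here is only a stand-in built from the three RJ-2022-020 authors shown earlier, in the "givenName @@@ familyName" format produced by the script; the pattern is a heuristic and will miss cases without a trailing period.

```r
# Stand-in for unlist(aut, use.names = FALSE); values taken from the
# RJ-2022-020 output shown earlier in this report.
entries <- c("Allison @@@ M. Horst",
             "Kristen @@@ B. Gorman",
             "Alison @@@ Presmanes Hill")

# Extract the family-name part after the " @@@ " separator.
family <- sub("^.* @@@ ", "", entries)

# Heuristic: a family name that begins with "X. " most likely contains
# a middle initial that belongs to the given name.
suspect <- grepl("^[[:upper:]]\\. ", family)
print(entries[suspect])
```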
