Hello,
I have some questions about the SCIDOCS subset of the scirepeval benchmark.
First, the scidocs_view and scidocs_read splits contain only document IDs and no paper details, so I look up the title and abstract of each query and candidate by matching its document ID against the Hugging Face scidocs_view_cite_read data, which contains each document's ID, title, and abstract. For several queries and candidates from scidocs_view and scidocs_read, no matching entry seems to exist in scidocs_view_cite_read_data. Am I using the dataset the right way, or was it originally constructed like this? scidocs_cite and scidocs_cocite match perfectly; only scidocs_view and scidocs_read seem to be problematic.
Second, several candidates have no abstract, only a title. I would like to confirm that this is expected.
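For reference, the check I am describing can be sketched roughly as follows. This is a minimal illustration on toy data, not the exact script: the helper name `check_coverage` and the field names `title` and `abstract` are assumptions, and the real column names in the dataset may differ. It takes the document IDs referenced by a split and a lookup table built from the metadata, then reports which IDs have no metadata entry and which have a title but no abstract.

```python
from typing import Dict, Iterable, List, Mapping, Optional


def check_coverage(
    referenced_ids: Iterable[str],
    metadata: Mapping[str, Mapping[str, Optional[str]]],
) -> Dict[str, List[str]]:
    """Report IDs with no metadata entry and IDs whose abstract is empty.

    `metadata` maps a document ID to a record with (assumed) keys
    "title" and "abstract".
    """
    missing: List[str] = []
    title_only: List[str] = []
    # dict.fromkeys de-duplicates the IDs while keeping first-seen order.
    for doc_id in dict.fromkeys(referenced_ids):
        record = metadata.get(doc_id)
        if record is None:
            missing.append(doc_id)
        elif not (record.get("abstract") or "").strip():
            # Treat None, "", and whitespace-only abstracts as absent.
            title_only.append(doc_id)
    return {"missing": missing, "title_only": title_only}


# Toy stand-in for the metadata table (hypothetical IDs and values).
metadata = {
    "p1": {"title": "Paper one", "abstract": "Some abstract."},
    "p2": {"title": "Paper two", "abstract": None},
}

# Toy stand-in for the query and candidate IDs collected from one split.
report = check_coverage(["p1", "p2", "p3", "p1"], metadata)
print(report)
```

With the real data, `referenced_ids` would be every query and candidate ID from scidocs_view or scidocs_read, and `metadata` would be built from scidocs_view_cite_read_data.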
Thank you.