GOLEM-lab/Qidian_Webnovel_DataCollection


Qidian-Webnovel-Corpus

A brief introduction to the Qidian-Webnovel Corpus.

To build the corpus, we manually searched Webnovel.com for translated novels in the NOVEL category, targeting only completed works.

Each identified translated novel was then matched with its original counterpart on Qidian.com.

This yielded 120 novels. For 10 of them, the copyright on Qidian.com had expired, so no Qidian data is available for those 10 novels.

The final corpus consists of 110 novels, together with all reader comments and replies to these novels (timestamp: 01/09/2024).

Comments and replies are categorised as book-level, chapter-level, or paragraph-level, and are stored per novel.

For example:
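One way to picture this per-novel organisation is the Python sketch below. The class and field names (`Comment`, `NovelComments`, `level`, `chapter`, `paragraph`, `replies`) and the sample comment texts are assumptions made purely for illustration; they do not describe the corpus's actual file format or contents.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Optional

# The three comment levels named in the corpus description.
LEVELS = ("book", "chapter", "paragraph")


@dataclass
class Comment:
    """Hypothetical record for one reader comment (field names are assumed)."""
    text: str
    level: str                        # one of LEVELS
    chapter: Optional[int] = None     # set for chapter- and paragraph-level comments
    paragraph: Optional[int] = None   # set for paragraph-level comments only
    replies: list = field(default_factory=list)  # replies attached to this comment


@dataclass
class NovelComments:
    """Hypothetical container grouping one novel's comments by level."""
    novel_id: str
    by_level: dict = field(default_factory=lambda: defaultdict(list))

    def add(self, comment: Comment) -> None:
        # Reject anything that is not book-, chapter-, or paragraph-level.
        if comment.level not in LEVELS:
            raise ValueError(f"unknown comment level: {comment.level!r}")
        self.by_level[comment.level].append(comment)

    def counts(self) -> dict:
        # Number of comments stored at each level for this novel.
        return {level: len(self.by_level[level]) for level in LEVELS}


# Made-up placeholder data, not taken from the corpus.
novel = NovelComments("example-novel")
novel.add(Comment("Great ending!", level="book"))
novel.add(Comment("This chapter was slow.", level="chapter", chapter=12))
novel.add(Comment("Nice line.", level="paragraph", chapter=12, paragraph=3))
print(novel.counts())  # {'book': 1, 'chapter': 1, 'paragraph': 1}
```

Grouping by level inside a per-novel container mirrors the description above: each novel has its own store, and each comment falls into exactly one of the three levels.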

We also collected the user profiles of readers who have left comments or replies on the novels. We only collect personal data that is necessary for the purpose of the scientific research, and we strictly abide by the GDPR.

Due to data regulation and policy, we cannot currently share the comment/reply and user profile data through this GitHub repository. However, we are more than happy to share the data with any researchers who are interested in this topic.

For more information about the dataset, see https://dataverse.nl/dataset.xhtml?persistentId=doi%3A10.34894%2FGQXX3K&version=DRAFT
