Wikilist-Extraction/wikipedia-list-extraction

wikipedia-list-extraction

Extract lists and tables from Wikipedia and add their information to DBpedia.

Installation

Make sure you have Java 1.8, Scala, and sbt installed.

  1. Clone the repo.
  2. Install the Jena CLI:
  • on OS X, you can run brew install jena
  • on other platforms, you need to install it as described here
  3. Run scripts/loadDumps.sh. Optionally, you can update the preloaded typeCounts with scripts/typeCount.sh.
  4. Download or create a wiki-markup XML dump. Downloads from Special:Export work just fine.
  5. Convert it to a JSON dump with scripts/convert.sh.
  6. Copy src/main/resources/application.conf-default to src/main/resources/application.conf. In the copy, set the input filename to your generated dump file; you can also change the parameters of the algorithm there.
  7. To start the application, run sbt run and choose GenerateTypes as the main class.
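
The dump download and config-copy steps above can be sketched as two small shell helpers. This is only an illustration, not part of the repository: the function names export_url and init_config are hypothetical, the English Wikipedia host is an assumption, and the arguments expected by the scripts in scripts/ are not covered here.

```shell
#!/bin/sh
# Hypothetical helpers; neither function ships with this repository.

# Print the Special:Export URL for a single page title.
# Assumes English Wikipedia; spaces in the title become underscores.
# Titles with other special characters would still need URL encoding.
export_url() {
  title=$(printf '%s' "$1" | tr ' ' '_')
  printf 'https://en.wikipedia.org/wiki/Special:Export/%s\n' "$title"
}

# Copy application.conf-default to application.conf unless a config
# already exists, so local edits are never overwritten.
# The config directory defaults to src/main/resources.
init_config() {
  conf_dir=${1:-src/main/resources}
  if [ ! -f "$conf_dir/application.conf" ]; then
    cp "$conf_dir/application.conf-default" "$conf_dir/application.conf"
  fi
}
```

For example, after sourcing the file, curl -L -o dump.xml "$(export_url 'List of lakes')" would fetch one page as wiki-markup XML (dump.xml is an arbitrary filename), and init_config run from the repository root would create the config file to edit.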
