Setting up Extraction libraries
Nithin Krishna edited this page Aug 10, 2016 · 1 revision
#Apache Tika
- Download and build Tika locally.
- Run Tika's HTTP server on port 9998 (the server's default port):
java -jar tika-server/target/tika-server-1.13-SNAPSHOT.jar --port 9998
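Once the server is running, a quick check from Python can confirm that it answers on the expected port. The following is a minimal sketch, assuming the server exposes Tika's `/version` endpoint at `http://localhost:9998`; the function name `tika_server_version` is hypothetical and not part of this project.

```python
import urllib.error
import urllib.request


def tika_server_version(base_url="http://localhost:9998", timeout=5.0):
    """Return the version text reported by a Tika server, or None if unreachable.

    The helper name and defaults are illustrative assumptions.
    """
    # The server is local, so skip any proxy configured in the environment.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    url = base_url.rstrip("/") + "/version"
    try:
        with opener.open(url, timeout=timeout) as response:
            return response.read().decode("utf-8", errors="replace").strip()
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or an HTTP error status.
        return None
```

Calling `tika_server_version()` returns `None` when nothing is listening, which usually means the `java -jar` command above has not been started or is bound to a different port.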
#Stanford Core NLP NER
- Download the library from here.
- Set the STANFORD_MODELS environment variable so that it points to the downloaded model files.
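A misconfigured `STANFORD_MODELS` variable tends to surface only as a late runtime error, so it can help to validate it up front. The sketch below is an illustration, not part of this project: the helper name `stanford_models_problems` is hypothetical, and it assumes the variable may hold several paths joined by the platform's path separator.

```python
import os


def stanford_models_problems(environ=None):
    """Return a list of human-readable problems with STANFORD_MODELS.

    An empty list means the variable is set and every path in it exists.
    """
    environ = os.environ if environ is None else environ
    value = environ.get("STANFORD_MODELS", "")
    if not value:
        return ["STANFORD_MODELS is not set"]

    problems = []
    # Assumption: multiple entries are separated by os.pathsep (":" or ";").
    for path in value.split(os.pathsep):
        if path and not os.path.exists(path):
            problems.append("path does not exist: " + path)
    return problems
```

Running `stanford_models_problems()` before starting the extraction code gives a clear message if the variable is missing or points somewhere that does not exist.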
#Grobid Quantities
- Download, set up, and train it as described here.
- Start the service on port 8080:
mvn -Dmaven.test.skip=true jetty:run-war
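The Jetty-hosted service can take a while to come up after the `mvn` command is launched, so scripts that depend on it may want to wait until the port accepts connections. This is a minimal sketch under that assumption; `wait_for_port` is a hypothetical helper, and it only checks that something is listening on the TCP port, not that the Grobid Quantities service itself is healthy.

```python
import socket
import time


def wait_for_port(host="localhost", port=8080, timeout=60.0, interval=1.0):
    """Poll a TCP port until it accepts a connection or the timeout expires.

    Returns True if a connection succeeded, False otherwise.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A successful connect means a process is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

For example, `wait_for_port(port=8080, timeout=120)` blocks for up to two minutes and returns `False` if the service never starts listening.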
#Python dependencies
- Install the Python dependencies listed in the dependencies.txt file.
- Install the NLTK data.
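To see which dependencies are still missing after installation, the listed packages can be compared against what is installed in the current environment. The sketch below assumes that dependencies.txt uses the usual pip requirements format (one package per line, optional version specifiers and `#` comments), which the page does not confirm; `missing_requirements` is a hypothetical helper and requires Python 3.8 or later.

```python
import re
from importlib import metadata


def missing_requirements(lines):
    """Return the names of listed packages that are not installed.

    Version specifiers are ignored; only the presence of the package is checked.
    """
    missing = []
    for line in lines:
        # Drop comments and surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        # Skip blank lines and pip options such as "-r other.txt".
        if not line or line.startswith("-"):
            continue
        match = re.match(r"[A-Za-z0-9][A-Za-z0-9._-]*", line)
        if not match:
            continue
        name = match.group(0)
        try:
            metadata.version(name)
        except metadata.PackageNotFoundError:
            missing.append(name)
    return missing
```

A typical call would be `missing_requirements(open("dependencies.txt"))`. NLTK data is normally fetched with `nltk.download()`; the page does not say which corpora or models this project needs.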
Information Retrieval and Data Science (IRDS) research group, University of Southern California.