Findmyreviewers

Findmyreviewers (FMR for short) is an open-source project that tries to find the best-matching scholars for a piece of text.

Under the hood, it extracts topics from the text with trained LDA models and then searches a pool of scholars for the best matches.
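
To make the matching idea concrete, here is a minimal sketch that ranks scholars by cosine similarity between topic distributions. It is not FMR's actual matching algorithm (see the tutorial notebooks for that); the scholar names, topic weights, and the functions `cosine` and `rank_scholars` are all hypothetical.

```python
import math

# Hypothetical data: each scholar is described by a topic distribution
# (topic id -> weight). In FMR these would be derived from a trained LDA
# model; the names and numbers below are made up for illustration.
SCHOLARS = {
    "scholar_a": {0: 0.7, 1: 0.2, 2: 0.1},
    "scholar_b": {0: 0.1, 1: 0.8, 2: 0.1},
    "scholar_c": {0: 0.3, 1: 0.3, 2: 0.4},
}


def cosine(p, q):
    """Cosine similarity between two sparse topic distributions."""
    dot = sum(weight * q.get(topic, 0.0) for topic, weight in p.items())
    norm_p = math.sqrt(sum(w * w for w in p.values()))
    norm_q = math.sqrt(sum(w * w for w in q.values()))
    if norm_p == 0.0 or norm_q == 0.0:
        return 0.0  # an empty distribution matches nothing
    return dot / (norm_p * norm_q)


def rank_scholars(doc_topics, scholars, top_n=2):
    """Return the top_n (name, score) pairs, best match first."""
    scored = [(name, cosine(doc_topics, topics))
              for name, topics in scholars.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_n]


# Topic distribution of a submitted text (hypothetical values).
paper = {0: 0.6, 1: 0.3, 2: 0.1}
for name, score in rank_scholars(paper, SCHOLARS):
    print(name, round(score, 3))
```

With these made-up numbers, scholar_a ranks first and scholar_c second, because their topic weights are closest to those of the submitted text.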

The web app is built on top of Flask, and the LDA model is trained with gensim. With slight modifications, you can replace gensim with another library and load your own trained LDA model.
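
One way to think about swapping the topic-model library is to hide it behind a small interface. The sketch below is an assumption about how this could be structured, not a description of FMR's code: `TopicBackend` and `KeywordBackend` are hypothetical names, and the toy backend only counts keywords. The return shape, a list of (topic_id, probability) pairs, mirrors what gensim's LDA model returns for a document, so a gensim-backed subclass could wrap the trained model in the same way.

```python
from abc import ABC, abstractmethod
from typing import Dict, List, Tuple


class TopicBackend(ABC):
    """Hypothetical interface; not part of FMR's actual code."""

    @abstractmethod
    def document_topics(self, tokens: List[str]) -> List[Tuple[int, float]]:
        """Return (topic_id, probability) pairs for a tokenized document."""


class KeywordBackend(TopicBackend):
    """Toy stand-in for a real model: scores topics by keyword counts."""

    def __init__(self, topic_keywords: Dict[int, List[str]]) -> None:
        self.topic_keywords = topic_keywords

    def document_topics(self, tokens: List[str]) -> List[Tuple[int, float]]:
        # Count how often each topic's keywords occur in the document.
        counts = {
            topic: sum(tokens.count(word) for word in words)
            for topic, words in self.topic_keywords.items()
        }
        total = sum(counts.values())
        if total == 0:
            return []  # no keyword matched, so no topic can be assigned
        # Normalize the counts so the returned weights sum to 1.
        return [(t, c / total) for t, c in sorted(counts.items()) if c > 0]


backend = KeywordBackend({0: ["neural", "network"], 1: ["protein", "gene"]})
print(backend.document_topics(["neural", "network", "gene", "neural"]))
```

The rest of the application would then depend only on `document_topics`, so replacing the library means writing one new subclass.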

Installation

Make sure your Python version is 3.6.x.
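
If you are unsure which interpreter is active, a quick check like the following can help. It is only a convenience sketch; `is_supported` is a hypothetical helper and is not shipped with the project.

```python
import sys


def is_supported(version_info=sys.version_info):
    """Return True when the interpreter is a 3.6.x release."""
    return tuple(version_info[:2]) == (3, 6)


if not is_supported():
    # Warn instead of exiting so the check can be used interactively.
    found = ".".join(str(part) for part in sys.version_info[:3])
    print("Warning: this project targets Python 3.6.x, found", found)
```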

Environment

Using virtualenv is highly recommended.

If you do not yet have a virtual environment in the project folder, set one up with:

$ virtualenv venv

Then activate the virtual environment:

$ source venv/bin/activate

Install packages:

$ pip install -r requirements.txt

Download demo models:

$ cd trained
$ python download.py
$ cd ..

Install NLTK data:

$ python -m nltk.downloader brown
$ python -m nltk.downloader punkt

Running the server

Initialize web app database:

$ python manage.py create_table

Run the web app server:

$ python manage.py runserver

Then navigate to the following address:

127.0.0.1:5000

To access the dashboard, please visit:

127.0.0.1:5000/dashboard

Customization and Development

Rough documentation is available in the /docs folder.

You can also check out an online version at http://findmyreviewers.readthedocs.io.

There are also some Jupyter notebooks in the /tutorials folder. They cover:

How we preprocess the data
How we trained the model
How the matching algorithm is developed

Plan

We will keep refining the project as well as the documentation.

Currently we are looking at:

Refining the preprocessing procedures
Refining LDA model training
Implementing an author-topic model

Demo Model and Databases

A trained demo LDA model and a demo database are shipped with this repository.

The LDA model is trained on our complete full-text corpus (a large collection of PDFs). It retains all the state and data you need to train it further with new documents.

The demo database is only a portion of our complete database, because the data sources do not allow us to publish the full data.

Therefore, the matching results from the demo database may seem sub-optimal because of the missing data.

Acknowledgements

To focus on the core of the project, we make use of several open-source libraries and projects. We sincerely appreciate their work.

Python Libraries

gensim
nltk
TextBlob
flask (and several extensions)

etc.

Web

Frontend template: https://freehtml5.co/elate-free-html5-bootstrap-template/

Dashboard template: https://github.com/puikinsh/gentelella
