This project performs item recommendation in an e-commerce system. It was developed mainly for the "Gestione Avanzata dell'Informazione 2" exam ("Advanced Information Management 2", a course about data analytics, big data, NoSQL models, text analytics and graph analytics).
Start by having a look at the presentation.
The project is composed of several scripts to:
- inspect the available data in a MongoDB
- extract the needed data from a MongoDB
- apply a collaborative-filtering / collaborative-ranking approach to the data
- apply a content-based approach to the data
- merge the recommendations of the two approaches into a single list of items to recommend
- plot a performance comparison between the graph-based and the content-based approach
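As a rough illustration of the merge step listed above, here is a minimal Python sketch that fuses two ranked lists with a weighted, position-based score. It is not taken from the project's scripts: the function name `merge_recommendations`, the `weight` and `top_n` parameters, the scoring rule, the sample item IDs and the choice of Python are all assumptions made for the example.

```python
from collections import defaultdict


def merge_recommendations(collaborative, content_based, weight=0.5, top_n=10):
    """Merge two best-first lists of item IDs into a single ranked list.

    Hypothetical scoring rule: an item at 0-based position p in a list of
    length n gets the score (n - p) / n from that list. The two scores are
    combined as a weighted sum (weight for the collaborative list,
    1 - weight for the content-based list).
    """
    scores = defaultdict(float)
    for ranked, w in ((collaborative, weight), (content_based, 1.0 - weight)):
        n = len(ranked)
        for position, item in enumerate(ranked):
            scores[item] += w * (n - position) / n
    # Highest score first; ties are broken by item ID so the result is deterministic.
    ordered = sorted(scores, key=lambda item: (-scores[item], item))
    return ordered[:top_n]


if __name__ == "__main__":
    # Made-up item IDs, only to show the call.
    cf = ["B001", "B002", "B003"]
    cb = ["B003", "B004", "B001"]
    print(merge_recommendations(cf, cb, top_n=3))
```

Items suggested by both approaches accumulate score from both lists, so they tend to rise to the top; `weight` shifts the balance toward one approach or the other.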
Please take a look here if you're interested in how the huge "Amazon reviews" dataset from SNAP was reduced in size.
First, start a clean MongoDB instance. Then, to import the compressed test db dump into it, run:
$ mongorestore --archive=./shrinked_test_db.mongodump.gz \
    --gzip \
    --objcheck \
    --verbose \
    -j 8
You can find a table showing the number of items in each category in the db before and after the reduction of the original db.
Licensed under GPLv3+. Full text available here.