Inverted-Index-And-full-text-search-Implementation-with-Pyspark-and-mysql

This project implements full-text search using an inverted index stored in a MySQL database. It is implemented with PySpark and was tested in Google Colab.

The entire project is divided into 6 steps:
- Step 0 - Set up the environment and import packages
- Step 1 - Create the inverted index and document magnitudes, and store them in a file
- Step 2 - Store them in MySQL
- Step 3 - Look up the inverted index and get metrics
- Step 4 - Calculate cosine similarity
- Step 5 - Document ranking
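
The core of Steps 1, 3, 4 and 5 can be sketched in a few lines of plain Python. This is only an illustration, not the notebook's code: it runs in memory instead of using PySpark and MySQL, it assumes raw term frequency as the term weight (the weighting used in the project is not stated here), and the names `tokenize`, `build_index` and `rank` are hypothetical.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    # Assumption: lowercase alphanumeric tokens; the notebook may tokenize differently.
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs):
    """Step 1 (sketch): build term -> {doc_id: tf} and per-document magnitudes."""
    index = defaultdict(dict)
    magnitude = {}
    for doc_id, text in docs.items():
        counts = Counter(tokenize(text))
        for term, tf in counts.items():
            index[term][doc_id] = tf
        # Euclidean length of the document's term-frequency vector.
        magnitude[doc_id] = math.sqrt(sum(tf * tf for tf in counts.values()))
    return dict(index), magnitude


def rank(query, index, magnitude):
    """Steps 3-5 (sketch): look up postings, score by cosine similarity, sort."""
    q_counts = Counter(tokenize(query))
    q_mag = math.sqrt(sum(c * c for c in q_counts.values()))
    dots = defaultdict(float)
    for term, q_tf in q_counts.items():
        # Only documents that contain a query term are ever touched.
        for doc_id, tf in index.get(term, {}).items():
            dots[doc_id] += q_tf * tf
    scores = {d: dot / (q_mag * magnitude[d]) for d, dot in dots.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical sample documents, not the contents of input_docs.zip.
    sample = {
        "d1": "spark builds an inverted index",
        "d2": "mysql stores the index",
        "d3": "cosine similarity ranks documents",
    }
    idx, mag = build_index(sample)
    for doc_id, score in rank("inverted index", idx, mag):
        print(doc_id, round(score, 3))
```

Storing the magnitudes alongside the index means a query only needs the postings of its own terms plus one magnitude per candidate document, which is presumably why Step 1 computes both up front.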

# Steps to run the code

Unzip "input_docs.zip" and run the .ipynb notebook.