iareARiES/Restaurant-Review-analysis-NLP

🍽️ Restaurant Review Sentiment Analysis 📝

📖 Important Note

For a fuller walkthrough of the project, see the Google Colab notebook 📄 included in this repository. It contains detailed explanations and execution steps for the whole workflow.

This repository contains a sentiment analysis project that uses Natural Language Processing (NLP) and a Naive Bayes classifier to label restaurant reviews as positive 👍 or negative 👎.

🔍 Overview

  • 📂 The dataset consists of restaurant reviews stored in a TSV file.
  • 🧹 Text preprocessing is performed to clean and prepare the data.
  • 📊 A Bag of Words (BoW) model is used to convert text data into numerical format.
  • 🤖 A Naive Bayes classifier is trained on the dataset to perform sentiment classification.
  • 📈 Model evaluation is done using a confusion matrix and accuracy score.
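To make the Bag of Words step concrete, here is a minimal sketch using scikit-learn's CountVectorizer. The two short reviews are made up for illustration and are assumed to be already cleaned; they are not rows from the dataset.

```python
from sklearn.feature_extraction.text import CountVectorizer

# Two made-up, already-cleaned reviews (illustrative only).
toy_corpus = ["food good", "food not good"]

vectorizer = CountVectorizer()
bow = vectorizer.fit_transform(toy_corpus).toarray()

# Vocabulary terms ordered by their column index in the matrix.
vocab = sorted(vectorizer.vocabulary_, key=vectorizer.vocabulary_.get)

print(vocab)  # ['food', 'good', 'not']
print(bow)    # [[1 1 0]
              #  [1 1 1]]
```

Each row is one review and each column counts how often one vocabulary word appears in it. Word order is discarded, which is why keeping "not" as its own token matters for sentiment.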

🛠️ Technologies Used

  • 🐍 Python
  • 🗂️ Pandas
  • 🔢 NumPy
  • 📉 Matplotlib
  • 📝 NLTK (Natural Language Toolkit)
  • 🤖 Scikit-learn

⚙️ Installation

Ensure you have Python installed and set up a virtual environment (optional but recommended).

  1. 🚀 Clone this repository:
    git clone https://github.com/iareARiES/Restaurant-Review-analysis-NLP.git
    cd Restaurant-Review-analysis-NLP
  2. 📦 Install dependencies:
    pip install -r requirements.txt
  3. 📥 Download the NLTK stopwords corpus (run these lines in a Python session):
    import nltk
    nltk.download('stopwords')

▢️ Usage

Run the script to preprocess the dataset, train the Naive Bayes model, and evaluate performance:

python sentiment_analysis.py

📊 Dataset

The dataset used is Restaurant_Reviews.tsv, which contains:

  • 🗣️ A column Review with customer reviews.
  • ✅ A column Liked (1 for positive, 0 for negative sentiment).
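As a rough sketch of how a file in this layout could be read with pandas, the example below parses a tiny in-memory sample instead of the real file. The sample rows are made up, and the `quoting=csv.QUOTE_NONE` setting is an assumption, not something this README specifies.

```python
import csv
import io

import pandas as pd

# Made-up sample in the same two-column layout (illustrative only).
sample_tsv = (
    "Review\tLiked\n"
    "Loved the pasta.\t1\n"
    "The food was not good.\t0\n"
)

# For the real file, pass the path "Restaurant_Reviews.tsv" instead of the buffer.
# quoting=csv.QUOTE_NONE is an assumption: it stops quote characters inside
# a review from being interpreted as field delimiters.
dataset = pd.read_csv(io.StringIO(sample_tsv), sep="\t", quoting=csv.QUOTE_NONE)

print(dataset.shape)              # (2, 2)
print(dataset["Liked"].tolist())  # [1, 0]
```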

🏗️ Steps in the Code

  1. 📥 Load Dataset: Read the Restaurant_Reviews.tsv file.
  2. 🧼 Text Cleaning & Preprocessing:
    • Remove special characters, convert text to lowercase.
    • Remove stopwords (except negations like "not").
    • Apply stemming using PorterStemmer.
  3. 📊 Feature Extraction:
    • Use CountVectorizer to create a Bag of Words model.
    • Convert text into a numerical matrix representation.
  4. 📑 Train-Test Split:
    • 80% training, 20% testing.
  5. 🤖 Train Model:
    • Train a Multinomial Naive Bayes classifier.
  6. 📈 Evaluate Model:
    • Predict test data.
    • Compute accuracy score and confusion matrix.
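The steps above can be sketched end to end as follows. This is a minimal illustration under stated assumptions, not the repository's actual script: the reviews and labels are made up, `clean_review` is a hypothetical helper name, and a small hard-coded stopword set stands in for the NLTK stopword corpus so the sketch runs without any download.

```python
import re

from nltk.stem.porter import PorterStemmer
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB

# Stand-in stopword set (assumption). With the NLTK corpus installed you could
# start from stopwords.words("english") and remove "not" from that list.
STOPWORDS = {"the", "was", "is", "a", "and", "this", "it", "i", "very", "to"}

stemmer = PorterStemmer()


def clean_review(text):
    """Keep letters only, lowercase, drop stopwords, stem what remains."""
    words = re.sub("[^a-zA-Z]", " ", text).lower().split()
    return " ".join(stemmer.stem(w) for w in words if w not in STOPWORDS)


# Made-up reviews and labels (1 = liked, 0 = not liked), illustrative only.
reviews = [
    "Loved the pasta.",
    "The food was not good.",
    "Great service and tasty food.",
    "The soup was cold and bland.",
    "Amazing dessert, I loved it!",
    "Terrible service, not coming back.",
    "The staff was friendly.",
    "The pizza was awful.",
    "Delicious burgers and great fries.",
    "Not tasty, very disappointing.",
]
labels = [1, 0, 1, 0, 1, 0, 1, 0, 1, 0]

# Steps 2 and 3: clean each review, then build the Bag of Words matrix.
corpus = [clean_review(r) for r in reviews]
X = CountVectorizer().fit_transform(corpus).toarray()

# Step 4: 80% training, 20% testing.
X_train, X_test, y_train, y_test = train_test_split(
    X, labels, test_size=0.20, random_state=0
)

# Step 5: train a Multinomial Naive Bayes classifier.
classifier = MultinomialNB()
classifier.fit(X_train, y_train)

# Step 6: predict on the held-out reviews and evaluate.
y_pred = classifier.predict(X_test)
cm = confusion_matrix(y_test, y_pred, labels=[0, 1])
acc = accuracy_score(y_test, y_pred)

print(corpus[:2])  # ['love pasta', 'food not good']
print(cm)
print(acc)
```

With only ten toy reviews the scores mean nothing; the point is the shape of the workflow. Passing `labels=[0, 1]` keeps the confusion matrix 2x2 even when the small test split happens to contain a single class.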

📊 Model Performance

The script prints:

  • 🟩 Confusion matrix for training and test datasets.
  • 🎯 Accuracy score of the classifier.
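As a small illustration of how the two printed metrics relate, the sketch below builds a confusion matrix from made-up labels and derives accuracy from it. The label values are assumptions chosen for the example, not results from this project.

```python
from sklearn.metrics import accuracy_score, confusion_matrix

# Made-up labels (1 = liked, 0 = not liked), illustrative only.
y_true = [1, 0, 1, 1, 0, 0]
y_hat = [1, 0, 0, 1, 0, 1]

# Rows are actual classes, columns are predicted classes:
# [[TN, FP],
#  [FN, TP]]
matrix = confusion_matrix(y_true, y_hat)
tn, fp, fn, tp = matrix.ravel()

# Accuracy is the share of predictions that land on the diagonal.
accuracy = (tn + tp) / matrix.sum()

print(matrix)              # [[2 1]
                           #  [1 2]]
print(round(accuracy, 3))  # 0.667
```

The value computed by hand matches what `accuracy_score(y_true, y_hat)` returns for the same labels.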

🤝 Contribution

Feel free to fork this repository, submit issues, and contribute with improvements! 🚀

📜 License

This project is open-source and available under the MIT License.
