Description A Python-based web application for analyzing sentiment in movie reviews using NLP techniques and machine learning models. Built with Flask and libraries like NLTK or TextBlob, it provides a user-friendly interface for sentiment classification.

rohanmistry231/Movie-Sentiment-Analysis-App

🎬 Movie Review Sentiment Analysis

Welcome to Movie Review Sentiment Analysis, a web application that predicts whether a movie review is positive or negative using a deep learning model. Built with Streamlit and TensorFlow, the app uses an LSTM model trained on the IMDB dataset and reports a confidence score with each prediction. It is deployed on Streamlit Cloud and aimed at movie enthusiasts and developers alike! 🎥


🚀 Features

  • Sentiment Prediction: Analyze any movie review and get instant sentiment predictions (Positive/Negative) with confidence scores.
  • Interactive UI: Navigate between Home and Prediction pages with a sleek sidebar menu.
  • Example Reviews: Test the model with pre-loaded example reviews in the sidebar.
  • Analysis History: View and download your analysis history as a Markdown file.
  • Responsive Design: Styled with custom CSS for a modern, professional look.
  • Cloud Deployment: Hosted on Streamlit Cloud for easy access.
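
As a rough illustration of how a single model output can become a label plus a confidence score, here is a minimal sketch. It assumes the model ends in one sigmoid unit whose output is the probability of the positive class; the function name `interpret_score` and the 0.5 threshold are hypothetical and not taken from app.py.

```python
def interpret_score(score, threshold=0.5):
    """Map a sigmoid output in [0, 1] to a (label, confidence) pair."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must be in [0, 1]")
    if score >= threshold:
        # The score itself is the estimated probability of "Positive".
        return "Positive", score
    # For a negative prediction, confidence is the complementary probability.
    return "Negative", 1.0 - score


label, confidence = interpret_score(0.87)
print(f"{label} ({confidence:.0%} confidence)")
```

Reporting `1 - score` for negative predictions keeps the confidence at or above 50% for whichever label is shown.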

🛠️ Setup and Installation

Prerequisites

  • Python 3.8 or higher
  • Git
  • Streamlit Cloud account (for deployment)
  • Jupyter Notebook (to run the training script)

Local Setup

  1. Clone the Repository:

    git clone https://github.com/rohanmistry231/sentiment-analysis-app.git
    cd sentiment-analysis-app
  2. Install Dependencies: Create a virtual environment and install the required packages:

    python -m venv venv
    source venv/bin/activate  # On Windows: venv\Scripts\activate
    pip install -r requirements.txt
  3. Prepare Model Files: The app requires model.h5 and tokenizer.pkl to function. These files are generated by training the model using the provided Jupyter Notebook:

    • Open IMDB_reviews_Sentiment_Analysis_LSTM.ipynb in Jupyter Notebook.
    • Run all cells to train the LSTM model on the IMDB_Dataset.csv dataset.
    • The notebook will save model.h5 (the trained model) and tokenizer.pkl (the tokenizer) in the project root.

    Note: If you already have pre-trained model.h5 and tokenizer.pkl files, you can skip this step.

  4. Run the App Locally:

    streamlit run app.py

    Open your browser and go to http://localhost:8501.


🌐 Deploy on Streamlit Cloud

  1. Push to GitHub:

    • Create a new repository on GitHub.
    • Push your project files (this assumes the new repository is already configured as the origin remote):
      git add .
      git commit -m "Initial commit"
      git push origin main
  2. Set Up Streamlit Cloud:

    • Log in to Streamlit Cloud.
    • Click "New App" and connect your GitHub repository.
    • Select the branch (e.g., main) and specify app.py as the main file.
    • Click "Deploy".
  3. Verify Deployment:

    • Once deployed, Streamlit Cloud will provide a URL (e.g., https://sentiment-analysis-app.streamlit.app).
    • Open the URL to test the app.

Note: If model.h5 is too large for GitHub, host it on Google Drive or AWS S3 and modify app.py to download it during runtime.
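
One possible way to fetch the model at runtime is sketched below. This is not the code in app.py: the helper name `ensure_file` and the URL are placeholders, and the hosting service must expose a direct-download link for this to work.

```python
import os
import urllib.request


def ensure_file(path, url, fetch=urllib.request.urlretrieve):
    """Download `url` to `path` unless the file already exists locally.

    Returns True if a download happened, False if the file was already there.
    """
    if os.path.exists(path):
        return False
    tmp_path = path + ".part"
    fetch(url, tmp_path)        # download under a temporary name first
    os.replace(tmp_path, path)  # then move it into place in one step
    return True


# Hypothetical usage near the top of app.py (MODEL_URL is a placeholder):
# MODEL_URL = "https://example.com/path/to/model.h5"
# ensure_file("model.h5", MODEL_URL)
```

Downloading to a temporary name and renaming afterwards avoids leaving a half-written model.h5 behind if the download is interrupted.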


📖 Usage

  1. Navigate to the Home Page:

    • Learn about the app's features and how to use it.
  2. Go to the Prediction Page:

    • Use the sidebar to select "Prediction".
    • Enter a movie review in the text area.
    • Click "Analyze Sentiment" to get the prediction.
  3. Try Example Reviews:

    • In the sidebar, click on "Example 1", "Example 2", or "Example 3" to test pre-loaded reviews.
  4. View and Download History:

    • Expand the "View Analysis History" section to see past analyses.
    • Click "Download History" to save your analysis history as a Markdown file.

📁 Project Structure

sentiment-analysis-app/
├── app.py                              # Main Streamlit app
├── model.h5                            # Trained LSTM model
├── tokenizer.pkl                       # Tokenizer for text preprocessing
├── IMDB_Dataset.csv                    # IMDB dataset for training
├── IMDB_reviews_Sentiment_Analysis_LSTM.ipynb  # Jupyter Notebook for model training
├── requirements.txt                    # Dependencies
└── README.md                           # Project documentation

📦 Dependencies

Listed in requirements.txt:

  • streamlit==1.39.0
  • tensorflow==2.17.0
  • numpy==1.26.4
  • streamlit-option-menu==0.4.1

Install them with:

pip install -r requirements.txt

Training the model in the Jupyter Notebook may require additional packages, such as:

  • pandas
  • scikit-learn
  • kaggle (for downloading the dataset)

To run the notebook, install them along with Jupyter:

pip install pandas scikit-learn kaggle jupyter

📊 Dataset

The model is trained on IMDB_Dataset.csv, which contains 50,000 IMDB movie reviews. Each review is labeled as either "positive" or "negative", and the dataset is balanced, with 25,000 reviews per class.

  • Source: Originally sourced via the Kaggle API (as shown in the notebook).
  • Columns:
    • review: The text of the movie review.
    • sentiment: The label (positive or negative).
  • Usage: The dataset is used in IMDB_reviews_Sentiment_Analysis_LSTM.ipynb to train the LSTM model.
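
To make the column layout concrete, the sketch below reads review/sentiment rows and converts the labels to 0 and 1 using only the standard library. The notebook may well use pandas instead; `load_reviews` and the two sample rows are made up for illustration.

```python
import csv
import io
from collections import Counter

LABEL_TO_INT = {"negative": 0, "positive": 1}


def load_reviews(csv_file):
    """Read texts and 0/1 labels from a file object with 'review' and 'sentiment' columns."""
    reader = csv.DictReader(csv_file)
    texts, labels = [], []
    for row in reader:
        texts.append(row["review"])
        labels.append(LABEL_TO_INT[row["sentiment"].strip().lower()])
    return texts, labels


# Tiny in-memory stand-in for IMDB_Dataset.csv (made-up rows, not real data).
sample = io.StringIO(
    "review,sentiment\n"
    '"Great film, loved it.",positive\n'
    '"Dull and far too long.",negative\n'
)
texts, labels = load_reviews(sample)
print(Counter(labels))  # class counts, useful for checking balance
```

With the real file, the same function could be called as `load_reviews(open("IMDB_Dataset.csv", newline="", encoding="utf-8"))`, ideally inside a `with` block.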

If you need to re-download the dataset, the notebook includes instructions for using the Kaggle API. You'll need a Kaggle account and API token (kaggle.json).


🧠 Model Details

  • Dataset: IMDB dataset of 50k movie reviews (IMDB_Dataset.csv).
  • Model Architecture: LSTM with an Embedding layer, trained using TensorFlow.
  • Training: 5 epochs, batch size of 64, validation split of 0.2.
  • Preprocessing: Tokenized and padded reviews to a maximum length of 200, using a vocabulary of 5000 words.

The model and tokenizer are saved as model.h5 and tokenizer.pkl, respectively.
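
The preprocessing step can be pictured with the pure-Python sketch below. It is only an illustration of the idea, not the notebook's tokenizer: the word-splitting regex, dropping of unknown words, and padding/truncating at the front of the sequence are assumptions, and only the vocabulary size (5000) and maximum length (200) come from this README.

```python
import re
from collections import Counter

VOCAB_SIZE = 5000  # from the README
MAX_LEN = 200      # from the README


def tokenize(text):
    """Lowercase and split into simple word tokens (assumed tokenization rule)."""
    return re.findall(r"[a-z0-9']+", text.lower())


def build_vocab(texts, vocab_size=VOCAB_SIZE):
    """Map the most frequent words to ids 1..vocab_size-1; id 0 is reserved for padding."""
    counts = Counter(word for text in texts for word in tokenize(text))
    most_common = counts.most_common(vocab_size - 1)
    return {word: idx for idx, (word, _) in enumerate(most_common, start=1)}


def encode(text, vocab, max_len=MAX_LEN):
    """Convert text to a fixed-length list of ids, dropping out-of-vocabulary words."""
    ids = [vocab[word] for word in tokenize(text) if word in vocab]
    ids = ids[-max_len:]                     # keep at most the last max_len tokens
    return [0] * (max_len - len(ids)) + ids  # left-pad with zeros to max_len


vocab = build_vocab(["good good good movie movie bad"], vocab_size=3)
print(encode("bad good movie", vocab, max_len=4))
```

Every encoded review has the same length, which is what lets reviews be batched into a single integer array for the Embedding layer.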


📜 Model Training

The IMDB_reviews_Sentiment_Analysis_LSTM.ipynb notebook contains the complete workflow for training the sentiment analysis model. It includes:

  • Data Collection: Downloads the IMDB dataset using the Kaggle API.
  • Preprocessing: Tokenizes and pads the reviews, converts labels to binary (0 for negative, 1 for positive).
  • Model Building: Defines an LSTM model with an Embedding layer and Dense output.
  • Training: Trains the model for 5 epochs with validation.
  • Evaluation: Evaluates the model on a test set.
  • Saving: Saves the trained model as model.h5 and the tokenizer as tokenizer.pkl.
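
The model-building, training, and saving steps might look roughly like the sketch below. The epochs, batch size, validation split, and vocabulary size come from this README; the embedding dimension, number of LSTM units, optimizer, and loss are assumptions, since the README does not state them, and the function names are hypothetical.

```python
VOCAB_SIZE = 5000        # from the README
EPOCHS = 5               # from the README
BATCH_SIZE = 64          # from the README
VALIDATION_SPLIT = 0.2   # from the README
EMBEDDING_DIM = 128      # assumption: not stated in the README
LSTM_UNITS = 128         # assumption: not stated in the README


def build_model():
    """Embedding -> LSTM -> single sigmoid unit for binary sentiment."""
    # Imported here so the constants above can be inspected without loading TensorFlow.
    import tensorflow as tf

    model = tf.keras.Sequential([
        tf.keras.layers.Embedding(input_dim=VOCAB_SIZE, output_dim=EMBEDDING_DIM),
        tf.keras.layers.LSTM(LSTM_UNITS),
        tf.keras.layers.Dense(1, activation="sigmoid"),
    ])
    # Optimizer and loss are assumed; they are common choices for binary labels.
    model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
    return model


def train_and_save(x_train, y_train, path="model.h5"):
    """Train on padded id sequences (shape: n x 200) with 0/1 labels, then save."""
    model = build_model()
    model.fit(
        x_train,
        y_train,
        epochs=EPOCHS,
        batch_size=BATCH_SIZE,
        validation_split=VALIDATION_SPLIT,
    )
    model.save(path)
    return model
```

The tokenizer would be saved separately (for example with `pickle`) so that the app can apply the same word-to-id mapping at prediction time.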

Running the Notebook

  1. Install Jupyter Notebook and dependencies:
    pip install jupyter pandas scikit-learn kaggle tensorflow
  2. Ensure kaggle.json is set up for downloading the dataset (instructions in the notebook).
  3. Launch Jupyter Notebook:
    jupyter notebook
  4. Open IMDB_reviews_Sentiment_Analysis_LSTM.ipynb and run all cells.

📝 License

This project is licensed under the MIT License. See the LICENSE file for details.


🤝 Contributing

Contributions are welcome! Follow these steps:

  1. Fork the repository.
  2. Create a new branch (git checkout -b feature/your-feature).
  3. Make your changes and commit (git commit -m "Add your feature").
  4. Push to the branch (git push origin feature/your-feature).
  5. Open a Pull Request.

📬 Contact

For questions or feedback, reach out to [email protected] or open an issue on GitHub.


Happy Analyzing! 🎥
