Welcome to Movie Review Sentiment Analysis, a web application that predicts the sentiment (positive or negative) of movie reviews using a deep learning model. Built with Streamlit and TensorFlow, this app leverages an LSTM model trained on the IMDB dataset to provide accurate sentiment predictions with confidence scores. Deployed on Streamlit Cloud, it's user-friendly and perfect for movie enthusiasts and developers alike! 🎥
- Sentiment Prediction: Analyze any movie review and get instant sentiment predictions (Positive/Negative) with confidence scores (see the sketch below this list).
- Interactive UI: Navigate between Home and Prediction pages with a sleek sidebar menu.
- Example Reviews: Test the model with pre-loaded example reviews in the sidebar.
- Analysis History: View and download your analysis history as a Markdown file.
- Responsive Design: Styled with custom CSS for a modern, professional look.
- Cloud Deployment: Hosted on Streamlit Cloud for easy access.
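
Under the hood, the confidence score is presumably derived from the model's sigmoid output. Here is a minimal sketch of that mapping; the function name and the 0.5 threshold are illustrative, not taken from `app.py`:

```python
# Minimal sketch: assumes the LSTM ends in a single sigmoid unit, so the model
# returns one probability p in [0, 1] for the "positive" class.
def label_and_confidence(p: float):
    """Map a sigmoid probability to a sentiment label and a confidence score."""
    if p >= 0.5:
        return "Positive", p        # confidence = P(positive)
    return "Negative", 1.0 - p      # confidence = P(negative)
```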
- Python 3.8 or higher
- Git
- Streamlit Cloud account (for deployment)
- Jupyter Notebook (to run the training script)
- Clone the Repository:

  ```bash
  git clone https://github.com/rohanmistry231/sentiment-analysis-app.git
  cd sentiment-analysis-app
  ```

- Install Dependencies: Create a virtual environment and install the required packages:

  ```bash
  python -m venv venv
  source venv/bin/activate  # On Windows: venv\Scripts\activate
  pip install -r requirements.txt
  ```
- Prepare Model Files: The app requires `model.h5` and `tokenizer.pkl` to function. These files are generated by training the model using the provided Jupyter Notebook:
  - Open `IMDB_reviews_Sentiment_Analysis_LSTM.ipynb` in Jupyter Notebook.
  - Run all cells to train the LSTM model on the `IMDB_Dataset.csv` dataset.
  - The notebook will save `model.h5` (the trained model) and `tokenizer.pkl` (the tokenizer) in the project root.

  Note: If you already have pre-trained `model.h5` and `tokenizer.pkl` files, you can skip this step.
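
For reference, a minimal sketch of how `app.py` might load these two files; the caching decorator and function name are assumptions, so check `app.py` for the actual code:

```python
import pickle

import streamlit as st
from tensorflow.keras.models import load_model

@st.cache_resource  # cache so the files are loaded only once per app instance
def load_artifacts():
    model = load_model("model.h5")          # trained LSTM saved by the notebook
    with open("tokenizer.pkl", "rb") as f:  # Keras tokenizer fitted on the IMDB reviews
        tokenizer = pickle.load(f)
    return model, tokenizer

model, tokenizer = load_artifacts()
```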
- Run the App Locally:

  ```bash
  streamlit run app.py
  ```

  Open your browser and go to `http://localhost:8501`.
- Push to GitHub:
  - Create a new repository on GitHub.
  - Push your project files:

    ```bash
    git add .
    git commit -m "Initial commit"
    git push origin main
    ```
- Set Up Streamlit Cloud:
  - Log in to Streamlit Cloud.
  - Click "New App" and connect your GitHub repository.
  - Select the branch (e.g., `main`) and specify `app.py` as the main file.
  - Click "Deploy".
- Verify Deployment:
  - Once deployed, Streamlit Cloud will provide a URL (e.g., `https://sentiment-analysis-app.streamlit.app`).
  - Open the URL to test the app.
Note: If `model.h5` is too large for GitHub, host it on Google Drive or AWS S3 and modify `app.py` to download it at runtime.
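
One hedged way to implement that runtime download (the URL below is a placeholder, not part of this repo; Google Drive links may need a direct-download URL or a helper such as `gdown`):

```python
import os
import urllib.request

MODEL_PATH = "model.h5"
MODEL_URL = "https://example.com/path/to/model.h5"  # placeholder: your S3/Drive link

def ensure_model():
    """Download the model file on first run if it is not bundled with the deployment."""
    if not os.path.exists(MODEL_PATH):
        urllib.request.urlretrieve(MODEL_URL, MODEL_PATH)

ensure_model()
```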
- Navigate to the Home Page:
  - Learn about the app's features and how to use it.
- Go to the Prediction Page:
  - Use the sidebar to select "Prediction".
  - Enter a movie review in the text area.
  - Click "Analyze Sentiment" to get the prediction.
- Try Example Reviews:
  - In the sidebar, click "Example 1", "Example 2", or "Example 3" to test pre-loaded reviews.
- View and Download History:
  - Expand the "View Analysis History" section to see past analyses.
  - Click "Download History" to save your analysis history as a Markdown file.
```text
sentiment-analysis-app/
├── app.py                                      # Main Streamlit app
├── model.h5                                    # Trained LSTM model
├── tokenizer.pkl                               # Tokenizer for text preprocessing
├── IMDB_Dataset.csv                            # IMDB dataset for training
├── IMDB_reviews_Sentiment_Analysis_LSTM.ipynb  # Jupyter Notebook for model training
├── requirements.txt                            # Dependencies
└── README.md                                   # Project documentation
```
Listed in `requirements.txt`:

```text
streamlit==1.39.0
tensorflow==2.17.0
numpy==1.26.4
streamlit-option-menu==0.4.1
```
For training the model (in the Jupyter Notebook), additional dependencies are required beyond `requirements.txt`:

- `pandas`
- `scikit-learn`
- `kaggle` (for downloading the dataset)
- `jupyter`

Install the app dependencies first:

```bash
pip install -r requirements.txt
```

Then install the notebook's additional dependencies:

```bash
pip install pandas scikit-learn kaggle jupyter
```
The dataset used for training the model is `IMDB_Dataset.csv`, which contains 50,000 movie reviews from the IMDB dataset. Each review is labeled as either "positive" or "negative". The dataset is balanced, with 25,000 reviews per class.

- Source: Originally sourced via the Kaggle API (as shown in the notebook).
- Columns:
  - `review`: The text of the movie review.
  - `sentiment`: The label (`positive` or `negative`).
- Usage: The dataset is used in `IMDB_reviews_Sentiment_Analysis_LSTM.ipynb` to train the LSTM model.

If you need to re-download the dataset, the notebook includes instructions for using the Kaggle API. You'll need a Kaggle account and API token (`kaggle.json`).
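
As a quick illustration of how the notebook presumably loads and encodes this dataset (exact variable names may differ):

```python
import pandas as pd

df = pd.read_csv("IMDB_Dataset.csv")           # columns: review, sentiment
print(df["sentiment"].value_counts())          # expect 25,000 positive / 25,000 negative

# Convert the string labels to binary targets for training (0 = negative, 1 = positive).
df["label"] = (df["sentiment"] == "positive").astype(int)
```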
- Dataset: IMDB dataset of 50k movie reviews (`IMDB_Dataset.csv`).
- Model Architecture: LSTM with an Embedding layer, trained using TensorFlow.
- Training: 5 epochs, batch size of 64, validation split of 0.2.
- Preprocessing: Reviews are tokenized with a 5,000-word vocabulary and padded to a maximum length of 200.

The model and tokenizer are saved as `model.h5` and `tokenizer.pkl`, respectively.
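
Putting those hyperparameters together, here is a minimal training sketch consistent with the description above. It approximates the notebook rather than reproducing it: the embedding and LSTM sizes are illustrative, and newer Keras releases may prefer `keras.utils.pad_sequences`.

```python
import pickle

import pandas as pd
from tensorflow.keras.layers import Dense, Embedding, LSTM
from tensorflow.keras.models import Sequential
from tensorflow.keras.preprocessing.sequence import pad_sequences
from tensorflow.keras.preprocessing.text import Tokenizer

VOCAB_SIZE, MAX_LEN = 5000, 200

df = pd.read_csv("IMDB_Dataset.csv")
texts = df["review"].tolist()
labels = (df["sentiment"] == "positive").astype(int).values

# Fit the tokenizer on the reviews and turn each review into a padded sequence of word ids.
tokenizer = Tokenizer(num_words=VOCAB_SIZE)
tokenizer.fit_on_texts(texts)
X = pad_sequences(tokenizer.texts_to_sequences(texts), maxlen=MAX_LEN)

# Embedding -> LSTM -> sigmoid output, matching the architecture described above.
model = Sequential([
    Embedding(VOCAB_SIZE, 128),        # 128-dimensional embeddings (size is illustrative)
    LSTM(128),                         # LSTM width is illustrative
    Dense(1, activation="sigmoid"),    # binary sentiment probability
])
model.compile(loss="binary_crossentropy", optimizer="adam", metrics=["accuracy"])
model.fit(X, labels, epochs=5, batch_size=64, validation_split=0.2)

# Save the artifacts the Streamlit app expects.
model.save("model.h5")
with open("tokenizer.pkl", "wb") as f:
    pickle.dump(tokenizer, f)
```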
The `IMDB_reviews_Sentiment_Analysis_LSTM.ipynb` notebook contains the complete workflow for training the sentiment analysis model. It includes:

- Data Collection: Downloads the IMDB dataset using the Kaggle API.
- Preprocessing: Tokenizes and pads the reviews, and converts labels to binary (0 for negative, 1 for positive).
- Model Building: Defines an LSTM model with an Embedding layer and a Dense output layer.
- Training: Trains the model for 5 epochs with validation.
- Evaluation: Evaluates the model on a test set.
- Saving: Saves the trained model as `model.h5` and the tokenizer as `tokenizer.pkl`.
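
Continuing the training sketch above, the evaluation step might hold out a test split before training; the 80/20 ratio and random seed are assumptions, and the notebook may split differently:

```python
from sklearn.model_selection import train_test_split

# X, labels, and model come from the training sketch above.
X_train, X_test, y_train, y_test = train_test_split(X, labels, test_size=0.2, random_state=42)

model.fit(X_train, y_train, epochs=5, batch_size=64, validation_split=0.2)
test_loss, test_accuracy = model.evaluate(X_test, y_test)
print(f"Test accuracy: {test_accuracy:.3f}")
```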
- Install Jupyter Notebook and dependencies:

  ```bash
  pip install jupyter pandas scikit-learn kaggle tensorflow
  ```

- Launch Jupyter Notebook:

  ```bash
  jupyter notebook
  ```

- Open `IMDB_reviews_Sentiment_Analysis_LSTM.ipynb` and run all cells.
- Ensure `kaggle.json` is set up for downloading the dataset (instructions in the notebook).
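
If you prefer to script the download instead of following the notebook cells, here is a hedged sketch using the official `kaggle` Python client. The dataset slug is the widely used 50k-review IMDB dataset on Kaggle and may not be the exact one referenced in the notebook; `~/.kaggle/kaggle.json` must be in place:

```python
from kaggle.api.kaggle_api_extended import KaggleApi

api = KaggleApi()
api.authenticate()  # reads ~/.kaggle/kaggle.json

# Dataset slug is an assumption: a commonly used 50k-review IMDB dataset on Kaggle.
api.dataset_download_files(
    "lakshmi25npathi/imdb-dataset-of-50k-movie-reviews",
    path=".",
    unzip=True,
)
```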
This project is licensed under the MIT License. See the LICENSE file for details.
Contributions are welcome! Follow these steps:
- Fork the repository.
- Create a new branch (`git checkout -b feature/your-feature`).
- Make your changes and commit (`git commit -m "Add your feature"`).
- Push to the branch (`git push origin feature/your-feature`).
- Open a Pull Request.
For questions or feedback, reach out to [email protected] or open an issue on GitHub.
Happy Analyzing! 🎥