🐤💭 Tweet Sentiment Classification with GPT-2

This project fine-tunes a pre-trained transformer model, GPT-2, to classify tweet sentiments into three categories: Negative, Neutral, and Positive. The model was trained and evaluated on a labeled dataset of tweets, with the entire workflow executed in Google Colab using a Tesla T4 GPU to accelerate training and inference. The goal is to create a lightweight, accurate sentiment classifier that can be used to analyze social media content in real time.

👉 Try the live demo

🗃️ Repository Structure

gpt2-tweet-sentiment/
│
├── app/                                       # Local deployment
│   ├── main.py                                # FastAPI backend for inference
│   └── frontend.py                            # Gradio UI
│
├── data/                                      # Tweet dataset
│   ├── raw/                                   # Raw tweet data
│   └── processed/                             # Cleaned data
│
├── figures/                                   # Visualizations
│   └── gpt2-model-confusion-matrix.png        # Model confusion matrix
│
├── models/                                    # Trained GPT-2 models
│   └── gpt2-final-model/                      # Final saved model and tokenizer
│
├── notebooks/                                 # Notebooks
│   └── gpt2-finetune-tweet-sentiment.ipynb    # End-to-end pipeline
│
├── results/                                   # Model output
│   ├── metrics/                               # Evaluation results
│   │   └── gpt2-model-evaluation-metrics.txt
│   └── predictions/                           # Inference results
│       └── predictions_output.txt                       
│
├── config.py                                  # Google Drive & Colab folder setup
├── requirements.txt                           # Dependencies
└── README.md                                  # Project documentation

📘 Project Overview

Introduction – Fine-tuned GPT-2 to classify tweets into Negative, Neutral, and Positive sentiment categories.
Data Cleaning – Removed null values, duplicates, mentions, URLs, and extra whitespace for cleaner inputs.
Tokenization and Data Collation – Tokenized tweets using the GPT-2 tokenizer, with padding dynamically handled during batching.
Model Setup and Fine-Tuning – Loaded GPT2ForSequenceClassification with 3 output labels. Trained over 5 epochs using Hugging Face’s Trainer.
Training Configuration – Optimized training with batch size 8, learning rate 2e-5, and automatic model checkpointing and evaluation.
Evaluation Metrics – Used accuracy and weighted F1-score, with confusion matrix and classification report to analyze performance.
Inference Pipeline – Created a TextClassificationPipeline to predict sentiment from real tweets, along with confidence scores.
Conclusion – Delivered a robust sentiment analysis model ready for use in real-time applications like social media monitoring or customer feedback analysis.
Deployment – Deployed an interactive Gradio web demo, available both locally and on Hugging Face Spaces.
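
The data cleaning step listed above can be sketched with the standard library alone. This is a minimal illustration, not the notebook's actual code: the function names and regular expressions are assumptions, and dropping tweets that become empty after cleaning is an extra choice made here.

```python
import re

# Patterns are illustrative guesses at what "mentions" and "URLs" cover.
URL_RE = re.compile(r"https?://\S+|www\.\S+")
MENTION_RE = re.compile(r"@\w+")
WHITESPACE_RE = re.compile(r"\s+")


def clean_tweet(text: str) -> str:
    """Strip URLs and @mentions, then collapse runs of whitespace."""
    text = URL_RE.sub(" ", text)
    text = MENTION_RE.sub(" ", text)
    return WHITESPACE_RE.sub(" ", text).strip()


def clean_dataset(tweets):
    """Drop nulls and duplicates, returning cleaned tweets in original order."""
    seen = set()
    cleaned = []
    for tweet in tweets:
        if tweet is None:
            continue  # null value
        text = clean_tweet(tweet)
        if not text or text in seen:
            continue  # empty after cleaning (assumption) or duplicate
        seen.add(text)
        cleaned.append(text)
    return cleaned
```

Calling `clean_dataset` on a list of raw strings returns the deduplicated, cleaned tweets; in a real pipeline the labels would need to be filtered alongside the text.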

📊 Dataset

This project uses the MTEB Tweet Sentiment Extraction dataset, hosted on Hugging Face Datasets. It contains labeled tweets categorized into three sentiment classes: Negative (0), Neutral (1), and Positive (2).

Source: MTEB Hugging Face
Total samples: 31015 tweets
Training set: 27481 tweets
Test set: 3534 tweets
Label distribution (test set): 1001 Negative, 1430 Neutral, and 1103 Positive tweets
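
As a quick sanity check on the figures above, the short sketch below recomputes the split sizes and the class shares of the test set. The name `ID2LABEL` is just an illustrative choice; the label ids and counts come from the description above.

```python
# Label ids as described for the dataset: 0 = Negative, 1 = Neutral, 2 = Positive.
ID2LABEL = {0: "Negative", 1: "Neutral", 2: "Positive"}

TRAIN_SIZE = 27481
TEST_SIZE = 3534
TEST_COUNTS = {0: 1001, 1: 1430, 2: 1103}

# The two splits add up to the reported total, and the per-class counts
# add up to the test set size.
assert TRAIN_SIZE + TEST_SIZE == 31015
assert sum(TEST_COUNTS.values()) == TEST_SIZE

# Share of each class in the test set.
test_shares = {ID2LABEL[i]: n / TEST_SIZE for i, n in TEST_COUNTS.items()}
for name, share in test_shares.items():
    print(f"{name}: {share:.1%}")

# Accuracy of a trivial classifier that always predicts the largest class.
majority_baseline = max(TEST_COUNTS.values()) / TEST_SIZE
print(f"Majority-class baseline accuracy: {majority_baseline:.1%}")
```

Neutral is the largest class at roughly 40% of the test set, so a classifier that always answers Neutral would reach about 40% accuracy. That is a useful floor to keep in mind when reading the evaluation results.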

🤔 Why GPT-2?

GPT-2 was selected to explore its effectiveness in sequence classification tasks even though it is primarily a generative model. Fine-tuning GPT-2 for sentiment classification provided an opportunity to experiment beyond the typical encoder-only models like BERT and see how a decoder-based architecture handles this type of problem.
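
One practical consequence of using a decoder-only model for classification is that GPT-2 ships without a padding token, so one has to be set before batches can be padded. The sketch below shows one common way to do that with Hugging Face Transformers, together with the hyperparameters reported in this README. The `"gpt2"` checkpoint name, the output directory, the function names, and `save_strategy="epoch"` are assumptions, not details confirmed by the project.

```python
# Hyperparameters reported in this README.
NUM_LABELS = 3
TRAINING_CONFIG = {
    "num_train_epochs": 5,
    "per_device_train_batch_size": 8,
    "learning_rate": 2e-5,
}


def build_model_and_tokenizer(checkpoint: str = "gpt2"):
    """Load GPT-2 with a 3-way classification head and a usable pad token.

    The "gpt2" checkpoint name is an assumption; the README does not say
    which GPT-2 size was fine-tuned.
    """
    # Imported here so the constants above can be inspected without
    # installing the heavy dependencies.
    from transformers import AutoTokenizer, GPT2ForSequenceClassification

    tokenizer = AutoTokenizer.from_pretrained(checkpoint)
    # GPT-2 has no padding token by default; reuse the end-of-text token.
    tokenizer.pad_token = tokenizer.eos_token

    model = GPT2ForSequenceClassification.from_pretrained(
        checkpoint, num_labels=NUM_LABELS
    )
    # The classification head reads the last non-padding position, so the
    # model must know which token id is padding.
    model.config.pad_token_id = tokenizer.pad_token_id
    return model, tokenizer


def build_training_args(output_dir: str = "gpt2-tweet-sentiment-run"):
    """Wrap the reported hyperparameters in TrainingArguments.

    output_dir is a placeholder path, and save_strategy="epoch" is one
    plausible reading of "automatic model checkpointing".
    """
    from transformers import TrainingArguments

    return TrainingArguments(
        output_dir=output_dir,
        save_strategy="epoch",
        **TRAINING_CONFIG,
    )
```

The resulting model, tokenizer, and arguments would then be handed to `Trainer`, typically with `DataCollatorWithPadding(tokenizer=tokenizer)` so that each batch is padded only to its own longest tweet.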

⚙️ Dependencies

This project requires the following libraries:

Python
PyTorch
Transformers (Hugging Face)
Scikit-learn
FastAPI
Gradio
Requests

Install them with:

pip install -r requirements.txt

▶️ How to Run the Project

Option 1: Run Locally with GPU

Clone this repository:

git clone https://github.com/herrerovir/gpt2-tweet-sentiment

Navigate to the project directory:
```
cd gpt2-tweet-sentiment
```
Install the dependencies:
```
pip install -r requirements.txt
```
Launch Jupyter and open notebooks/gpt2-finetune-tweet-sentiment.ipynb to train and test the model:
```
jupyter notebook
```

Option 2: Run on Google Colab (Recommended if no GPU locally)

Open a new Google Colab notebook.

Clone the repository inside the notebook:

!git clone https://github.com/herrerovir/gpt2-tweet-sentiment

Navigate to the cloned folder and open the notebook gpt2-finetune-tweet-sentiment.ipynb.
Switch the runtime type to a Tesla T4 GPU for faster training.
Follow the notebook to fine-tune GPT-2 and perform inference.

📂 Model Files

The trained model files are not included in this repository due to their large size. The fine-tuned GPT-2 model is saved to your Google Drive under models/gpt2/gpt2-final-model. It includes model weights and tokenizer files for easy loading.

Additionally, the fine-tuned model is publicly hosted and available for download at the Hugging Face Model Hub: 👉 See the model in Hugging Face Hub

📊 Model Performance

After training for 5 epochs, the GPT-2 model achieved:

| Metric | Score |
| --- | --- |
| Accuracy | 79.37% |
| F1 Score | 79.34% |
| Eval Loss | 0.6867 |

Class Performance Breakdown

| Class | Precision | Recall | F1-Score |
| --- | --- | --- | --- |
| Negative | 0.81 | 0.78 | 0.79 |
| Neutral | 0.76 | 0.76 | 0.76 |
| Positive | 0.82 | 0.85 | 0.84 |

Per-class F1 ranges from 0.76 (Neutral) to 0.84 (Positive), so performance is reasonably balanced across the three sentiment classes, with Neutral tweets the hardest to classify.
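
The overall scores and the per-class breakdown can be cross-checked against each other. Weighted F1 is the mean of the per-class F1 scores weighted by class size, and accuracy equals the same weighted mean taken over per-class recall. The sketch below assumes the per-class figures were computed on the 3534-tweet test set described in the Dataset section; the helper name is illustrative.

```python
# Per-class figures from the table above, plus the test-set class counts
# from the Dataset section.
SUPPORT = {"Negative": 1001, "Neutral": 1430, "Positive": 1103}
RECALL = {"Negative": 0.78, "Neutral": 0.76, "Positive": 0.85}
F1 = {"Negative": 0.79, "Neutral": 0.76, "Positive": 0.84}


def support_weighted_mean(per_class: dict, support: dict) -> float:
    """Average per-class scores, weighting each class by its sample count."""
    total = sum(support.values())
    return sum(per_class[name] * support[name] for name in support) / total


# Weighted F1 is the support-weighted mean of the per-class F1 scores.
weighted_f1 = support_weighted_mean(F1, SUPPORT)

# Accuracy equals the support-weighted mean of the per-class recalls,
# because recall * support is the number of correct predictions per class.
accuracy = support_weighted_mean(RECALL, SUPPORT)

print(f"weighted F1 ~ {weighted_f1:.4f}")
print(f"accuracy    ~ {accuracy:.4f}")
```

Both values agree with the reported 79.34% and 79.37% to within the rounding of the two-decimal table entries, which suggests the two tables describe the same evaluation run.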

🔮 Inference Examples

Sample predictions from the fine-tuned model, with confidence scores:

Input: "The food was hot and delicious." Prediction: Positive (Confidence: 99.93%)
Input: "Ugh, my flight got delayed again." Prediction: Negative (Confidence: 99.95%)
Input: "Heading to the grocery store, then back to work." Prediction: Neutral (Confidence: 99.57%)
Input: "Lost all my work because of a crash. Fantastic." Prediction: Positive (Confidence: 52.06%) ⚠️ (sarcasm not detected)

These highlight both the strengths and limitations of the model, especially when sarcasm is involved.
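
The sarcastic example stands out because its confidence is barely above 50%. Assuming the reported confidence is the softmax probability of the winning class (the usual behaviour for single-label classification pipelines), low-confidence predictions can be flagged for review. The sketch below uses made-up logits and a hypothetical threshold; neither comes from the project.

```python
import math

ID2LABEL = {0: "Negative", 1: "Neutral", 2: "Positive"}


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def interpret(logits, min_confidence=0.60):
    """Return (label, confidence, needs_review) for one tweet's logits.

    min_confidence is a hypothetical threshold, not a value from the project.
    """
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    confidence = probs[best]
    return ID2LABEL[best], confidence, confidence < min_confidence


# Made-up logits for illustration: one clear-cut case and one near tie.
print(interpret([-4.0, -3.0, 4.5]))
print(interpret([0.9, -2.0, 1.0]))
```

A threshold like this does not detect sarcasm, but it gives downstream code a cheap way to route uncertain predictions to a human or a second model.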

📋 Results

The GPT-2 model proves effective for sentiment classification on social media text. With nearly 80% accuracy and F1 score, and consistent per-class performance, it's a strong baseline for real-world applications. It performs especially well on clearly positive or negative tweets, but can be improved to better detect sarcasm or subtle tones.

🌐 Deployment Options

You can interact with the tweet sentiment classifier via a web interface using either local deployment or a cloud-hosted app on Hugging Face Spaces.

Option 1: Run Locally with FastAPI + Gradio

Install dependencies

From the root directory of the repository, run:

pip install -r requirements.txt

Start the FastAPI backend

Open a terminal, navigate to the app folder, and run the FastAPI app:

cd app
uvicorn main:app --reload

This launches the backend server at: http://127.0.0.1:8000
The FastAPI backend serves model inference endpoints.

Start the Gradio frontend

Open a new terminal window, navigate to the app directory, and run:

python frontend.py

This launches the Gradio UI on: http://localhost:7860
The frontend calls the FastAPI backend for predictions.

Use the web app:

Open your browser and go to:

http://localhost:7860

You’ll see the interactive web app where you can enter tweets and receive sentiment predictions instantly.
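
The frontend-to-backend call described above can also be made directly from a script. The sketch below is hedged throughout: the `/predict` route and the `{"text": ...}` request body are hypothetical, since this README does not document the API, and it uses the standard library's `urllib` rather than the Requests package listed in the dependencies, to stay self-contained. Check `app/main.py` for the real route and schema.

```python
import json
import urllib.request

BACKEND_URL = "http://127.0.0.1:8000"  # address the FastAPI backend listens on
PREDICT_PATH = "/predict"              # hypothetical route name


def build_request(tweet: str) -> urllib.request.Request:
    """Build a JSON POST request carrying one tweet."""
    payload = json.dumps({"text": tweet}).encode("utf-8")  # hypothetical schema
    return urllib.request.Request(
        BACKEND_URL + PREDICT_PATH,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def predict(tweet: str, timeout: float = 10.0) -> dict:
    """Send the tweet to the running backend and return the decoded JSON reply."""
    with urllib.request.urlopen(build_request(tweet), timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the backend running, `predict("The food was hot and delicious.")` would return whatever JSON the inference endpoint produces.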

Option 2: Try It on Hugging Face Spaces (No Setup Required)

You can also test the model live in your browser via the Hugging Face Space:

👉 Try the Live Demo on Hugging Face Spaces

No installation or GPU required, just open the link and start analyzing tweet sentiments instantly.

🙌 Acknowledgments

Built with Hugging Face Transformers, PyTorch, and Scikit-learn. Trained using free GPU resources via Google Colab.
