A machine learning project focused on detecting toxic vs non-toxic tweets using custom NLP models.
The goal of this project is not only classification performance, but also understanding model behavior under severe class imbalance and making data-driven decisions with appropriate evaluation metrics.
Online toxicity detection is a challenging NLP task due to:
- informal language
- sarcasm and context dependence
- severe class imbalance
This project evaluates multiple baseline models and emphasizes minority-class performance rather than misleading aggregate accuracy.
- Total tweets: 31,962 real-world Twitter posts
- Non-toxic (0): 29,720 (~93%)
- Toxic (1): 2,242 (~7%)
The dataset is highly imbalanced, making accuracy an unreliable metric for model selection.
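To illustrate why, a trivial baseline that labels every tweet as non-toxic already reaches roughly 93% accuracy while catching zero toxic tweets. A minimal sketch (labels constructed here from the class counts above, purely for illustration):

```python
import numpy as np
from sklearn.metrics import accuracy_score, recall_score

# Illustrative labels matching the dataset counts: 0 = non-toxic, 1 = toxic
y = np.array([0] * 29_720 + [1] * 2_242)

# A "model" that always predicts the majority (non-toxic) class
y_pred = np.zeros_like(y)

print(f"Accuracy:     {accuracy_score(y, y_pred):.3f}")                  # ~0.930
print(f"Toxic recall: {recall_score(y, y_pred, pos_label=1):.3f}")       # 0.000
```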
The following classical machine learning models were trained and compared:
- Logistic Regression
- Naive Bayes
- Support Vector Machine (SVM)
All models were trained using sparse text representations (TF-IDF).
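A minimal sketch of this setup, assuming the tweets and labels have already been loaded as `texts` (list of strings) and `labels` (list of 0/1 targets); variable names and hyperparameters are illustrative, not taken from the repository:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB
from sklearn.svm import LinearSVC

# texts: list[str] of tweets, labels: list[int] of 0/1 targets (assumed loaded)
X_train, X_test, y_train, y_test = train_test_split(
    texts, labels, test_size=0.2, stratify=labels, random_state=42
)

# Sparse TF-IDF representation of the tweets
vectorizer = TfidfVectorizer(max_features=50_000, ngram_range=(1, 2))
X_train_tfidf = vectorizer.fit_transform(X_train)
X_test_tfidf = vectorizer.transform(X_test)

models = {
    "Logistic Regression": LogisticRegression(max_iter=1000, class_weight="balanced"),
    "Naive Bayes": MultinomialNB(),
    "SVM": LinearSVC(class_weight="balanced"),
}

for name, model in models.items():
    model.fit(X_train_tfidf, y_train)
    print(name, model.score(X_test_tfidf, y_test))
```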
Due to class imbalance, evaluation focused on:
- Confusion matrices
- Class-wise precision, recall, and F1-score
- Minority-class (toxic) recall and F1
- Threshold tuning using predicted probabilities (Logistic Regression)
Accuracy was reported but not used for model selection.
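A sketch of the per-class evaluation, continuing the illustrative names from the training snippet above:

```python
from sklearn.metrics import classification_report, confusion_matrix

y_pred = models["Logistic Regression"].predict(X_test_tfidf)

# Rows = true class, columns = predicted class
print(confusion_matrix(y_test, y_pred))

# Per-class precision, recall, and F1; the toxic row is the one that matters here
print(classification_report(y_test, y_pred, target_names=["non-toxic", "toxic"]))
```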
Logistic Regression and SVM performed best overall due to:
- usable probability outputs (Logistic Regression) and margin-based decision scores (SVM) for ranking predictions
- stable behavior on sparse, high-dimensional text features
- flexibility in threshold tuning under class imbalance (sketched below)
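A sketch of the threshold-tuning step, again reusing the illustrative names from the snippets above; the default 0.5 cutoff is replaced by a swept threshold that trades precision for toxic-class recall:

```python
import numpy as np
from sklearn.metrics import f1_score, precision_score, recall_score

# Probability of the toxic class (column 1 of predict_proba)
probs = models["Logistic Regression"].predict_proba(X_test_tfidf)[:, 1]

for threshold in np.arange(0.1, 0.9, 0.1):
    y_pred = (probs >= threshold).astype(int)
    print(
        f"threshold={threshold:.1f}  "
        f"precision={precision_score(y_test, y_pred, zero_division=0):.3f}  "
        f"recall={recall_score(y_test, y_pred):.3f}  "
        f"f1={f1_score(y_test, y_pred, zero_division=0):.3f}"
    )
```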
Naive Bayes underperformed due to its conditional independence assumption, which is poorly suited to contextual toxicity detection.
- Accuracy may be misleading for imbalanced classification tasks
- Simple, well-tuned models can perform competitively for NLP classification tasks
- Model selection should be driven by problem-specific costs, not single metrics
- Python
- scikit-learn
- NumPy
- Pandas
- matplotlib
- seaborn
To better understand linguistic patterns in toxic content, spaCy NER was applied to tweets labeled as toxic.
Entity extraction revealed that the most frequent entity types in toxic tweets were:
- PERSON
- ORG
- CARDINAL
- MONEY
This suggests that toxic language frequently targets individuals and organizations, and often references numerical or monetary contexts, which may correlate with harassment, threats, or disputes.
NER analysis was used for exploratory insight and interpretability, not for classification features.
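A sketch of this exploratory step, assuming the toxic-labeled tweets are available as a list of strings `toxic_tweets` (an illustrative name):

```python
from collections import Counter

import spacy

nlp = spacy.load("en_core_web_sm")

# Count entity labels (PERSON, ORG, CARDINAL, ...) across toxic-labeled tweets
entity_counts = Counter()
for doc in nlp.pipe(toxic_tweets, batch_size=256):
    entity_counts.update(ent.label_ for ent in doc.ents)

print(entity_counts.most_common(10))
```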