title: NLPDisasterTweets
emoji: 🚨
colorFrom: red
colorTo: gray
sdk: docker
app_port: 8501
tags: streamlit
pinned: false
short_description: App to classify tweets as real disasters (1) or non-disaster
license: mit

🚨 NLP Disaster Tweet Classifier: TF-IDF + Logistic Regression

A production-ready NLP text classification project that detects whether a tweet describes a real disaster event or not, deployed as an interactive Streamlit app.

Built as a fast, explainable classical ML baseline using TF-IDF and Logistic Regression with threshold tuning. This app classifies tweets as Disaster (1) or Non-Disaster (0).

🔗 Live Demo & Code

🤗 HuggingFace Space: [https://huggingface.co/spaces/EnYa32/NLPDisasterTweets]

💻 GitHub Repository: [https://github.com/EnYa32/NLPDisasterTweets]

📓 Kaggle Notebook: [https://www.kaggle.com/code/enesyama/nlp-disastertweets]

What this project does

Many tweets contain disaster-related keywords (fire, flood, explosion) but use them in a metaphorical or casual context rather than describing a real disaster.
This project trains a classical NLP model to tell the two cases apart.

Goal: Build a model that distinguishes:

  • 1 = real disaster event
  • 0 = non-disaster tweet

Model

  • Text features: TF-IDF (unigrams + bigrams)
  • Classifier: Logistic Regression
  • Metric: F1-score
  • Threshold tuning: optimized on validation set (threshold = 0.43)
  • Best validation F1-score: ~0.78
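The threshold tuning step can be illustrated with a minimal sketch. It assumes the validation labels and the predicted probabilities for class 1 are already available as plain lists; the function names, the candidate grid (0.05 to 0.95 in steps of 0.01), and the toy numbers are illustrative assumptions, not taken from the project's notebook.

```python
from typing import Optional, Sequence


def f1_at_threshold(y_true: Sequence[int], proba: Sequence[float], threshold: float) -> float:
    """F1 for class 1 when a tweet is labelled 1 iff proba >= threshold."""
    tp = fp = fn = 0
    for label, p in zip(y_true, proba):
        pred = 1 if p >= threshold else 0
        if pred == 1 and label == 1:
            tp += 1
        elif pred == 1 and label == 0:
            fp += 1
        elif pred == 0 and label == 1:
            fn += 1
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def best_threshold(
    y_true: Sequence[int],
    proba: Sequence[float],
    candidates: Optional[Sequence[float]] = None,
) -> float:
    """Return the candidate threshold with the highest validation F1.

    The default grid is an assumption made for this sketch.
    On ties, the lowest threshold in the grid wins.
    """
    if candidates is None:
        candidates = [i / 100 for i in range(5, 96)]
    return max(candidates, key=lambda t: f1_at_threshold(y_true, proba, t))


if __name__ == "__main__":
    # Made-up validation data, for illustration only.
    y_val = [1, 1, 1, 0, 0, 0]
    p_val = [0.90, 0.60, 0.45, 0.40, 0.30, 0.10]
    t = best_threshold(y_val, p_val)
    print(t, f1_at_threshold(y_val, p_val, t))
```

With a scikit-learn classifier, the probabilities would typically come from `predict_proba(X_val)[:, 1]`. A tuned value below 0.5, such as the 0.43 reported above, trades some precision for recall on the disaster class.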

📊 EDA Visualizations

Figures: Class Balance, Text Length, Top Keywords

NLP Preprocessing Pipeline

Applied consistently in training and inference:

  • lowercase
  • punctuation removal
  • whitespace normalization
  • keyword + location enrichment
  • TF-IDF transform
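The text-cleaning steps above can be sketched as follows. This is a minimal illustration under stated assumptions: the README does not say how keyword and location are combined with the tweet, so prepending them to the text is a guess, and `clean_text` and `build_input` are hypothetical names. The TF-IDF transform itself is left to the trained pipeline.

```python
import re
import string
from typing import Optional

# Translation table that deletes every ASCII punctuation character.
_PUNCT_TABLE = str.maketrans("", "", string.punctuation)


def clean_text(text: str) -> str:
    """Lowercase, strip punctuation, and collapse runs of whitespace."""
    text = text.lower()
    text = text.translate(_PUNCT_TABLE)
    return re.sub(r"\s+", " ", text).strip()


def build_input(text: str, keyword: Optional[str] = None, location: Optional[str] = None) -> str:
    """Combine keyword, location, and tweet text into one cleaned string.

    Assumption: enrichment means prepending the two metadata fields to the
    tweet; missing fields are treated as empty strings.
    """
    parts = [keyword or "", location or "", text or ""]
    return clean_text(" ".join(parts))


if __name__ == "__main__":
    print(build_input("Evacuation ordered NOW!", "wildfire", "California, USA"))
```

Calling the same function from both the training script and `app.py` is one way to keep preprocessing consistent between training and inference.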

Files in this repo

Place these files in the repository root:

  • app.py
  • final_clf.pkl (trained pipeline: TF-IDF + Logistic Regression)
  • threshold.pkl (float threshold, e.g. 0.43)
  • requirements.txt
  • README.md
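As a rough sketch of how the two `.pkl` artifacts might be produced and consumed, the example below fits a TF-IDF + Logistic Regression pipeline on a few made-up sentences, saves it with a threshold, and reloads both for prediction. Everything here is an assumption for illustration: the toy data, the helper names, the hyperparameters other than the unigram + bigram range, and the use of `pickle` (the project may use `joblib` or different settings).

```python
import pickle
import tempfile
from pathlib import Path

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline


def train_toy_pipeline() -> Pipeline:
    """Fit TF-IDF (unigrams + bigrams) + Logistic Regression on made-up data."""
    texts = [
        "forest fire spreading near the town residents evacuated",
        "flood waters rising homes destroyed rescue teams deployed",
        "explosion at the factory several people injured",
        "this new album is fire",
        "my inbox is a flood of emails today",
        "that party was an explosion of fun",
    ]
    labels = [1, 1, 1, 0, 0, 0]
    model = Pipeline([
        ("tfidf", TfidfVectorizer(ngram_range=(1, 2))),
        ("clf", LogisticRegression(max_iter=1000)),
    ])
    return model.fit(texts, labels)


def save_artifacts(model, threshold: float, out_dir) -> None:
    """Write final_clf.pkl and threshold.pkl into out_dir."""
    out = Path(out_dir)
    with open(out / "final_clf.pkl", "wb") as f:
        pickle.dump(model, f)
    with open(out / "threshold.pkl", "wb") as f:
        pickle.dump(float(threshold), f)


def load_artifacts(in_dir):
    """Read the pipeline and the float threshold back from in_dir."""
    src = Path(in_dir)
    with open(src / "final_clf.pkl", "rb") as f:
        model = pickle.load(f)
    with open(src / "threshold.pkl", "rb") as f:
        threshold = pickle.load(f)
    return model, threshold


def classify(model, threshold: float, text: str) -> int:
    """Return 1 (disaster) if P(class 1) >= threshold, else 0."""
    proba = model.predict_proba([text])[0, 1]
    return int(proba >= threshold)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        save_artifacts(train_toy_pipeline(), 0.43, tmp)
        clf, thr = load_artifacts(tmp)
        print(classify(clf, thr, "wildfire forces residents to evacuate"))
```

Because the vectorizer and classifier live in one pipeline object, the app only needs to load two files and apply the stored threshold to `predict_proba`, rather than relying on the default 0.5 cut-off of `predict`.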

▶️ Run Locally

pip install -r requirements.txt
streamlit run app.py

Notes

This is a classical ML baseline: fast to train and serve, with a validation F1-score of about 0.78.

The same artifacts can be reused in any deployment setting.