📧 Spam Email Detection using NLP & Machine Learning

This notebook-based project demonstrates how to detect spam emails using Natural Language Processing and classic Machine Learning algorithms.

💡 Built entirely in a Jupyter Notebook using email.csv as the dataset.

🔍 What’s Inside

Exploratory Data Analysis (EDA) of spam vs. ham messages
Data cleaning and duplicate removal
Feature engineering: character count, word count, sentence count
Text preprocessing using NLTK (tokenization, stopword removal, stemming)
Label encoding (Spam = 0, Ham = 1)
Vectorization using CountVectorizer and TF-IDF
Model training with:
- Logistic Regression
- Support Vector Machine (SVM)
- Random Forest
- Decision Tree
- Naive Bayes
- AdaBoost, Bagging, Gradient Boosting
Model comparison using accuracy, classification report, and confusion matrix
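
As a rough illustration of the vectorize, train, and evaluate steps listed above, here is a minimal sketch rather than the notebook's actual code. It makes several assumptions. The messages are invented toy examples, not rows from email.csv. Stop words are removed with scikit-learn's built-in English list instead of the NLTK steps, and stemming is skipped, so the sketch needs no downloads. Naive Bayes stands in for the full set of models. The `LABEL_MAP` name is hypothetical; its values follow the encoding stated above (Spam = 0, Ham = 1).

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Toy messages invented for illustration only; they are not taken from email.csv.
train_texts = [
    "Win a free prize now, click here",
    "Free cash offer, claim your prize today",
    "Urgent: claim your free reward now",
    "Are we still meeting for lunch tomorrow?",
    "Please review the attached project report",
    "Can you call me after the meeting?",
]
train_labels_raw = ["spam", "spam", "spam", "ham", "ham", "ham"]

# Hypothetical helper mapping, using the encoding stated in this README.
LABEL_MAP = {"spam": 0, "ham": 1}
y_train = [LABEL_MAP[label] for label in train_labels_raw]

# TF-IDF vectorization followed by a Naive Bayes classifier.
# lowercase/stop_words here are a stand-in for the NLTK preprocessing.
model = make_pipeline(
    TfidfVectorizer(lowercase=True, stop_words="english"),
    MultinomialNB(),
)
model.fit(train_texts, y_train)

# Evaluate on two more invented messages.
test_texts = ["Claim your free prize now", "See you at the meeting tomorrow"]
y_test = [LABEL_MAP["spam"], LABEL_MAP["ham"]]
y_pred = model.predict(test_texts)

accuracy = accuracy_score(y_test, y_pred)
# Rows are true labels, columns are predicted labels, ordered [spam, ham].
matrix = confusion_matrix(y_test, y_pred, labels=[0, 1])
print("accuracy:", accuracy)
print(matrix)
```

Swapping `TfidfVectorizer` for `CountVectorizer`, or `MultinomialNB` for another scikit-learn classifier, keeps the same pipeline shape, which is one convenient way to compare models on equal footing.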

🛠 Libraries Used

Python 3.x
pandas, numpy
nltk
matplotlib, seaborn
scikit-learn

🧪 How to Run

Open the notebook in Jupyter or Google Colab
Make sure email.csv is present in the same directory
Run the notebook cells step-by-step
You'll see preprocessing, training, and evaluation all inside one file
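
Before running the cells, it can help to confirm that the listed libraries are installed and that email.csv sits next to the notebook. The following is an optional pre-flight sketch, not part of the notebook. The function name `find_missing` and the `REQUIRED_MODULES` list are hypothetical helpers, and the module names are the usual import names for the libraries listed above.

```python
from importlib.util import find_spec
from pathlib import Path

# Import names for the libraries listed in this README.
# Note that scikit-learn is imported as "sklearn".
REQUIRED_MODULES = ["pandas", "numpy", "nltk", "matplotlib", "seaborn", "sklearn"]


def find_missing(modules=REQUIRED_MODULES, data_file="email.csv"):
    """Return (list of modules that cannot be found, whether data_file exists)."""
    # find_spec looks a module up without importing it.
    missing = [name for name in modules if find_spec(name) is None]
    return missing, Path(data_file).is_file()


missing_modules, has_data = find_missing()
if missing_modules:
    print("Missing libraries:", ", ".join(missing_modules))
if not has_data:
    print("email.csv was not found in the current directory")
if not missing_modules and has_data:
    print("Environment looks ready")
```

In Google Colab the working directory starts empty, so the dataset check is likely to fail until email.csv has been uploaded to the session.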

📁 Files

Spam_Email_Detection/
├── Spam_Email_Detection.ipynb
└── email.csv

🔑 Keywords

NLP · Spam Classification · Email Filtering · TF-IDF · CountVectorizer · Scikit-learn · NLTK · Logistic Regression · Random Forest · Text Preprocessing · Model Evaluation · Python

📌 Note

All processing and training steps are performed inside the Jupyter Notebook itself; no external scripts are required, only the libraries listed above.
