Sai2002Praneeth/cyberbullying-analysis

Cyberbullying Detection - Analysis and Model Training 🔬

This repository contains the Jupyter Notebook detailing the data analysis, preprocessing, and model training process for the Cyberbullying Detection project.

Live Application Demo 🚀

A deployed version of this project runs on Hugging Face Spaces.

➡️ Try the Live App Here!


Overview ℹ️

This repository documents the steps taken to build the machine learning model that powers the final web application. The primary focus is the development process, from raw data to a trained and evaluated classifier.


The Jupyter Notebook 📓

The main file in this repository is Cyberbullying.ipynb. This notebook includes:

  • Data Loading and Cleaning: Importing the dataset of over 40,000 comments and performing initial preprocessing.
  • Exploratory Data Analysis (EDA): Visualizing the distribution of the data.
  • Text Preprocessing: Detailed steps for cleaning the text, including tokenization, stop word removal, and lemmatization using NLTK.
  • Feature Extraction: Using the TF-IDF (Term Frequency-Inverse Document Frequency) method to convert text into numerical features.
  • Model Training and Evaluation: Training and comparing multiple classifiers to select the best one based on performance metrics.
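The preprocessing and feature extraction steps listed above can be sketched with the standard library alone. This is an illustration under stated assumptions, not the notebook's code: the stop word set, the `preprocess` and `tf_idf` helpers, and the sample comments are all hypothetical, lemmatization is omitted because it needs NLTK's data files, and the unsmoothed `ln(N / df)` form of IDF used here may differ from the variant a library implementation applies.

```python
import math
import re
from collections import Counter

# Tiny illustrative stop word set (an assumption); NLTK's English list is much longer.
STOP_WORDS = {"a", "an", "the", "is", "are", "you", "to", "and", "of"}


def preprocess(text):
    """Lowercase the text, keep alphabetic tokens, and drop stop words."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]


def tf_idf(docs):
    """Return one {term: weight} dict per document.

    tf  = term count / number of tokens in the document
    idf = ln(N / df), where df is the number of documents containing the term
    """
    tokenized = [preprocess(d) for d in docs]
    n_docs = len(tokenized)
    # Count each term once per document to get document frequencies.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens) or 1  # guard against empty documents
        vectors.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return vectors


# Hypothetical sample comments, not taken from the dataset.
comments = [
    "You are a loser and an idiot",
    "The game is great",
    "You are great",
]
vectors = tf_idf(comments)
for comment, vector in zip(comments, vectors):
    print(comment, "->", vector)
```

In the second comment, "game" receives a higher weight than "great" because "great" also appears in the third comment, which is the behavior TF-IDF is designed to produce: terms shared across many documents count for less.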

Final Model 🤖

The final model is a Stochastic Gradient Descent (SGD) classifier, which achieved 87% accuracy on the test set.
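To make the idea of an SGD-trained classifier concrete, here is a minimal standard-library sketch of a linear model fitted with stochastic gradient descent on the hinge loss. Everything in it is an assumption for illustration: the `featurize`, `train_sgd`, `predict`, and `accuracy` helpers, the hyperparameters, the plain word-count features (standing in for TF-IDF), and the six sample comments. The notebook itself presumably relies on a library implementation, and the number printed here is training accuracy on toy data, unrelated to the reported test-set figure.

```python
import random


def featurize(text):
    """Bag-of-words counts; a simple stand-in for TF-IDF features."""
    counts = {}
    for token in text.lower().split():
        counts[token] = counts.get(token, 0) + 1
    return counts


def score(weights, bias, features):
    """Linear decision function: bias + w . x."""
    return bias + sum(weights.get(t, 0.0) * v for t, v in features.items())


def train_sgd(samples, labels, epochs=100, lr=0.1, seed=0):
    """Fit a linear classifier with SGD on the hinge loss.

    Labels must be +1 or -1. No regularization is applied.
    """
    rng = random.Random(seed)
    weights, bias = {}, 0.0
    order = list(range(len(samples)))
    for _ in range(epochs):
        rng.shuffle(order)  # visit samples in a fresh random order each epoch
        for i in order:
            x, y = samples[i], labels[i]
            # Hinge loss has a nonzero gradient only when the margin is below 1.
            if y * score(weights, bias, x) < 1:
                for term, value in x.items():
                    weights[term] = weights.get(term, 0.0) + lr * y * value
                bias += lr * y
    return weights, bias


def predict(weights, bias, features):
    """Return +1 (flagged) or -1 (not flagged)."""
    return 1 if score(weights, bias, features) >= 0 else -1


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical toy data: +1 = bullying, -1 = not bullying.
texts = [
    "nobody likes you loser",
    "shut up idiot",
    "you are so dumb",
    "great match today",
    "thanks for the help",
    "see everyone tomorrow",
]
labels = [1, 1, 1, -1, -1, -1]

samples = [featurize(t) for t in texts]
weights, bias = train_sgd(samples, labels)
predictions = [predict(weights, bias, s) for s in samples]
print("training accuracy:", accuracy(labels, predictions))
```

Accuracy is simply the share of comments classified correctly, so 87% on a held-out test set means roughly 87 of every 100 unseen comments received the right label.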


Deployed Application Repository ➡️

The clean, deployed code for the live Gradio application can be found in the cyberbullying-app repository.
