
dsanyal/NLP_Toxic_comment_classification


Toxic Comment Classification Challenge: Kaggle

The dataset consists of a large set of user comments from Wikipedia's talk page edits. Some users, unfortunately, resort to threats, abusive behaviour, and racism to get their point across. Each comment in this dataset is labelled as toxic, severe toxic, obscene, threat, insult, or identity hate (a comment can carry several of these labels at once), or is clean.

This is a classic text classification problem: given a corpus of text documents with labelled classes, one uses supervised machine learning to train a model on the labelled data and then classify unseen documents. The goal is to help moderators weed out hateful comments from the comment section in the future.
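
As a rough illustration of the train-then-classify workflow described above, here is a minimal sketch using only the Python standard library. It is not the approach taken in this repository: it assumes a simple multinomial Naive Bayes model trained once per label (one-vs-rest), and it assumes a comment may carry several labels at once. The training comments, the two-label subset, and all function and class names are made up for the example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


class BinaryNaiveBayes:
    """Multinomial Naive Bayes for a single yes/no label (hypothetical helper)."""

    def fit(self, docs, targets):
        self.word_counts = {0: Counter(), 1: Counter()}
        self.doc_counts = Counter(targets)
        for doc, y in zip(docs, targets):
            self.word_counts[y].update(tokenize(doc))
        self.vocab = set(self.word_counts[0]) | set(self.word_counts[1])
        return self

    def _log_score(self, tokens, y):
        total_docs = sum(self.doc_counts.values())
        # Laplace-smoothed class prior.
        score = math.log((self.doc_counts[y] + 1) / (total_docs + 2))
        total_words = sum(self.word_counts[y].values())
        for tok in tokens:
            if tok in self.vocab:  # words never seen in training are ignored
                # Laplace-smoothed word likelihood.
                score += math.log(
                    (self.word_counts[y][tok] + 1) / (total_words + len(self.vocab))
                )
        return score

    def predict(self, doc):
        tokens = tokenize(doc)
        return int(self._log_score(tokens, 1) > self._log_score(tokens, 0))


def train_one_vs_rest(docs, label_rows, label_names):
    """Train one independent binary model per label."""
    models = {}
    for i, name in enumerate(label_names):
        targets = [row[i] for row in label_rows]
        models[name] = BinaryNaiveBayes().fit(docs, targets)
    return models


def classify(models, doc):
    """Return every label whose model fires, or ['clean'] if none do."""
    flagged = [name for name, model in models.items() if model.predict(doc)]
    return flagged or ["clean"]


# Toy training set, invented for illustration only.
LABEL_NAMES = ["toxic", "threat"]
TRAIN_DOCS = [
    "thanks for fixing the citation",
    "you are a stupid idiot",
    "i will hurt you, idiot",
    "great edit, the article reads well now",
    "i will find you and hurt you",
    "please add a source for this claim",
]
TRAIN_LABELS = [(0, 0), (1, 0), (1, 1), (0, 0), (1, 1), (0, 0)]

models = train_one_vs_rest(TRAIN_DOCS, TRAIN_LABELS, LABEL_NAMES)
print(classify(models, "i will hurt you"))
print(classify(models, "thanks for the great edit"))
```

On the real competition data one would more likely reach for a library such as scikit-learn with TF-IDF features, but the structure stays the same: fit one classifier per label on the labelled comments, then apply each classifier to unseen text.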

Link to the Kaggle competition page

About

Jigsaw Toxic comment classification contest hosted by Kaggle. Text classification + topic modelling
