Unmasking Fake News with ML
A machine learning model that classifies news articles as real or fake, with the aim of mitigating the spread of misinformation on digital platforms.
Approach:
- Designed a hybrid CNN-Bidirectional LSTM model for effective feature extraction and sequence learning.
- Employed Binary Cross-Entropy as the loss function and optimized using the Adam optimizer.
- Evaluated with Accuracy, Precision, Recall, F1 Score, and ROC AUC Score, reaching 0.97.
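To make the loss and the evaluation metrics listed above concrete, here is a minimal, dependency-free sketch of how each quantity is defined. The function names and the 0.5 decision threshold are illustrative assumptions; the project itself most likely relies on library implementations (for example from Keras and scikit-learn) rather than hand-written versions like these.

```python
import math


def binary_cross_entropy(y_true, y_prob, eps=1e-7):
    """Mean binary cross-entropy between 0/1 labels and predicted probabilities."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1.0 - eps)  # clip so log() never sees 0
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)


def classification_metrics(y_true, y_prob, threshold=0.5):
    """Accuracy, precision, recall and F1 at a given threshold (0.5 is an assumption)."""
    y_pred = [1 if p >= threshold else 0 for p in y_prob]
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for y, yh in pairs if y == 1 and yh == 1)
    tn = sum(1 for y, yh in pairs if y == 0 and yh == 0)
    fp = sum(1 for y, yh in pairs if y == 0 and yh == 1)
    fn = sum(1 for y, yh in pairs if y == 1 and yh == 0)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


def roc_auc(y_true, y_prob):
    """ROC AUC as the chance a random positive outscores a random negative.

    Ties count as half a win. This pairwise form is O(n^2) and is only
    meant to illustrate the definition on small inputs.
    """
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    if not pos or not neg:
        raise ValueError("ROC AUC needs at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

Accuracy, precision, recall, and F1 depend on the chosen threshold, while ROC AUC is computed from the raw probabilities and is threshold-independent.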
Data Preprocessing:
- Loaded datasets ('train.tsv' and 'test.tsv') and performed exploratory analysis using Pandas.
- Cleaned data by removing stopwords and applying token filtering with NLTK and Gensim.
- Tokenized and padded data for uniform input using TensorFlow.
- Split data into training (80%) and validation (20%) sets.
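The preprocessing steps above can be sketched in plain Python. This is a dependency-free stand-in, not the project's code: the tiny stopword set replaces NLTK's list, the minimum token length imitates Gensim-style token filtering, and the hand-written encoder and padder stand in for the TensorFlow tokenizer and padding utilities. All function names, the right-side (post) padding, and the random seed are assumptions made for illustration.

```python
import random
import re

# Tiny illustrative stopword set; the project uses NLTK's stopword list instead.
STOPWORDS = {"the", "a", "an", "is", "in", "of", "to", "and"}


def clean(text, min_len=3):
    """Lowercase, keep alphabetic tokens, drop stopwords and very short tokens."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [t for t in tokens if t not in STOPWORDS and len(t) >= min_len]


def build_vocab(docs):
    """Map each token to an integer id in order of first appearance; 0 is padding."""
    vocab = {}
    for doc in docs:
        for tok in doc:
            vocab.setdefault(tok, len(vocab) + 1)
    return vocab


def encode_and_pad(docs, vocab, max_len):
    """Convert tokens to ids, truncate to max_len, and right-pad with 0."""
    padded = []
    for doc in docs:
        ids = [vocab[t] for t in doc if t in vocab][:max_len]
        padded.append(ids + [0] * (max_len - len(ids)))
    return padded


def train_val_split(items, labels, val_fraction=0.2, seed=42):
    """Shuffle indices reproducibly and hold out val_fraction for validation."""
    idx = list(range(len(items)))
    random.Random(seed).shuffle(idx)
    n_val = int(round(len(idx) * val_fraction))
    val_idx, train_idx = idx[:n_val], idx[n_val:]
    x_train = [items[i] for i in train_idx]
    x_val = [items[i] for i in val_idx]
    y_train = [labels[i] for i in train_idx]
    y_val = [labels[i] for i in val_idx]
    return x_train, x_val, y_train, y_val
```

Every sequence comes out at the same length, which is what lets the examples be batched into a single fixed-shape input for the model.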
Model Architecture:
- Used an Embedding Layer to convert words into dense vector representations.
- Prevented overfitting using a Dropout Layer.
- Extracted local features via Conv1D and MaxPooling Layers.
- Captured sequence context using a Bidirectional LSTM Layer.
- Output predictions with a Dense Layer.
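The layer stack above can be sketched with the Keras API in TensorFlow. The layer order follows the list, but every hyperparameter here (vocabulary size, sequence length, embedding width, dropout rate, filter count, kernel size, LSTM units) is an assumed placeholder, and the sigmoid output is an assumption chosen to pair with the Binary Cross-Entropy loss mentioned in the approach.

```python
import tensorflow as tf

# Assumed placeholder hyperparameters; the source does not state the real values.
VOCAB_SIZE = 10000
MAX_LEN = 100
EMBED_DIM = 64


def build_model(vocab_size=VOCAB_SIZE, max_len=MAX_LEN, embed_dim=EMBED_DIM):
    """Build a CNN + Bidirectional LSTM binary classifier (illustrative sketch)."""
    model = tf.keras.Sequential([
        tf.keras.Input(shape=(max_len,)),
        # Word ids -> dense vectors.
        tf.keras.layers.Embedding(vocab_size, embed_dim),
        # Regularization against overfitting (rate is an assumption).
        tf.keras.layers.Dropout(0.3),
        # Local n-gram style features, then downsampling.
        tf.keras.layers.Conv1D(64, 5, activation="relu"),
        tf.keras.layers.MaxPooling1D(pool_size=2),
        # Sequence context read in both directions.
        tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(32)),
        # Single probability output (sigmoid is an assumption).
        tf.keras.layers.Dense(1, activation="sigmoid"),
    ])
    model.compile(loss="binary_crossentropy", optimizer="adam",
                  metrics=["accuracy"])
    return model
```

Calling `build_model().summary()` prints the resulting layer shapes, which is a quick way to check that the convolution and pooling settings leave a sequence long enough for the LSTM.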
Results:
- Achieved 97% accuracy in classifying news articles as real or fake.