maitreepatel1110/NLP_Hinglish_Sentiment_Analysis

NLP_Hinglish_Sentiment_Analysis

Hinglish Sentiment Classification using Hierarchical DeBERTa-v3

A sentiment classification model for code-mixed Hinglish (Hindi-English) text, built on a custom Hierarchical DeBERTa-v3 architecture. The model handles long, noisy social media inputs with chunk-wise attention and addresses class imbalance with Focal Loss.


🚀 Features

  • ✅ Transformer Backbone: microsoft/deberta-v3-base
  • ✅ Hierarchical attention over chunked sentences
  • ✅ Focal Loss for class imbalance
  • ✅ Text augmentation using TextAttack
  • ✅ Custom Hinglish stopword removal
  • ✅ Metrics: Accuracy, Precision, Recall, F1
  • ✅ Optimized training loop with early stopping, gradient accumulation, warmup, and weight decay
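
To make the Focal Loss item above concrete, here is a minimal pure-Python sketch of the standard formula, -alpha * (1 - p_t)^gamma * log(p_t), for a single example. The function names and the values gamma=2.0 and alpha=1.0 are assumptions for illustration; the repository's actual loss (presumably a PyTorch module) and its hyperparameters are not shown in this README.

```python
import math

def softmax(logits):
    """Convert raw class scores into probabilities (numerically stable)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def focal_loss(logits, target, gamma=2.0, alpha=1.0):
    """Focal loss for one example: -alpha * (1 - p_t)^gamma * log(p_t).

    gamma and alpha are assumed defaults, not values taken from the repository.
    """
    p_t = softmax(logits)[target]
    return -alpha * (1.0 - p_t) ** gamma * math.log(p_t)

# A confident, correct prediction contributes far less loss than a wrong one.
easy = focal_loss([4.0, 0.0, 0.0], target=0)
hard = focal_loss([0.0, 4.0, 0.0], target=0)
print(f"easy example: {easy:.5f}, hard example: {hard:.5f}")
```

With gamma set to 0 the expression reduces to ordinary cross-entropy, so gamma controls how strongly well-classified examples are down-weighted.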

📊 Results

Metric      Score
Accuracy    78.76%
Precision   78.14%
Recall      78.76%
F1 Score    78.13%

📁 Dataset

  • FinalTrainingOnly.csv: CSV with columns text and label
  • stop_hinglish.txt: Custom stopword list for Hinglish
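
As a rough illustration of how the two files above might be combined, the sketch below reads a CSV with text and label columns and filters out stopwords. The in-memory file contents, the sample stopwords, the string labels, and the helper names load_stopwords and remove_stopwords are all hypothetical; the real files and the repository's preprocessing code may differ (for example, it likely uses pandas).

```python
import csv
import io

def load_stopwords(lines):
    """Build a lowercase stopword set from an iterable of lines, one word per line."""
    return {line.strip().lower() for line in lines if line.strip()}

def remove_stopwords(text, stopwords):
    """Drop whitespace-separated tokens whose lowercase form is a stopword."""
    return " ".join(tok for tok in text.split() if tok.lower() not in stopwords)

# Hypothetical stand-ins for stop_hinglish.txt and FinalTrainingOnly.csv.
stopword_file = io.StringIO("hai\nka\nthe\nis\n")
csv_file = io.StringIO(
    "text,label\n"
    "movie ka climax bahut accha hai,positive\n"
    "the service is bekaar,negative\n"
)

stopwords = load_stopwords(stopword_file)
rows = [
    {"text": remove_stopwords(row["text"], stopwords), "label": row["label"]}
    for row in csv.DictReader(csv_file)
]
for row in rows:
    print(row)
```

To run this against the real data, replace the io.StringIO objects with open("stop_hinglish.txt", encoding="utf-8") and open("FinalTrainingOnly.csv", encoding="utf-8", newline="").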

Model Overview

The model uses DeBERTa-v3 to embed input chunks, processes them through MultiheadAttention, and classifies the aggregated representation.

Architecture

  • Base: microsoft/deberta-v3-base
  • Chunking: Long sequences split into fixed-size chunks (e.g., 64 tokens)
  • Attention: Multi-head attention across chunk embeddings
  • Classifier: Dropout → WeightNorm Linear → GELU → LayerNorm → Linear
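
The chunking step can be sketched in a few lines of plain Python. This is only an illustration of splitting a token-id sequence into fixed-size, padded chunks with attention masks; the function name chunk_token_ids and pad_id=0 are assumptions (take the real padding id from the tokenizer), and the repository may also add special tokens to each chunk, which this sketch omits.

```python
def chunk_token_ids(token_ids, chunk_size=64, pad_id=0):
    """Split token ids into fixed-size chunks, padding the last one.

    Returns (chunks, masks); a mask holds 1 for real tokens and 0 for padding.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    chunks, masks = [], []
    # max(..., 1) guarantees one (all-padding) chunk even for empty input.
    for start in range(0, max(len(token_ids), 1), chunk_size):
        piece = token_ids[start:start + chunk_size]
        pad = chunk_size - len(piece)
        chunks.append(piece + [pad_id] * pad)
        masks.append([1] * len(piece) + [0] * pad)
    return chunks, masks

token_ids = list(range(1, 151))  # 150 dummy token ids
chunks, masks = chunk_token_ids(token_ids)
print(len(chunks), [sum(mask) for mask in masks])
```

Each chunk would then be encoded separately by the backbone, and the resulting chunk embeddings passed to the attention layer described above.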

Training

Train the model using:

    model = HierarchicalDeBERTa(...)
    trainer(
        model, num_epochs, optimizer, loss_fn, lr,
        train_dataloader, test_dataloader, device
    )

Checkpoints are saved as best_model.pth when validation accuracy improves.
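
The checkpoint-on-improvement rule, together with the early stopping mentioned under Features, can be sketched independently of PyTorch. The class name BestAccuracyTracker, the patience value, and the accuracy numbers below are made up for illustration; the repository's trainer may implement this logic differently.

```python
class BestAccuracyTracker:
    """Track the best validation accuracy and decide when to stop early."""

    def __init__(self, patience=3):
        self.patience = patience  # assumed value, not taken from the repository
        self.best_accuracy = float("-inf")
        self.epochs_without_improvement = 0

    def update(self, accuracy):
        """Return True when this epoch is a new best and a checkpoint should be saved."""
        if accuracy > self.best_accuracy:
            self.best_accuracy = accuracy
            self.epochs_without_improvement = 0
            return True
        self.epochs_without_improvement += 1
        return False

    @property
    def should_stop(self):
        return self.epochs_without_improvement >= self.patience

tracker = BestAccuracyTracker(patience=2)
saved_epochs = []
# Made-up validation accuracies, one per epoch.
for epoch, val_accuracy in enumerate([0.70, 0.74, 0.73, 0.72, 0.80]):
    if tracker.update(val_accuracy):
        saved_epochs.append(epoch)  # a real loop would write best_model.pth here
    if tracker.should_stop:
        break
print(saved_epochs, tracker.best_accuracy)
```

In a PyTorch loop, the save step would typically be torch.save(model.state_dict(), "best_model.pth").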

Evaluation

Evaluate using:

    eval(model, test_dataloader, device)

Returns accuracy, precision, recall, and F1-score.
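
For readers who want to see how the four metrics relate, here is a pure-Python sketch that computes them with support-weighted averaging. Weighted averaging is an assumption: the README does not state the averaging mode, although weighted recall always equals accuracy, which is consistent with the Results table. The function name weighted_metrics and the toy labels are hypothetical; scikit-learn, listed in the dependencies, provides equivalent functions.

```python
from collections import Counter

def weighted_metrics(y_true, y_pred):
    """Accuracy plus support-weighted precision, recall, and F1."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and equally long")
    total = len(y_true)
    pairs = list(zip(y_true, y_pred))
    accuracy = sum(t == p for t, p in pairs) / total
    precision = recall = f1 = 0.0
    for label, count in Counter(y_true).items():
        tp = sum(t == label and p == label for t, p in pairs)
        predicted = sum(p == label for p in y_pred)
        p_c = tp / predicted if predicted else 0.0
        r_c = tp / count
        f_c = 2 * p_c * r_c / (p_c + r_c) if (p_c + r_c) else 0.0
        weight = count / total  # class share of the true labels
        precision += weight * p_c
        recall += weight * r_c
        f1 += weight * f_c
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}

# Toy labels for illustration only.
y_true = [0, 0, 0, 1, 1, 2]
y_pred = [0, 0, 1, 1, 2, 2]
scores = weighted_metrics(y_true, y_pred)
print(scores)
```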

Dependencies

  • torch
  • transformers
  • pandas
  • scikit-learn
  • textattack
  • tqdm

License

This project is released under the MIT License.

Acknowledgements

  • Microsoft DeBERTa
  • TextAttack
  • HuggingFace Transformers

🔗 Connect

📧 [email protected]
