🎯 Masha'erohom: State-of-the-art Arabic sentiment analysis achieving 90% accuracy using ensemble deep learning (Bi-LSTM + Bi-GRU + MARBERTv2 + Random Forest). Published research with web application for real-time emotion detection.

aliabdallah7/Masha-erohom

🎯 Masha'erohom (Arabic Sentiment Analysis)

Python Seaborn NumPy Pandas Anaconda Flask TensorFlow PyTorch Scikit-learn Transformers


An end-to-end Arabic emotion detection web application that uses ensemble deep learning to classify the emotions expressed in Arabic text. The system is designed to handle Modern Standard Arabic (MSA) and multiple Arabic dialects, and reaches a reported 90% accuracy on the Emotone_ar dataset through a stacking ensemble of Bi-LSTM, Bi-GRU, and MARBERTv2 models.

🚀 Project Overview

With the massive growth of Arabic content on social media platforms, understanding public emotions has become both critical and challenging. Arabic sentiment and emotion analysis is particularly difficult due to:

  • Complex morphology
  • Dialectal diversity
  • Limited high-quality annotated datasets

Masha'erohom addresses this gap by providing a production-ready web application that automatically detects emotions in Arabic text using an ensemble of deep learning models.

The system supports batch analysis (CSV upload) and real-time analysis (Twitter keyword search), accompanied by interactive visual dashboards.

Key Achievement: 90% Accuracy - State-of-the-art performance on Arabic emotion detection

✨ Key Features

🔍 Dual Functionality

  1. CSV File Processing

    • Upload Excel/CSV files containing Arabic text
    • Automatic emotion classification
    • Download enriched dataset with emotion labels
    • Interactive dashboard with visualizations
  2. Real-time Twitter Analysis

    • Search for keywords on Twitter
    • Retrieve 50 most recent tweets
    • Real-time sentiment analysis
    • Visual sentiment distribution
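The CSV workflow above (upload, classify, download an enriched file) can be sketched as follows. This is a minimal illustration using only the standard library, not the project's actual implementation: `classify_emotion`, `label_csv`, and the `text` column name are hypothetical, and the classifier is a stub standing in for the ensemble model.

```python
import csv
import io


def classify_emotion(text: str) -> str:
    """Hypothetical stand-in for the ensemble model; always returns 'none'."""
    return "none"


def label_csv(src, dst, text_column="text", classify=classify_emotion):
    """Copy rows from src to dst, appending an 'emotion' column.

    src and dst are text file objects; text_column is an assumed column name.
    Returns the number of rows written.
    """
    reader = csv.DictReader(src)
    fieldnames = list(reader.fieldnames or []) + ["emotion"]
    writer = csv.DictWriter(dst, fieldnames=fieldnames)
    writer.writeheader()
    count = 0
    for row in reader:
        row["emotion"] = classify(row[text_column])
        writer.writerow(row)
        count += 1
    return count


if __name__ == "__main__":
    source = io.StringIO("id,text\n1,first tweet\n2,second tweet\n")
    target = io.StringIO()
    written = label_csv(source, target)
    print(written, "rows labeled")
    print(target.getvalue())
```

Passing the classifier as a parameter keeps the file handling independent of the model, so the same function could serve both uploaded files and tweets fetched by keyword.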

📊 Visualization Dashboard

  • Interactive pie charts and bar graphs
  • Percentage distribution tables
  • Real-time sentiment tracking
  • Historical data analysis
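The percentage distribution table mentioned above amounts to counting predicted labels and normalizing. A minimal sketch, assuming the predictions arrive as a list of label strings (the function name `emotion_distribution` is hypothetical):

```python
from collections import Counter


def emotion_distribution(labels):
    """Return {emotion: percentage} for a list of predicted labels.

    Percentages are rounded to one decimal place; an empty input gives {}.
    """
    counts = Counter(labels)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {
        emotion: round(100 * n / total, 1)
        for emotion, n in counts.most_common()
    }


if __name__ == "__main__":
    predictions = ["joy", "joy", "anger", "none"]
    for emotion, pct in emotion_distribution(predictions).items():
        print(f"{emotion}: {pct}%")
```

The resulting dictionary is ordered from most to least frequent emotion, which is a convenient shape for feeding a pie chart or bar graph.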

🏗️ Architecture

Three-Tier System Design

Architecture.mp4

Model Pipeline

Input → Preprocessing → [Bi-LSTM, Bi-GRU, MARBERTv2] → Stacking → Random Forest → Output
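The README does not describe what the Preprocessing stage does. As an illustration only, the sketch below applies normalization steps that are common for Arabic tweets: removing URLs and mentions, stripping diacritics and tatweel, and unifying alef variants. The function name `normalize_arabic` and the choice of steps are assumptions, not the project's documented pipeline.

```python
import re

# Assumed cleaning rules; the project's real preprocessing may differ.
URL_RE = re.compile(r"https?://\S+")
MENTION_RE = re.compile(r"@\w+")
DIACRITICS_RE = re.compile("[\u064B-\u0652]")      # fathatan .. sukun
ALEF_VARIANTS_RE = re.compile("[\u0622\u0623\u0625]")  # madda / hamza forms
TATWEEL = "\u0640"
BARE_ALEF = "\u0627"


def normalize_arabic(text: str) -> str:
    """Apply a few common normalization steps to Arabic tweet text."""
    text = URL_RE.sub(" ", text)
    text = MENTION_RE.sub(" ", text)
    text = DIACRITICS_RE.sub("", text)
    text = text.replace(TATWEEL, "")
    text = ALEF_VARIANTS_RE.sub(BARE_ALEF, text)
    # Collapse the whitespace left behind by the removals above.
    return " ".join(text.split())


if __name__ == "__main__":
    sample = "\u0623\u064E\u0647\u0652\u0644\u0627 @user http://t.co/x"
    print(normalize_arabic(sample))
```

Normalization of this kind reduces spelling variation before tokenization, which tends to matter more for the recurrent models than for a tweet-pretrained transformer.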

🤖 Models & Performance

Ensemble Architecture

We combine three powerful models using Random Forest stacking:

| Model | Description | Key Features |
|---|---|---|
| Bi-LSTM | Bidirectional Long Short-Term Memory | Captures long-term dependencies, bidirectional context |
| Bi-GRU | Bidirectional Gated Recurrent Unit | Efficient, computationally lighter than LSTM |
| MARBERTv2 | Arabic-specific BERT model | Pre-trained on 1B Arabic tweets, 128GB text data |
| Random Forest | Ensemble Meta-learner | Combines predictions from all base models |
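In a stacking setup, the outputs of the base models become the input features of the meta-learner. The README does not say whether the base models pass class probabilities or hard labels to the Random Forest; the sketch below assumes probabilities over the 8 emotion classes, and the function name `build_meta_features` is hypothetical.

```python
NUM_CLASSES = 8  # Sadness, Anger, Joy, Surprise, Love, Sympathy, Fear, None


def build_meta_features(lstm_probs, gru_probs, bert_probs):
    """Concatenate per-sample probability vectors from three base models.

    Each argument is a list with one probability vector per sample.
    Returns one row of 3 * NUM_CLASSES features per sample.
    """
    if not (len(lstm_probs) == len(gru_probs) == len(bert_probs)):
        raise ValueError("base models must score the same number of samples")
    rows = []
    for a, b, c in zip(lstm_probs, gru_probs, bert_probs):
        row = list(a) + list(b) + list(c)
        if len(row) != 3 * NUM_CLASSES:
            raise ValueError("each model must output NUM_CLASSES values")
        rows.append(row)
    return rows


if __name__ == "__main__":
    uniform = [1 / NUM_CLASSES] * NUM_CLASSES
    confident = [0.0] * NUM_CLASSES
    confident[2] = 1.0
    features = build_meta_features([uniform], [uniform], [confident])
    print(len(features), "sample(s),", len(features[0]), "features each")
```

Rows built this way could then be used to fit the meta-learner, for example a scikit-learn `RandomForestClassifier`. To avoid leaking training labels into the meta-learner, stacking normally uses base-model predictions made on held-out folds.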

Performance Metrics

| Model | Accuracy | F1-Score | Recall | Precision |
|---|---|---|---|---|
| Bi-GRU | 72% | 71% | 70% | 72% |
| Bi-LSTM | 72% | 71% | 71% | 72% |
| MARBERTv2 | 81% | 80% | 79% | 81% |
| Ensemble (RF) | 90% | 90% | 90% | 90% |

Note: The Random Forest ensemble outperforms every individual model; its accuracy is 9 percentage points higher than the best base model, MARBERTv2 (90% vs. 81%).

📊 Dataset

Emotone_ar Dataset

  • Size: 10,065 Arabic tweets
  • Emotions: 8 categories (Sadness, Anger, Joy, Surprise, Love, Sympathy, Fear, None)
  • Dialects: Multiple Arabic dialects
  • Annotation: Manually annotated by 3 native Arabic speakers
  • Balance: Approximately equal distribution across emotions

Sample Distribution: [figure]

📈 Results & Comparisons

State-of-the-Art Comparison

| Work | Year | Dataset | Model | Accuracy | F1-score |
|---|---|---|---|---|---|
| Text Based Emotion Recognition in Arabic text | 2019 | Emotone-AR[4] | CNN | 0.70 | 0.70 |
| Textual Emotions | 2023 | Emotone-AR[4] | BI-GRU | 0.73 | 0.74 |
| Improved Emotion Detection Framework for Arabic Text using Transformer Models | 2023 | Emotone-AR[4] | arabic-bert-base model | 0.74 | 0.74 |
| Masha'erohom | 2024 | Emotone-AR[4] | BI-LSTM | 0.72 | 0.71 |
| Masha'erohom | 2024 | Emotone-AR[4] | BI-GRU | 0.72 | 0.71 |
| Masha'erohom | 2024 | Emotone-AR[4] | MARBERT | 0.81 | 0.80 |
| Masha'erohom | 2024 | Emotone-AR[4] | Ensemble RF | 0.90 | 0.90 |

Confusion Matrix (Ensemble Model)

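| Actual \ Predicted | Ang | Fea | Joy | Lov | Sad | Sym | Sur | Non |
|---|---|---|---|---|---|---|---|---|
| Ang | 1296 | 24 | 18 | 12 | 45 | 21 | 15 | 9 |
| Fea | 31 | 1082 | 35 | 18 | 28 | 9 | 2 | 0 |
| Joy | 22 | 19 | 1150 | 32 | 42 | 8 | 7 | 0 |
| Lov | 15 | 8 | 29 | 1120 | 21 | 15 | 5 | 0 |
| Sad | 38 | 21 | 45 | 25 | 1050 | 32 | 33 | 10 |
| Sym | 28 | 12 | 18 | 21 | 35 | 920 | 12 | 0 |
| Sur | 19 | 5 | 12 | 8 | 28 | 14 | 945 | 13 |
| Non | 21 | 8 | 15 | 12 | 35 | 18 | 25 | 1405 |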

Emotion Abbreviations:

  • Ang: Anger
  • Fea: Fear
  • Joy: Joy
  • Lov: Love
  • Sad: Sadness
  • Sym: Sympathy
  • Sur: Surprise
  • Non: None
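Overall accuracy and per-class precision and recall can be recomputed directly from the confusion matrix above. The sketch below copies the matrix values from the table (rows are actual classes, columns are predicted classes); the function names are illustrative.

```python
LABELS = ["Ang", "Fea", "Joy", "Lov", "Sad", "Sym", "Sur", "Non"]

# Values copied from the ensemble confusion matrix (rows: actual class).
MATRIX = [
    [1296, 24, 18, 12, 45, 21, 15, 9],
    [31, 1082, 35, 18, 28, 9, 2, 0],
    [22, 19, 1150, 32, 42, 8, 7, 0],
    [15, 8, 29, 1120, 21, 15, 5, 0],
    [38, 21, 45, 25, 1050, 32, 33, 10],
    [28, 12, 18, 21, 35, 920, 12, 0],
    [19, 5, 12, 8, 28, 14, 945, 13],
    [21, 8, 15, 12, 35, 18, 25, 1405],
]


def accuracy(matrix):
    """Fraction of samples on the diagonal (correctly classified)."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


def precision_recall(matrix, labels):
    """Return {label: (precision, recall)} for each class."""
    result = {}
    for i, label in enumerate(labels):
        true_pos = matrix[i][i]
        predicted = sum(row[i] for row in matrix)  # column sum
        actual = sum(matrix[i])                    # row sum
        result[label] = (true_pos / predicted, true_pos / actual)
    return result


if __name__ == "__main__":
    print(f"accuracy: {accuracy(MATRIX):.3f}")
    for label, (p, r) in precision_recall(MATRIX, LABELS).items():
        print(f"{label}: precision={p:.3f} recall={r:.3f}")
```

The matrix entries sum to 10,021 samples with 8,968 on the diagonal, which gives an accuracy of about 0.895, in line with the 90% reported for the ensemble.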

🔮 Future Work

Planned Enhancements

Model Improvements

  • Integrate AraT5 for better text understanding
  • Add dialect-specific models
  • Implement sarcasm and irony detection

Feature Expansion

  • Facebook keyword search integration
  • Multi-platform social media analysis
  • Real-time streaming analysis

Technical Upgrades

  • Docker containerization
  • Cloud deployment (AWS/Azure)
  • Mobile application
  • API rate limiting and scaling

Dataset Expansion

  • Include more Arabic dialects
  • Add news articles and blogs
  • Cross-domain sentiment analysis

👥 Team

Supervisors

  • Dr. Shaimaa Haridy - Lecturer, Information Systems Department, Ain Shams University

Development Team

  • Ali Abdallah
  • Mohamed Ali
  • Karima Sobhi
  • Ali Maher
  • Abdelrahman Abdelhalim
  • Hany Mohamed

Institution

This project was developed as a Bachelor's Graduation Project at:

Faculty of Computer and Information Sciences, Information Systems Department
Ain Shams University
Cairo, Egypt

📚 Citation

If you use this project in your research, please cite:

@article{arabicsentiment2024,
  title={Arabic Sentiment Analysis using Ensemble Deep Learning Model},
  author={Ali Abdallah and Mohamed Ali and Karima Sobhi and Ali Maher and Abdelrahman Abdelhalim and Hany Mohamed},
  year={2024},
  publisher={Ain Shams University},
  note={Bachelor's Graduation Project}
}

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

🙏 Acknowledgements

We acknowledge with gratitude:

🎓 Supervisory Excellence: Dr. Shaimaa Haridy for her exceptional mentorship, research guidance, and unwavering support that transformed this project into published research.

🏛️ Institutional Support: Ain Shams University for academic resources and Nile University for the Emotone_ar dataset.

⚡ Technical Enablement: Hugging Face, Google Colab, and Kaggle for providing the tools and computational power essential for this deep learning research.

Open Source Community: Countless contributors whose work forms the foundation of modern NLP research.

Gratitude turns what we have into enough, and research into impact.


⭐ If you find this project useful, please give it a star on GitHub!

📧 Contact: For questions or collaborations, please email: [email protected]

🔗 Live Demo:

Demo Video

📖 Documentation: Full Documentation

📚 Publication

Published Paper

Title: Arabic Sentiment Analysis using Ensemble Deep Learning Model
Journal: International Journal of Intelligent Computing and Information Sciences
DOI/Link: https://ijicis.journals.ekb.eg/article_406786.html
Status: ✅ Published
