Advanced NLP tool for extracting emotional insights from video and audio content
Transform unstructured video and audio content into meaningful emotional analytics using our state-of-the-art NLP pipeline. Built with DeBERTa models and deployed on Azure ML, this system provides dual-mode prediction - choose between fast local inference or high-accuracy cloud processing with automatic NGROK URL conversion (no VPN required). Perfect for content analysis, customer sentiment tracking, and research applications.
- Fast inference - On-premise processing
- No network dependency - Works offline
- Privacy-first - Data stays local
- Low latency - Instant results
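The dual-mode choice above boils down to one field in the request body. A minimal sketch of building that body, assuming a hypothetical `build_predict_payload` helper (not part of the package):

```python
def build_predict_payload(url: str, use_cloud: bool = False) -> dict:
    """Build the JSON body for a POST /predict call: fast local inference by
    default, high-accuracy Azure processing when use_cloud is True."""
    # Illustrative helper only; the API itself just expects {"url", "method"}.
    return {"url": url, "method": "azure" if use_cloud else "local"}

payload = build_predict_payload("https://www.youtube.com/watch?v=dQw4w9WgXcQ")
```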
./
├── .github/ # GitHub Actions CI/CD workflows
├── .azuremlignore # Azure ML ignore patterns
├── .dockerignore # Docker build ignore patterns
├── .flake8 # Python linting configuration
├── .gitignore # Git ignore rules
├── .pre-commit-config.yaml # Pre-commit hook configurations
├── assets/ # Static assets (images, logos, screenshots)
├── data/ # Datasets and data processing
├── dist/ # Distribution files (build artifacts)
├── docs/ # Sphinx documentation
├── environment/ # Environment configurations
├── frontend/ # React.js web application
├── logs/ # Application and system logs
├── mlruns/ # MLflow experiment tracking
├── models/ # Machine learning models and artifacts
├── monitoring/ # Infrastructure monitoring
├── notebooks/ # Jupyter notebooks for exploration
├── outputs/ # Generated outputs and artifacts
├── results/ # Experiment results and analysis
├── src/ # Main source code
│ └── emotion_clf_pipeline/ # Core Python package
│ ├── __init__.py # Package initialization
│ ├── api.py # FastAPI web service
│ ├── azure_endpoint.py # Azure ML endpoint integration
│ ├── azure_hyperparameter_sweep.py # HPT on Azure ML
│ ├── azure_pipeline.py # Azure ML pipeline orchestration
│ ├── azure_score.py # Azure ML scoring functions
│ ├── azure_sync.py # Azure ML synchronization
│ ├── cli.py # Command-line interface
│ ├── data.py # Data loading and preprocessing
│ ├── features.py # Feature engineering
│ ├── model.py # DeBERTa model architecture
│ ├── monitoring.py # System monitoring and metrics
│ ├── predict.py # Prediction pipeline
│ ├── stt.py # Speech-to-text processing
│ ├── train.py # Model training pipeline
│ ├── transcript.py # Transcript processing
│ ├── transcript_translator.py # Multi-language transcript support
│ └── translator.py # Text translation utilities
├── tests/ # Comprehensive test suite
├── docker-compose.yml # Multi-container orchestration
├── docker-compose.build.yml # Build-specific container config
├── Dockerfile # Backend container configuration
├── LICENSE # MIT license
├── poetry.lock # Poetry dependency lock file
├── pyproject.toml # Python project configuration (Poetry)
├── start-build.bat # Windows build script
├── start-production.bat # Windows production deployment
└── README.md # This documentation

Prerequisites:
- Python 3.11+
- Docker (recommended)
- Poetry for dependency management
git clone https://github.com/BredaUniversityADSAI/2024-25d-fai2-adsai-group-nlp6.git
cd 2024-25d-fai2-adsai-group-nlp6

Create a .env file in the project root:
# Required API Keys
ASSEMBLYAI_API_KEY="your_assemblyai_key"
GEMINI_API_KEY="your_gemini_key"
# Azure ML Configuration (Optional - for cloud predictions)
AZURE_SUBSCRIPTION_ID="your_subscription_id"
AZURE_RESOURCE_GROUP="buas-y2"
AZURE_WORKSPACE_NAME="NLP6-2025"
AZURE_LOCATION="westeurope"
AZURE_TENANT_ID="your_tenant_id"
AZURE_CLIENT_ID="your_client_id"
AZURE_CLIENT_SECRET="your_client_secret"
# Azure ML Endpoint (automatically converts private URLs to NGROK)
AZURE_ENDPOINT_URL="http://194.171.191.227:30526/api/v1/endpoint/deberta-emotion-clf-endpoint/score"
AZURE_API_KEY="your_azure_endpoint_key"

Complete full-stack deployment:
docker-compose up --build

Access:
• Frontend: http://localhost:3121
• API: http://localhost:3120
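The Azure-related variables from the `.env` file are optional; a minimal sketch of reading them at runtime with only the standard library (the real stack may load `.env` differently, e.g. via Docker):

```python
import os

def get_azure_endpoint_config() -> dict:
    """Collect the optional Azure endpoint settings, tolerating missing keys."""
    return {
        "endpoint_url": os.environ.get("AZURE_ENDPOINT_URL", ""),
        "api_key": os.environ.get("AZURE_API_KEY", ""),
    }

# Cloud prediction is only possible when both values are present.
cloud_enabled = all(get_azure_endpoint_config().values())
```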
Dual-Mode API: Choose between fast local inference or high-accuracy cloud prediction.
Local Prediction (Fast, on-premise):
# cURL example
curl -X POST "http://localhost:3120/predict" \
-H "Content-Type: application/json" \
-d '{"url": "https://www.youtube.com/watch?v=dQw4w9WgXcQ", "method": "local"}'
# PowerShell example (Windows)
Invoke-RestMethod -Uri "http://localhost:3120/predict" -Method POST \
-ContentType "application/json" \
-Body '{"url": "https://www.youtube.com/watch?v=dQw4w9WgXcQ", "method": "local"}'

Azure Prediction (High-accuracy, cloud-based with automatic NGROK conversion):
# cURL example
curl -X POST "http://localhost:3120/predict" \
-H "Content-Type: application/json" \
-d '{"url": "https://www.youtube.com/watch?v=dQw4w9WgXcQ", "method": "azure"}'
# PowerShell example (Windows)
Invoke-RestMethod -Uri "http://localhost:3120/predict" -Method POST \
-ContentType "application/json" \
-Body '{"url": "https://www.youtube.com/watch?v=dQw4w9WgXcQ", "method": "azure"}'

Python SDK Examples:
import requests
# Local prediction (fast)
response = requests.post(
"http://localhost:3120/predict",
json={"url": "https://www.youtube.com/watch?v=dQw4w9WgXcQ", "method": "local"}
)
emotions = response.json()
# Azure prediction (high-accuracy, automatic NGROK conversion)
response = requests.post(
"http://localhost:3120/predict",
json={"url": "https://www.youtube.com/watch?v=dQw4w9WgXcQ", "method": "azure"}
)
emotions = response.json()

This section provides an overview of the system's architecture and data flow.
This diagram illustrates the main components of the Emotion Classification Pipeline and how they interact, including user interfaces, backend services, external dependencies, and data storage.
graph TD
subgraph User Interaction
UI[Browser - React Frontend]
CLI[Command Line Interface]
CURL[cURL/Postman]
end
subgraph Backend Services [Emotion Classification Pipeline API - FastAPI]
API[api.py - Endpoints /predict, /health]
PRED[predict.py - Orchestration Logic]
DATA[data.py - Data Handling]
MODEL[model.py - Emotion Model]
end
subgraph External Services
YT[YouTube API/Service]
ASSEMBLY[AssemblyAI API]
WHISPER[Whisper Model - Local/HuggingFace]
end
subgraph Data Storage
DS_AUDIO[Local File System: /data/youtube_audio]
DS_TRANS[Local File System: /data/transcripts]
DS_RESULTS[Local File System: /data/results]
end
UI --> API
CLI --> PRED
CURL --> API
API --> PRED
PRED --> DATA
PRED --> MODEL
PRED --> ASSEMBLY
PRED --> WHISPER
DATA --> YT
DATA --> DS_AUDIO
ASSEMBLY --> DS_TRANS
WHISPER --> DS_TRANS
MODEL --> DS_RESULTS
classDef userStyle fill:#C9DAF8,stroke:#000,stroke-width:2px,color:#000
class UI,CLI,CURL userStyle
classDef backendStyle fill:#D9EAD3,stroke:#000,stroke-width:2px,color:#000
class API,PRED,DATA,MODEL backendStyle
classDef externalStyle fill:#FCE5CD,stroke:#000,stroke-width:2px,color:#000
class YT,ASSEMBLY,WHISPER externalStyle
classDef storageStyle fill:#FFF2CC,stroke:#000,stroke-width:2px,color:#000
class DS_AUDIO,DS_TRANS,DS_RESULTS storageStyle
This sequence diagram details the process from a user submitting a YouTube URL to receiving the emotion analysis results. It highlights the interactions between the frontend, backend API, prediction service, data handling, transcription, and the emotion model.
sequenceDiagram
actor User
participant Frontend_UI as Frontend UI (React)
participant Backend_API as FastAPI Backend (api.py)
participant PredictionService as Prediction Service (predict.py)
participant DataHandler as Data Handler (data.py)
participant TranscriptionService as Transcription (AssemblyAI/Whisper)
participant EmotionModel as Emotion Model (model.py)
participant FileSystem as Local File System (data/*)
User->>Frontend_UI: Inputs YouTube URL
Frontend_UI->>Backend_API: POST /predict (URL)
activate Backend_API
Backend_API->>PredictionService: process_youtube_url_and_predict(URL)
activate PredictionService
PredictionService->>DataHandler: save_youtube_audio(URL)
activate DataHandler
DataHandler-->>FileSystem: Saves audio.mp3
DataHandler-->>PredictionService: Returns audio_file_path
deactivate DataHandler
PredictionService->>TranscriptionService: Transcribe(audio_file_path)
activate TranscriptionService
TranscriptionService-->>FileSystem: Saves transcript.xlsx/json
TranscriptionService-->>PredictionService: Returns transcript_data (text, timestamps)
deactivate TranscriptionService
PredictionService->>EmotionModel: predict_emotion(transcript_sentences)
activate EmotionModel
EmotionModel-->>PredictionService: Returns emotion_predictions (emotion, sub_emotion, intensity)
deactivate EmotionModel
PredictionService-->>FileSystem: Saves results.xlsx (optional)
PredictionService-->>Backend_API: Formatted JSON with predictions
deactivate PredictionService
Backend_API-->>Frontend_UI: JSON Response
deactivate Backend_API
Frontend_UI->>User: Displays emotional analysis
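The flow in the sequence diagram can be sketched in a few lines. Function names follow the diagram; the bodies here are stubs, not the real implementation:

```python
def save_youtube_audio(url: str) -> str:
    return "data/youtube_audio/audio.mp3"          # stub: real code downloads audio

def transcribe(audio_path: str) -> list[str]:
    return ["first sentence", "second sentence"]   # stub: AssemblyAI/Whisper in reality

def predict_emotion(sentences: list[str]) -> list[dict]:
    return [{"sentence": s, "emotion": "neutral"} for s in sentences]  # stub

def process_youtube_url_and_predict(url: str) -> list[dict]:
    """Orchestrate download -> transcription -> classification, as in the diagram."""
    audio_path = save_youtube_audio(url)
    sentences = transcribe(audio_path)
    return predict_emotion(sentences)

results = process_youtube_url_and_predict("https://www.youtube.com/watch?v=dQw4w9WgXcQ")
```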
This diagram shows the primary Python modules within the src/emotion_clf_pipeline package and their main dependencies, focusing on the prediction pathway.
graph LR
subgraph src/emotion_clf_pipeline
A[api.py]
B[cli.py]
C[predict.py]
D[model.py]
E[data.py]
F[train.py] -- Not directly in /predict flow --> D
end
A --> C
B --> C
C --> D
C --> E
D --> E
classDef moduleStyle fill:#E6E6FA,stroke:#333,stroke-width:2px,color:#000
class A,B,C,D,E,F moduleStyle
| Component | Technology | Purpose |
|---|---|---|
| Frontend | React.js | Interactive web interface |
| API | FastAPI | REST endpoints and validation |
| ML Pipeline | DeBERTa + PyTorch | Emotion classification |
| Speech Processing | AssemblyAI / Whisper | Audio transcription |
| Cloud Platform | Azure ML | Training & deployment |
There are two ways to interact with the pipeline: processing on premise or in the cloud. Below is a guide to the commands available for both options.
Data preprocessing: Preprocess the raw data and save the results to the specified location.
python -m emotion_clf_pipeline.cli preprocess --verbose --raw-train-path "data/raw/train" --raw-test-path "data/raw/test/test_data-0001.csv"

Train and evaluate: Train the model and evaluate it on the various data splits. This includes syncing with the Azure model registry (first downloading the current best model from Azure, then registering the newly trained weights back):
python -m emotion_clf_pipeline.cli train --epochs 15 --learning-rate 1e-5 --batch-size 16

Prediction: There are several ways to obtain predictions:
# Option 1 - API (Dual-Mode: Local or Azure)
uvicorn src.emotion_clf_pipeline.api:app --host 0.0.0.0 --port 3120 --reload # Start backend api
# Local prediction API call:
Invoke-RestMethod -Uri "http://127.0.0.1:3120/predict" -Method POST -ContentType "application/json" -Body '{"url": "YOUTUBE-LINK", "method": "local"}'
# Azure prediction API call (with automatic NGROK conversion):
Invoke-RestMethod -Uri "http://127.0.0.1:3120/predict" -Method POST -ContentType "application/json" -Body '{"url": "YOUTUBE-LINK", "method": "azure"}'
# Option 2 - CLI
python -m emotion_clf_pipeline.cli predict "YOUTUBE-LINK"
# Option 3 - Docker container (backend only)
docker build -t emotion-clf-api .
docker run -p 3120:80 emotion-clf-api
# Option 4 - Docker compose (both frontend and backend)
docker-compose up --build

Data preprocessing job: Takes the data from 'emotion-raw-train' and 'emotion-raw-test' and registers the final preprocessed data as 'emotion-processed-train' and 'emotion-processed-test':
poetry run python -m emotion_clf_pipeline.cli preprocess --azure --register-data-assets --verbose

Training job: Takes the preprocessed data, trains the model on it, evaluates it, and finally registers the weights:
poetry run python -m emotion_clf_pipeline.cli train --azure --verbose

Full pipeline: Combines the data and training pipelines above:
poetry run python -m emotion_clf_pipeline.cli pipeline --azure --verbose

Scheduled pipeline: Creates a schedule that runs the full pipeline at the specified time:
python -m src.emotion_clf_pipeline.cli schedule create --schedule-name 'scheduled-deberta-full-pipeline' --daily --hour 0 --minute 0 --enabled --mode azure

Hyperparameter tuning sweep: Launches multiple sweep runs for hyperparameter tuning:
poetry run python -m emotion_clf_pipeline.hyperparameter_tuning

Prediction: Make a prediction against the Azure ML endpoint:
python -m emotion_clf_pipeline.cli predict "YOUTUBE-LINK" --use-azure
python -m emotion_clf_pipeline.cli predict "https://youtube.com/watch?v=VIDEO_ID" --use-azure --use-ngrok

# Set up development environment
poetry install
poetry run pre-commit install
# Run quality checks
poetry run pre-commit run --all-files
poetry run pytest -v

To ensure consistent collaboration and traceability, all branches should follow the naming convention:
<type>/<sprint>-<scope>-<action>
Example: feature/s2-data-add-youtube-transcript
Type Prefixes:
| Prefix | Description |
|---|---|
| `feature` | New functionality |
| `fix` | Bug fixes |
| `test` | Unit/integration testing |
| `docs` | Documentation updates |
| `config` | Environment or dependency setup |
| `chore` | Maintenance and cleanup |
| `refactor` | Code restructuring |
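The convention can be checked mechanically, for example in a pre-commit or CI step. A hypothetical validator (the prefixes come from the table above; the regex is an illustration, not project tooling):

```python
import re

# <type>/<sprint>-<scope>-<action>, e.g. feature/s2-data-add-youtube-transcript
BRANCH_RE = re.compile(
    r"^(feature|fix|test|docs|config|chore|refactor)/s\d+-[a-z0-9]+(-[a-z0-9]+)+$"
)

def is_valid_branch(name: str) -> bool:
    return BRANCH_RE.match(name) is not None

print(is_valid_branch("feature/s2-data-add-youtube-transcript"))  # True
```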
- Create a feature branch
- Make your changes
- Submit a pull request
- Wait for code review and approval
# All tests
poetry run pytest -v
# Specific test types
poetry run pytest tests/unit -v
poetry run pytest tests/integration -v
# With coverage
poetry run coverage run -m pytest
poetry run coverage report
poetry run coverage html

- Unit Tests: Test individual components in isolation
- Integration Tests: Test component interactions
- API Tests: Test REST endpoint functionality
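A minimal unit test in the style described above might look like this; `normalize_sentence` is an illustrative helper, not a function from the package (the real suite lives in tests/):

```python
def normalize_sentence(text: str) -> str:
    """Lowercase and strip whitespace before classification."""
    return text.strip().lower()

def test_normalize_sentence():
    assert normalize_sentence("  Hello World  ") == "hello world"

test_normalize_sentence()
```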
For managing large model files, configure Git LFS:
# Install and initialize Git LFS
git lfs install
# Track model files
git lfs track "models/*"
# Commit LFS configuration
git add .gitattributes && git commit -m "Configure Git LFS"

# Pre-commit hooks
poetry run pre-commit install
poetry run pre-commit run --all-files
# Linting
poetry run flake8 src/
poetry run black src/

This project is licensed under the MIT License.
