Facial Emotion Recognition Project — Learning Phase

Table of Contents

  1. STEP 1 — Load & Display an Image
  2. Libraries Used
  3. Key Concepts Explanation
  4. Recap & Next Steps
  5. STEP 2 — Detect Faces with MediaPipe
  6. STEP 3 — Load a Pre-trained Emotion Recognition Model
  7. STEP 4 — Predict Emotions from Images
  8. STEP 5 — Draw a Rectangle on the Face (OpenCV + Haar Cascade)
  9. STEP 6 — Open Your Webcam with OpenCV
  10. STEP 7 — Face Detection with MediaPipe (Webcam)
  11. DeepFace

STEP 1 — Load & Display an Image (OpenCV + Matplotlib)

This script demonstrates how to load an image using OpenCV and display it using Matplotlib.

Code

import cv2  # BGR
import matplotlib.pyplot as plt  # RGB

# Load an image (replace with your photo path) [OpenCV]
image_path = "/home/im_ane/AI_emotion_recognition/data/test_images/ana.jpg"
image = cv2.imread(image_path)
image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)  # Convert to RGB for display

# Display the image [displaying]
plt.imshow(image)  # show the image
plt.axis('off')
plt.show()  # showing the window

Libraries Used

1. cv2

  • What it is: OpenCV (Open Source Computer Vision Library) — a powerful library for real-time computer vision and image processing.

  • Why we use it: Load, manipulate, and process images (e.g., reading an image file, converting color spaces, detecting faces).

  • Key functions:

    • cv2.imread(): Reads an image from a file.
    • cv2.cvtColor(): Converts an image from one color space to another (e.g., BGR → RGB).

About the cv2 Module

  • cv2 is the Python module name that provides access to OpenCV.
  • Historically the Python bindings went through the module names cv and cv2 as OpenCV evolved; today cv2 is standard.
  • Use import cv2 to access OpenCV functions.

2. matplotlib.pyplot as plt

  • What it is: Matplotlib is a plotting library. pyplot provides a MATLAB-like plotting interface.

  • Why we use it: To display images and visualize results during development.

  • Key functions:

    • plt.imshow(): Display an image.
    • plt.axis('off'): Hide axis ticks and labels for a clean view.
    • plt.show(): Render the image window.

Key Concepts Explanation

cv2.imread(image_path)

  • Purpose: Load an image from the specified path.
  • Color space: OpenCV reads images in BGR by default.

cv2.cvtColor(image, cv2.COLOR_BGR2RGB)

  • Purpose: Convert image from BGR → RGB.
  • Why: Matplotlib and most visualization tools expect RGB; without conversion colors appear swapped (blue/red inverted).

What is cv2.COLOR_BGR2RGB?

  • A constant (color conversion code) used by cv2.cvtColor() to specify the conversion type.

How cv2.cvtColor() works

  • cv2.cvtColor(image, cv2.COLOR_BGR2RGB) takes:

    1. The input image (image).
    2. The conversion code (cv2.COLOR_BGR2RGB).

Example

image = cv2.imread("path/to/image.jpg")  # BGR
image_rgb = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)  # RGB

plt.imshow(image)

  • Displays the image using Matplotlib.

plt.axis('off')

  • Hides axis labels and ticks for a cleaner display.

plt.show()

  • Renders the image window — without this the image may not appear.

Recap & Next Steps

What the code does

  1. Loads an image from a specified path using OpenCV.
  2. Converts the image from BGR → RGB for correct display.
  3. Displays the image with Matplotlib.

Next steps

  • Test this script with your own images.
  • Move on to face detection using MediaPipe.
  • Implement emotion recognition with a pre-trained model.

STEP 2 — Detect Faces with MediaPipe


Code

import cv2
import matplotlib.pyplot as plt
import mediapipe as mp

# Load the image
image_path = "/home/im_ane/AI_emotion_recognition/data/test_images/ana.jpg"  # Replace with your image path
image = cv2.imread(image_path)
image_rgb = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)

# Initialize MediaPipe Face Detection
mp_face_detection = mp.solutions.face_detection  # calling the tool that we want to use from MediaPipe since MediaPipe contains many tools
face_detection = mp_face_detection.FaceDetection(min_detection_confidence=0.5)  # create an object using the class FaceDetection

# Detect faces
results = face_detection.process(image_rgb)  # start processing the image to detect faces using the object we created

# Draw face detections
if results.detections:  # the list of detected faces
    for detection in results.detections:
        mp.solutions.drawing_utils.draw_detection(image_rgb, detection)  # draw each detected face

# If you are sure that you only have one face in your image, do this:
# mp.solutions.drawing_utils.draw_detection(image_rgb, results.detections[0])

# Display the image with face detections
plt.imshow(image_rgb)
plt.axis('off')
plt.title("Face Detection")
plt.show()



Line-by-line explanation

  1. mp_face_detection = mp.solutions.face_detection

    • Access the face detection module from MediaPipe. mp.solutions organizes MediaPipe tools (face detection, hand tracking, etc.).
  2. face_detection = mp_face_detection.FaceDetection(min_detection_confidence=0.5)

    • Create a face detection object. min_detection_confidence=0.5 sets the minimum confidence threshold (50%) to consider a detection valid.
  3. results = face_detection.process(image_rgb)

    • Run face detection on the RGB image. process() returns a results object whose detections attribute holds the detected faces (or is None when no face is found).
  4. if results.detections:

    • Check whether any faces were detected. results.detections is a list of detection objects.
  5. for detection in results.detections:

    • Loop through each detected face (there may be multiple).
  6. mp.solutions.drawing_utils.draw_detection(image_rgb, detection)

    • Use MediaPipe's drawing_utils to draw bounding boxes on the image for visualization.
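
If you want the raw coordinates instead of the drawing helper (for example, to crop the face later), each detection carries a relative bounding box that you can scale to pixels. A minimal sketch, reusing image_rgb and results from the code above:

```python
h, w, _ = image_rgb.shape  # image height and width in pixels

for detection in results.detections:
    box = detection.location_data.relative_bounding_box  # values are relative, in [0, 1]
    x1, y1 = int(box.xmin * w), int(box.ymin * h)
    x2, y2 = int((box.xmin + box.width) * w), int((box.ymin + box.height) * h)
    cv2.rectangle(image_rgb, (x1, y1), (x2, y2), (0, 255, 0), 2)
    print("Face box (pixels):", x1, y1, x2, y2)
```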


What Happens If We Skip a Step?

  • If we don't create the face_detection object, we can't run face detection.
  • If we don't call .process(), we won't get any results.
  • If we don't check if results.detections:, we might try to draw boxes when there are no faces, causing an error.
  • If we don't loop through results.detections, we'll only draw a box around the first face (if there are multiple faces).
  • If we don't use drawing_utils, we won't see the boxes around the faces.

STEP 3 — Load a Pre-trained Emotion Recognition Model

Code

from tensorflow.keras.models import load_model

# Load the pre-trained model
model = load_model("../models/emotion_model.h5")
print("Model loaded successfully!")

Why from tensorflow.keras.models import load_model and not just import tensorflow?

  • TensorFlow includes the Keras API as tf.keras. Importing load_model from tensorflow.keras.models ensures you are using TensorFlow's integrated, optimized Keras implementation.
  • If you import tensorflow as tf, you would use tf.keras.models.load_model(). Explicit import keeps code cleaner and focuses on what you need.

What does load_model do?

  • Loads a complete model architecture + weights + training configuration from an HDF5 (.h5) or SavedModel file.
  • Reconstructs the model exactly as it was when saved, enabling inference with model.predict().
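
Before predicting, it is worth checking what the loaded model actually expects. A small sketch, assuming the same model path as above:

```python
from tensorflow.keras.models import load_model

model = load_model("../models/emotion_model.h5")

# Print the layer stack, output shapes, and parameter counts
model.summary()

# The expected input shape, e.g. (None, 48, 48, 1) or (None, 64, 64, 1)
print("Input shape:", model.input_shape)
```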

Understanding from tensorflow.keras.models import load_model

TensorFlow vs Keras Relationship

  • TensorFlow is a comprehensive machine learning framework for building and training neural networks.
  • Keras was originally a standalone high-level neural networks API, but was later integrated into TensorFlow as tf.keras.
  • When you use tensorflow.keras, you're using TensorFlow's optimized Keras API.

Why This Specific Import Path?

from tensorflow.keras.models import load_model

Reasons:

  • This is the recommended way to import Keras when using TensorFlow 2.x.
  • Ensures you're using TensorFlow's integrated and optimized implementation of Keras.
  • The hierarchy is:
TensorFlow → Keras API → models module → load_model function

What If We Just Used import tensorflow?

If you only did:

import tensorflow as tf

Then you would need to use:

tf.keras.models.load_model()

Why the explicit import is better:

  • Makes code cleaner and easier to read.
  • Follows Python's best practice: import only what you need.
  • Avoids long nested namespaces everywhere.

What Is Keras?

Keras is a high-level neural networks API that:

  • Provides a user-friendly interface for building and training models
  • Acts as a front-end for TensorFlow
  • Simplifies deep learning with intuitive APIs

Key Features of Keras:

  • Modular: Build models by stacking configurable blocks
  • User-friendly: Designed for fast experimentation
  • Extensible: Easy to add custom layers, losses, etc.

What Is models in Keras?

The models module in Keras contains:

  • Sequential class: For linear stacks of layers
  • Model class: For complex architectures using the Functional API
  • load_model function: Load a saved model
  • save_model function: Save a model

Essentially, this module provides all tools related to creating, saving, and loading models.
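
To make the save/load round trip concrete, here is a tiny illustrative sketch for TensorFlow 2.x; the architecture is made up for the example and is not the emotion model:

```python
from tensorflow.keras import layers, models

# A toy model, just to illustrate saving and loading
toy = models.Sequential([
    layers.Input(shape=(48, 48, 1)),
    layers.Conv2D(8, 3, activation="relu"),
    layers.Flatten(),
    layers.Dense(7, activation="softmax"),
])
toy.compile(optimizer="adam", loss="categorical_crossentropy")

toy.save("toy_model.h5")                      # writes architecture + weights to HDF5
restored = models.load_model("toy_model.h5")  # reconstructs the exact same model
```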


What Does load_model Do?

model = load_model("../models/emotion_model.h5")

It loads:

  • The model architecture (layers and connections)
  • The trained weights
  • The training configuration (optimizer, loss, metrics)
  • The state of the optimizer (if training was interrupted)

Purpose:

  • Avoid retraining from scratch
  • Use pre-trained models for inference
  • Restore models exactly as they were when saved

What’s Inside an .h5 File?

An HDF5 model file contains:

  • The entire model architecture
  • All trained weights
  • Training configuration
  • Model state for resuming training

This makes .h5 files extremely useful for saving complete Keras models.


What Is a “Loaded Model”?

When you run:

model = load_model("../models/emotion_model.h5")
print("Model loaded successfully!")

Behind the scenes:

  1. The .h5 file is read
  2. The computational graph is reconstructed
  3. All weights are loaded into memory
  4. Optimizer state is restored (if included)

What the variable model now contains:

  • A complete Keras model object
  • All layers and trained weights
  • Ready-to-use inference functions like model.predict()
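
A quick sanity check you can run right after loading: feed a random tensor of the right shape and confirm the output has one probability per emotion class. This is a sketch that assumes a single-input model; the 7-class output is an assumption about this particular model:

```python
import numpy as np

# Build a random tensor with the model's expected input shape (batch of 1)
dummy = np.random.rand(1, *model.input_shape[1:]).astype("float32")

probs = model.predict(dummy)
print(probs.shape)  # e.g. (1, 7): one probability per emotion class
print(probs.sum())  # ~1.0 if the last layer is a softmax
```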

Why print a success message?

  • Confirms the model was loaded without errors
  • Helps with debugging
  • Signals that inference can start

Does Keras Contain Many Models?

Keras itself does not include many pre-trained models, but it does provide:

Tools:

  • Sequential and Functional APIs
  • Common layers like Dense, Conv2D, LSTM
  • Loss functions, optimizers, and metrics

Sources of pre-trained models:

  • TensorFlow Hub
  • Keras Applications (e.g., VGG, ResNet, MobileNet)
  • Research paper implementations
  • Your own trained models (like emotion_model.h5)
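
For example, a Keras Applications model can be loaded in one line (this downloads the ImageNet weights on first use; MobileNetV2 is just one option):

```python
from tensorflow.keras.applications import MobileNetV2

# A pre-trained ImageNet backbone without its classification head
backbone = MobileNetV2(weights="imagenet", include_top=False, input_shape=(224, 224, 3))
print("Feature map shape:", backbone.output_shape)
```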

Common Keras Model Types:

  • Sequential models: Simple linear stacks
  • Functional API models: Complex, multi-branch architectures
  • Subclassed models: Fully custom models using Python class inheritance


STEP 4 — Predict Emotions from Images (TensorFlow and Keras)

Why Use TensorFlow and Keras?

TensorFlow:

  • What it is: An open-source machine learning framework developed by Google for building and training neural networks.
  • Why we use it: TensorFlow provides a comprehensive ecosystem for machine learning, including tools for building models, training them, and deploying them in production.
  • What it provides: TensorFlow is the backbone, providing the low-level infrastructure for building, training, and running machine learning models, including:
    • Core libraries for defining and executing computational graphs.
    • GPU/CPU optimization and hardware acceleration.
    • Tools for distributed training and deployment.

Analogy: Think of TensorFlow as the engine of a car. It handles all the complex mechanics under the hood.

Keras:

  • What it is: A high-level neural networks API, written in Python and capable of running on top of TensorFlow.
  • Why we use it: Keras simplifies building and training deep learning models. It provides a user-friendly interface for defining models, adding layers, and compiling models for training.
  • What it provides: Keras is a high-level API built on top of TensorFlow, with user-friendly tools to:
    • Define, train, and evaluate neural networks with minimal code.
    • Load and save models (like your emotion_model.h5).
    • Preprocess data and make predictions easily.

Analogy: Keras is like the steering wheel and dashboard of the car. It makes driving (using machine learning) much easier without needing to understand the engine's inner workings.

Pre-trained Models:

  • Does Keras contain pre-trained models? Keras itself does not come with pre-trained models for specific tasks like emotion recognition, but it provides the tools to load and use them.
  • Where to find pre-trained models? On platforms like GitHub, Kaggle, or TensorFlow Hub. For emotion recognition, you may need to download a pre-trained model file (like emotion_model.h5) from one of these sources or train your own model.

After loading the model (STEP 3), typical steps:

  1. Preprocess images (resize, normalize, convert to grayscale if required).
  2. Pass images through the model with model.predict().
  3. Interpret output probabilities and map them to emotion labels (e.g., happy, sad, etc.).

The model is the "brain"; MediaPipe is the "eyes" that find faces.

Code

import cv2
import numpy as np
from tensorflow.keras.models import load_model

# Load the model
model = load_model("../models/emotion_model.h5")

# Load and preprocess the image
image_path = "../data/test_images/your_photo.jpg"  # Replace with your image path
image = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)  # Load as grayscale
image = cv2.resize(image, (48, 48))  # Resize to 48x48 (a common input size; check your model's expected input)
image = np.expand_dims(image, axis=0)  # Add batch dimension
image = np.expand_dims(image, axis=-1)  # Add channel dimension

# Predict emotion
emotion_prediction = model.predict(image)
emotion_label = np.argmax(emotion_prediction)
emotion_labels = ["Angry", "Disgust", "Fear", "Happy", "Sad", "Surprise", "Neutral"]
print(f"Predicted Emotion: {emotion_labels[emotion_label]}")
A more complete version of the same script, with input checks, normalization, and diagnostic output:

import cv2
import numpy as np
from tensorflow.keras.models import load_model

# Load the model
model = load_model("/home/im_ane/AI_emotion_recognition/models/emotion_model.h5")
print("✅ Model loaded successfully!")

# Load and preprocess the image
image_path = "/home/im_ane/AI_emotion_recognition/data/test_images/ana.jpg"
image = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)

if image is None:
    print("❌ Error: Could not load image!")
    exit()

print(f"📐 Original image size: {image.shape}")

# Resize to 64x64 (what the model expects)
image = cv2.resize(image, (64, 64))
print(f"📐 Resized image size: {image.shape}")

# Normalize pixel values
image = image.astype('float32') / 255.0

# Add dimensions for model input
image = np.expand_dims(image, axis=0)    # Add batch dimension
image = np.expand_dims(image, axis=-1)   # Add channel dimension

print(f"🎯 Final input shape: {image.shape}")

# Predict emotion
print("🧠 Making prediction...")
emotion_prediction = model.predict(image)
emotion_label = np.argmax(emotion_prediction)
confidence = np.max(emotion_prediction)

emotion_labels = ["Angry", "Disgust", "Fear", "Happy", "Sad", "Surprise", "Neutral"]

print("\n🎭 EMOTION RECOGNITION RESULT:")
print("=" * 40)
print(f"✅ Predicted Emotion: {emotion_labels[emotion_label]}")
print(f"📊 Confidence: {confidence:.2%}")

print("\n📈 All probabilities:")
for i, emotion in enumerate(emotion_labels):
    prob = emotion_prediction[0][i]
    print(f"  {emotion:9}: {prob:.4f} ({prob:.1%})")
What does .h5 mean?
.h5 files are HDF5 (Hierarchical Data Format, version 5) files - they're used to store:

Model architecture (layer configurations)

Model weights (learned parameters)

Training configuration (optimizer, loss function)

Optimizer state (for continuing training)

Think of it as a complete saved model package that you can load and use immediately without retraining.
---

## Notes & Tips

* Verify your model input shape (use `model.summary()` to check). Common formats:

  * `48×48 grayscale` with shape `(1, 48, 48, 1)` (FER2013-style models).
  * `224×224 RGB` for transfer-learning models (e.g., MobileNet).
* Normalize pixel values if the model expects `0–1` inputs: `image = image / 255.0`.
* Use MediaPipe to crop face regions first, then pass the face patch to the emotion model.
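
A sketch of that crop-then-classify flow, reusing `image` and `results` from STEP 2 and `model` and `emotion_labels` from STEP 4. The 64×64 grayscale input is an assumption about this particular model; check `model.summary()` first:

```python
import cv2
import numpy as np

h, w, _ = image.shape  # original BGR image loaded with cv2.imread()

if results.detections:
    for detection in results.detections:
        box = detection.location_data.relative_bounding_box  # relative coords in [0, 1]
        x, y = max(int(box.xmin * w), 0), max(int(box.ymin * h), 0)
        bw, bh = int(box.width * w), int(box.height * h)

        face = image[y:y + bh, x:x + bw]                             # crop the face patch
        face = cv2.cvtColor(face, cv2.COLOR_BGR2GRAY)                # grayscale
        face = cv2.resize(face, (64, 64)).astype("float32") / 255.0  # resize + normalize
        face = face[np.newaxis, :, :, np.newaxis]                    # shape (1, 64, 64, 1)

        probs = model.predict(face)
        print("Emotion:", emotion_labels[int(np.argmax(probs))])
```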

---

## Final Recap

This README covers:

* Loading and displaying images (OpenCV + Matplotlib).
* Using MediaPipe to detect faces.
* Loading a saved Keras model.
* An end-to-end predict example for emotion classification.

## STEP 5: Draw a rectangle on the face with OpenCV + haarcascade_frontalface_default
![alt text](Readme_images/image4.png)
```python
import cv2
import os

# Load cascade with full path
face_cascade = cv2.CascadeClassifier(cv2.data.haarcascades + 'haarcascade_frontalface_default.xml')

# Check if cascade loaded
if face_cascade.empty():
    print("Error loading cascade classifier")
    exit()

# Check if image exists
if not os.path.exists('/home/im_ane/AI_emotion_recognition/data/test_images/ana.jpg'):
    print("Image file not found!")
    exit()

# Read image
img = cv2.imread('/home/im_ane/AI_emotion_recognition/data/test_images/ana.jpg')
if img is None:
    print("Error reading image")
    exit()

# Convert to grayscale
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

# Detect faces
faces = face_cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=4)  # scaleFactor: image pyramid step; minNeighbors: detection strictness
print(f"Found {len(faces)} face(s)")

# Draw rectangles
for (x, y, w, h) in faces:
    cv2.rectangle(img, (x, y), (x + w, y + h), (255, 0, 0), 2)

# Display
cv2.imshow('img', img)
cv2.waitKey(0)
cv2.destroyAllWindows()
```

STEP 6 — Open your webcam using OpenCV

import cv2

# Initialize webcam (index 0)
cap = cv2.VideoCapture(0)

# Check if camera opened successfully
if not cap.isOpened():
    print("Error: Could not open camera.")
    exit()

while True:
    ret, frame = cap.read()

    if not ret:
        print("Error: Failed to grab frame.")
        break

    cv2.imshow("Webcam Feed", frame)

    # Exit on 'q' key press
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

# Cleanup
cap.release()
cv2.destroyAllWindows()

Detailed Explanation:

import cv2:

Imports the OpenCV (Open Source Computer Vision Library) module, which provides tools for real-time computer vision.

cap = cv2.VideoCapture(0):

Creates a video capture object to access the webcam. 0 is the default camera index. If you have multiple cameras, you can try 1, 2, etc.

if not cap.isOpened()::

Checks if the camera opened successfully. If not, it prints an error message and exits.

while True::

Starts an infinite loop to continuously capture and display frames.

ret, frame = cap.read():

cap.read() reads a frame from the video capture. ret is a boolean indicating if the frame was read successfully. frame contains the actual image data (a NumPy array).

if not ret::

Checks if the frame was read successfully. If not, it prints an error message and breaks the loop.

cv2.imshow("Webcam Feed", frame):

Displays the frame in a window titled "Webcam Feed".

if cv2.waitKey(1) & 0xFF == ord('q')::

cv2.waitKey(1) waits for 1 millisecond for a keyboard event. & 0xFF is a bitwise operation to handle different OS key representations. ord('q') gets the ASCII value of 'q'. If 'q' is pressed, the loop breaks and the program exits.

cap.release():

Releases the video capture object and frees resources.

cv2.destroyAllWindows():

Closes all OpenCV windows.

STEP 7 — Face Detection with MediaPipe (in the webcam loop)

# Initialize MediaPipe Face Detection
mp_face_detection = mp.solutions.face_detection
face_detection = mp_face_detection.FaceDetection(min_detection_confidence=0.5)

# Inside the webcam loop: convert the BGR frame to RGB, then run detection
rgb_frame = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
results = face_detection.process(rgb_frame)
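
A minimal sketch of the full loop, combining the webcam capture from the previous step with MediaPipe detection and its drawing helper:

```python
import cv2
import mediapipe as mp

mp_face_detection = mp.solutions.face_detection
mp_drawing = mp.solutions.drawing_utils
face_detection = mp_face_detection.FaceDetection(min_detection_confidence=0.5)

cap = cv2.VideoCapture(0)
while True:
    ret, frame = cap.read()
    if not ret:
        break

    rgb_frame = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)  # MediaPipe expects RGB
    results = face_detection.process(rgb_frame)

    if results.detections:
        for detection in results.detections:
            mp_drawing.draw_detection(frame, detection)  # draw the box on the BGR frame

    cv2.imshow("MediaPipe Face Detection", frame)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

cap.release()
cv2.destroyAllWindows()
```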

DeepFace

DeepFace is a lightweight face recognition and facial attribute analysis framework for Python. It is built on top of popular deep learning frameworks like TensorFlow and Keras, and it uses OpenCV for image processing.

Key Features:

  • Face Detection: uses MTCNN, OpenCV's Haar cascades, Dlib, or SSD to detect faces in images.
  • Face Recognition: uses FaceNet, VGGFace, OpenFace, or DeepID to recognize faces.
  • Facial Attribute Analysis: can detect emotion, age, gender, and race.
  • Easy-to-Use API: provides simple functions for complex tasks.

How DeepFace Works Internally:

Face Detection:

Uses OpenCV's Haar cascades by default (but can use others). Detects faces in an image and extracts face regions.

Face Alignment:

Aligns detected faces to a standard format.

Feature Extraction:

Uses deep learning models to extract facial features.

Analysis:

For emotion analysis, it uses a pre-trained CNN model. The model outputs probabilities for different emotions.

Does DeepFace Contain OpenCV and TensorFlow?

Yes, DeepFace is built on top of:

  • OpenCV: for image processing and face detection.
  • TensorFlow/Keras: for the deep learning models.
  • Other libraries: such as NumPy, Pandas, etc.

When you install DeepFace, it automatically installs these dependencies.
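
Since the repo description mentions DeepFace, here is what the one-call emotion analysis looks like. This is a sketch assuming DeepFace is installed (pip install deepface); the exact return type (a dict in older versions, a list of per-face dicts in newer ones) depends on your version, and the image path is the test image used earlier:

```python
from deepface import DeepFace

# Analyze one image for emotion only (age, gender, race can be added to actions)
result = DeepFace.analyze(img_path="data/test_images/ana.jpg", actions=["emotion"])

# Newer DeepFace versions return a list with one entry per detected face
face = result[0] if isinstance(result, list) else result
print("Dominant emotion:", face["dominant_emotion"])
print("All scores:", face["emotion"])
```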

Recommendation

Use DeepFace if:

  • You want a quick and easy solution.
  • You don't need real-time performance.
  • You want multiple features (emotion, age, gender) in one package.

Use MediaPipe + TensorFlow if:

  • You need real-time performance.
  • You want more control over the process.
  • You're building a custom solution or need to integrate with other systems.

Face Detection <==> MediaPipe

Emotion Recognition <==> TensorFlow

What does TensorFlow add?

TensorFlow adds emotion recognition capabilities on top of the face detection pipeline:

1. Model Loading: loads a pre-trained neural network model for emotion classification.

model = load_model("/home/im_ane/AI_emotion_recognition/models/emotion_model.h5")

2. Emotion Labels: defines the possible emotion categories.

emotion_labels = ["Angry", "Disgust", "Fear", "Happy", "Sad", "Surprise", "Neutral"]

3. Face Preprocessing: converts the face region to grayscale, resizes it to match the model's expected input (64×64), and normalizes pixel values to the [0, 1] range.

face_roi_gray = cv2.cvtColor(face_roi, cv2.COLOR_BGR2GRAY)
face_roi_processed = cv2.resize(face_roi_gray, (64, 64))
face_roi_processed = np.expand_dims(face_roi_processed, axis=0)
face_roi_processed = np.expand_dims(face_roi_processed, axis=-1)
face_roi_processed = face_roi_processed.astype('float32') / 255.0

4. Emotion Prediction: uses the model to predict emotion probabilities, then takes the most likely emotion and its confidence score.

emotion_prediction = model.predict(face_roi_processed)
emotion_label = np.argmax(emotion_prediction)
confidence = emotion_prediction[0][emotion_label]

5. Visualization: displays the predicted emotion and confidence on the video frame.

emotion_text = emotion_labels[emotion_label]
cv2.putText(frame, f"{emotion_text} ({confidence:.2f})", (x, y - 10),
            cv2.FONT_HERSHEY_SIMPLEX, 0.7, (0, 255, 0), 2)

What are Haar Cascades?

Haar Cascades are object detection algorithms used to identify faces in images or video. They work by:

  1. Training: using positive (face) and negative (non-face) images to create a cascade of classifiers.
  2. Detection: sliding a window across the image and applying the cascade of classifiers to detect faces.

The Complete Pipeline (a code sketch follows below):

  • Capture a video frame.
  • Convert it to grayscale.
  • Haar Cascade detects faces.
  • For each face:
    • Extract the face region.
    • Preprocess it (resize, normalize).
    • The .h5 model predicts the emotion.
    • Draw a rectangle and the emotion label.
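
A minimal sketch of that pipeline as one script, tying STEP 5 (Haar cascade) and STEP 4 (the Keras model) into the webcam loop. The model path, the 64×64 grayscale input, and the label order are assumptions carried over from the earlier steps; check them against your own model with model.summary():

```python
import cv2
import numpy as np
from tensorflow.keras.models import load_model

model = load_model("models/emotion_model.h5")  # adjust to your model path
emotion_labels = ["Angry", "Disgust", "Fear", "Happy", "Sad", "Surprise", "Neutral"]
face_cascade = cv2.CascadeClassifier(cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

cap = cv2.VideoCapture(0)
while True:
    ret, frame = cap.read()
    if not ret:
        break

    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    for (x, y, w, h) in face_cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=4):
        # Preprocess the face patch exactly like in STEP 4
        face = cv2.resize(gray[y:y + h, x:x + w], (64, 64)).astype("float32") / 255.0
        face = face[np.newaxis, :, :, np.newaxis]  # shape (1, 64, 64, 1)

        probs = model.predict(face, verbose=0)
        label = emotion_labels[int(np.argmax(probs))]
        confidence = float(np.max(probs))

        cv2.rectangle(frame, (x, y), (x + w, y + h), (255, 0, 0), 2)
        cv2.putText(frame, f"{label} ({confidence:.2f})", (x, y - 10),
                    cv2.FONT_HERSHEY_SIMPLEX, 0.7, (0, 255, 0), 2)

    cv2.imshow("Emotion Recognition", frame)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

cap.release()
cv2.destroyAllWindows()
```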

🥳🥳🥳haaappyyyyyy🥳🥳🥳


About

Detecting emotions from your webcam or any image using DeepFace.