Recognizes actors/celebrities in a clip or image from any media, using deep learning with Python. Either a CNN or HOG can be used for face detection, and the detected face is then compared against our dataset of known faces.
I have used the fairly comprehensive dataset available here. It only includes celebrity faces up to 2021, so it may need to be updated further if required.
Tons of help from ageitgey's face_recognition library.
Inspired by this wonderful article.
-
Install
Install `cmake`, as it is required for the dlib library.

- For Linux, run `sudo apt-get install cmake`
- For Windows, download the installer from here
- For macOS, run `brew install cmake`

Also, install `pipenv` for managing the virtual environment.

- For Linux, run `sudo pip install pipenv`
- For Windows, run `pip install pipenv`
- For macOS, run `brew install pipenv`

Then, run `pipenv shell` to activate the virtual environment. Finally, run `pipenv install` to install all the dependencies.

Note: the `face-recognition` library does not officially support Windows, but it still might work, as it says in its README.
The dataset has the following structure.
In my implementation, each actor has 25 images. More images would improve accuracy, but this number works fine.
-
For every image in the dataset, we first get a square enclosing the face in the image, then generate a 128-d vector (an encoding) for that face, which is dumped to the `encodings.pickle` file.
We can use either CNN (slower, more accurate) or HOG (faster, less accurate) for face detection. Here I've used the face_recognition library, which offers both options.
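The encoding step can be sketched as follows. This is a minimal illustration, not the repo's actual `faceEncode.py`: the dataset layout (`dataset/actors/<actor_name>/*.jpg`), the pickle structure, and the helper names are my assumptions; the `face_recognition` calls are the library's real API.

```python
import os
import pickle

def encode_dataset(dataset_dir, model="hog"):
    """Detect faces in every image and compute their 128-d encodings.

    `model` is "hog" (faster) or "cnn" (slower, more accurate).
    Hypothetical sketch of what faceEncode.py does.
    """
    import face_recognition  # heavy dependency, imported lazily

    names, encodings = [], []
    for actor in sorted(os.listdir(dataset_dir)):
        actor_dir = os.path.join(dataset_dir, actor)
        if not os.path.isdir(actor_dir):
            continue
        for fname in os.listdir(actor_dir):
            image = face_recognition.load_image_file(os.path.join(actor_dir, fname))
            # boxes are (top, right, bottom, left) squares around each face
            boxes = face_recognition.face_locations(image, model=model)
            # one 128-d vector per detected face
            for enc in face_recognition.face_encodings(image, boxes):
                names.append(actor)
                encodings.append(enc)
    return {"names": names, "encodings": encodings}

def save_encodings(data, path):
    # dump the {"names": [...], "encodings": [...]} dict to disk
    with open(path, "wb") as f:
        pickle.dump(data, f)

def load_encodings(path):
    with open(path, "rb") as f:
        return pickle.load(f)
```

The pickle file then only needs to be regenerated when the dataset changes; recognition scripts just load it.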
For a large dataset, frameworks like MapReduce or Spark can be used to parallelize the process over a cluster of machines.
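On a single machine, the same idea applies across CPU cores with Python's `multiprocessing`, which is presumably what the `-c` flag controls. A hedged sketch, with a hypothetical stand-in worker in place of the real per-image encoding:

```python
from multiprocessing import Pool

def encode_one(path):
    # Hypothetical stand-in for the real per-image work: detect the
    # face and compute its 128-d encoding (e.g. via face_recognition).
    return (path, [0.0] * 128)

def encode_parallel(paths, cores=8):
    # Each worker handles a subset of image paths independently;
    # results come back in the same order as `paths`.
    with Pool(processes=cores) as pool:
        return pool.map(encode_one, paths)
```

Encoding is embarrassingly parallel, since each image is processed independently, which is why this splits cleanly across cores.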
Moreover, use the `-fnn` flag if you want to use the KDTree method for searching, which is much faster than a linear search.
-
Consider an image, be it a still from a movie or a frame of a video clip. First, we detect the faces in the image using the same method as above (CNN or HOG), generate an encoding (a 128-d vector) for each face, and compare it against our collected encodings. The actor with the most matched encodings is the actor in the image.
This search can be either linear or use a KDTree. I've used the KDTree method, which is much faster. This is enabled by passing the `-fnn` flag to the Python file.
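A rough sketch of the KDTree idea, using `scipy.spatial.KDTree` here (the repo's actual implementation may differ): build the tree once over all stored encodings, then answer each nearest-neighbour query in roughly logarithmic time instead of scanning every encoding linearly.

```python
import numpy as np
from scipy.spatial import KDTree

def build_tree(known_encodings):
    # Build once over all stored 128-d encodings; reuse for every query.
    return KDTree(np.asarray(known_encodings))

def nearest_name(tree, known_names, unknown_encoding, tolerance=0.6):
    # Query only the single nearest stored encoding instead of scanning all.
    dist, idx = tree.query(np.asarray(unknown_encoding), k=1)
    return known_names[idx] if dist <= tolerance else "Unknown"
```

The tree costs a one-time build, so the speedup pays off when many faces or frames are queried against the same encodings file.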
Read the first few lines of each Python file to understand the parameters used in each case.
-
`python faceEncode.py --dataset dataset/actors --encodings encodings/encodings.pickle -d hog -c 8`
The `-c` flag is the number of cores to use for parallel processing. You can also pass the `-fnn` flag to later use the KDTree method for searching.
-
`python faceRecImage.py -e encodings.pickle -i examples/ex6.png -d hog -o out/`
Use the `-fnn` flag to use the KDTree method for searching.
-
`python faceRecVideoFile.py -e encodings/encodings.pickle -i input_vids/ex2.mp4 -o output_vids/ex2.avi -y 0 -d hog`
Outputs a video with the faces marked.
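The video pipeline can be sketched as: read frames with OpenCV, shrink each frame before detection for speed, recognize faces on the small copy, scale the boxes back up, draw them, and write the annotated frame out. Everything here (function names, the 0.25 scale factor, the `recognize` callable) is illustrative rather than the repo's exact code.

```python
def scale_box(box, scale):
    """Map a (top, right, bottom, left) box detected on a resized frame
    back to the original frame's coordinates."""
    top, right, bottom, left = box
    return (int(top / scale), int(right / scale),
            int(bottom / scale), int(left / scale))

def annotate_video(in_path, out_path, recognize, scale=0.25):
    # `recognize` is a callable: RGB frame -> list of (name, box) pairs.
    import cv2  # heavy dependency, imported lazily

    cap = cv2.VideoCapture(in_path)
    fourcc = cv2.VideoWriter_fourcc(*"MJPG")
    writer = None
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        # detect on a shrunken copy for speed, then rescale the boxes
        small = cv2.resize(frame, (0, 0), fx=scale, fy=scale)
        rgb = cv2.cvtColor(small, cv2.COLOR_BGR2RGB)  # OpenCV is BGR
        for name, box in recognize(rgb):
            top, right, bottom, left = scale_box(box, scale)
            cv2.rectangle(frame, (left, top), (right, bottom), (0, 255, 0), 2)
            cv2.putText(frame, name, (left, top - 8),
                        cv2.FONT_HERSHEY_SIMPLEX, 0.6, (0, 255, 0), 2)
        if writer is None:
            h, w = frame.shape[:2]
            writer = cv2.VideoWriter(out_path, fourcc, 24, (w, h))
        writer.write(frame)
    cap.release()
    if writer is not None:
        writer.release()
```

Detecting on the quarter-size frame is the usual trick for keeping video processing responsive; only the drawing happens at full resolution.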

