Computer vision project using the TF2 object detection models to detect and classify objects in real time from a webcam. In my case I trained a model to identify whether I was wearing sunglasses or not. Video below.
The model used was SSD MobileNet V2 FPNLite 320x320, downloaded from https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/tf2_detection_zoo.md.
This model has a speed of 22ms and 22.2 COCO mAP. Speed was favoured over precision since the model is used for real-time object detection.
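For context, this is roughly how a model from the TF2 Detection Zoo gets built and restored with the Object Detection API. It's a minimal sketch, not the exact code in this repo (`detect_real_time.py` already handles this); the paths below follow the folder layout used later in this README.

```python
import os
import tensorflow as tf
from object_detection.utils import config_util
from object_detection.builders import model_builder

# Paths follow the my_ssd_mobnet layout used in the steps below - adjust if yours differ
CONFIG_PATH = 'Tensorflow/workspace/models/my_ssd_mobnet/pipeline.config'
CHECKPOINT_PATH = 'Tensorflow/workspace/models/my_ssd_mobnet'

# Build the SSD MobileNet model from its pipeline config
configs = config_util.get_configs_from_pipeline_file(CONFIG_PATH)
detection_model = model_builder.build(model_config=configs['model'], is_training=False)

# Restore the weights from a training checkpoint (ckpt-10 is just an example name)
ckpt = tf.train.Checkpoint(model=detection_model)
ckpt.restore(os.path.join(CHECKPOINT_PATH, 'ckpt-10')).expect_partial()

@tf.function
def detect_fn(image):
    """Preprocess, run inference and post-process a batch of images."""
    image, shapes = detection_model.preprocess(image)
    prediction_dict = detection_model.predict(image, shapes)
    return detection_model.postprocess(prediction_dict, shapes)
```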
- Create a conda environment with `conda create --name [env_name] --file [requirements.txt]`. Download the requirements.txt here
- Enable CUDA and cuDNN for faster training (you can verify that TensorFlow actually sees the GPU with the quick check below)
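A quick way to confirm that CUDA and cuDNN are picked up is to check whether TensorFlow lists any GPUs; the exact driver and toolkit versions you need depend on your TensorFlow version.

```python
import tensorflow as tf

# Should report at least one GPU if CUDA/cuDNN are installed correctly
print("Built with CUDA:", tf.test.is_built_with_cuda())
print("GPUs visible to TensorFlow:", tf.config.list_physical_devices('GPU'))
```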
- Produce at least 40-50 images: half of them with you wearing the object (such as sunglasses, a face mask, a hat) and the other half without it (a small capture script is sketched below if you don't want to take the photos manually)
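If you'd rather script the capture, a short OpenCV loop like the one below does the job. This is only a convenience sketch: the save folder name is made up, not part of the repo, so point it wherever you like.

```python
import os
import time
import uuid
import cv2

SAVE_DIR = 'Tensorflow/workspace/images/collected'  # hypothetical folder, pick your own
os.makedirs(SAVE_DIR, exist_ok=True)

cap = cv2.VideoCapture(0)  # default webcam
try:
    for _ in range(50):  # roughly the 40-50 images suggested above
        ret, frame = cap.read()
        if not ret:
            break
        cv2.imwrite(os.path.join(SAVE_DIR, f'img_{uuid.uuid4().hex}.jpg'), frame)
        cv2.imshow('capture', frame)
        time.sleep(2)  # gives you time to put the object on or take it off
        if cv2.waitKey(1) & 0xFF == ord('q'):
            break
finally:
    cap.release()
    cv2.destroyAllWindows()
```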
- Open `/Tensorflow/labelImg/labelImg.py` with `python /Tensorflow/labelImg/labelImg.py`
- Once the program opens, click on the "Open dir" button and select the folder where all the produced images are located
- Click on "Change Save dir" and make sure it's the same directory as "Open dir"
- Click on View -> Auto Save Mode
- Press "W" and draw a label on top of the area where the object needs to be detected
- Name the label accordingly. In my case, "Glasses" if the glasses are on, or "NoGlasses" if they aren't
- Go to the next image by pressing "D" and repeat the two previous steps until all images are labelled
- Close the application
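labelImg saves one Pascal VOC XML file next to each image. Before moving on, a quick sanity check that every image got an annotation and that the label names are spelled consistently can save a failed training run later. A sketch, assuming the images and XML files sit together in one folder (the folder name is illustrative):

```python
import glob
import os
import xml.etree.ElementTree as ET
from collections import Counter

IMAGE_DIR = 'Tensorflow/workspace/images/collected'  # wherever you labelled the images

# Count how often each label name appears across all Pascal VOC XML files
label_counts = Counter()
xml_files = glob.glob(os.path.join(IMAGE_DIR, '*.xml'))
for xml_path in xml_files:
    root = ET.parse(xml_path).getroot()
    for obj in root.findall('object'):
        label_counts[obj.find('name').text] += 1

images = [p for p in glob.glob(os.path.join(IMAGE_DIR, '*'))
          if p.lower().endswith(('.jpg', '.jpeg', '.png'))]
print('Images:', len(images), '| XML files:', len(xml_files))
print('Label counts:', dict(label_counts))
```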
- Select the pairs of images and XML label files just produced. Move approximately 90% of them into the `\Tensorflow\workspace\images\train` folder and the remaining 10% into the `\Tensorflow\workspace\images\test` folder, making sure there is a good mix of "with and without object" images in both folders (or use the split sketch below)
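Moving the files by hand works fine; if you prefer to script the 90/10 split, something like this sketch does it. It assumes each image has an XML file with the same base name, that the train/test folders already exist, and the source folder name is illustrative.

```python
import glob
import os
import random
import shutil

SOURCE_DIR = 'Tensorflow/workspace/images/collected'  # folder with the labelled images
TRAIN_DIR = 'Tensorflow/workspace/images/train'
TEST_DIR = 'Tensorflow/workspace/images/test'

images = [p for p in glob.glob(os.path.join(SOURCE_DIR, '*'))
          if p.lower().endswith(('.jpg', '.jpeg', '.png'))]
random.shuffle(images)

# 90% train, 10% test; spot-check afterwards that both folders
# contain a mix of "with object" and "without object" images
split = int(0.9 * len(images))
for dest, subset in ((TRAIN_DIR, images[:split]), (TEST_DIR, images[split:])):
    for img_path in subset:
        xml_path = os.path.splitext(img_path)[0] + '.xml'
        shutil.move(img_path, dest)
        shutil.move(xml_path, dest)
```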
- Open `gen_annotation.py` in the root of the repo. On line 14, replace "Glasses" and "NoGlasses" with the names of your labels. Save it.
- Run `gen_annotation.py` with `python gen_annotation.py`
- Run `update_config.py` with `python update_config.py` (a quick way to sanity-check the resulting `pipeline.config` is sketched after the training step below)
- Train the model with `python Tensorflow/models/research/object_detection/model_main_tf2.py --model_dir=Tensorflow/workspace/models/my_ssd_mobnet --pipeline_config_path=Tensorflow/workspace/models/my_ssd_mobnet/pipeline.config --num_train_steps=10000`, where `--num_train_steps` sets how many training steps to run. If at the end the model's loss is below 0.150, it's good enough.
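Before kicking off a long training run, it can be worth checking what `update_config.py` wrote into `pipeline.config`, e.g. that the number of classes matches your labels. A minimal sketch, assuming the standard `object_detection` package and the `my_ssd_mobnet` paths used above:

```python
from object_detection.utils import config_util

CONFIG_PATH = 'Tensorflow/workspace/models/my_ssd_mobnet/pipeline.config'

configs = config_util.get_configs_from_pipeline_file(CONFIG_PATH)
print('num_classes:', configs['model'].ssd.num_classes)  # should match how many labels you defined
print('fine-tune checkpoint:', configs['train_config'].fine_tune_checkpoint)
print('train label map:', configs['train_input_config'].label_map_path)
print('train tfrecord:', configs['train_input_config'].tf_record_input_reader.input_path)
```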
- Go to the `Tensorflow/workspace/models/my_ssd_mobnet/` folder and check which checkpoint was saved last, for example `ckpt-10.index` (or see the sketch at the end of this list to resolve it automatically)
- Open `detect_real_time.py` and replace `'ckpt-11'` on line 28 with the name of your latest checkpoint, so that `ckpt.restore(os.path.join(CHECKPOINT_PATH, 'ckpt-11')).expect_partial()` becomes e.g. `ckpt.restore(os.path.join(CHECKPOINT_PATH, 'ckpt-10')).expect_partial()`. Save the file.
- Run `detect_real_time.py` with `python detect_real_time.py`. Wait for the webcam to open. And enjoy!
- To close, press "q"
- To run the model I've trained with images of myself wearing sunglasses (it will probably not work for you), use the `ckpt-11` checkpoint already present in `Tensorflow/workspace/models/my_ssd_mobnet/`
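As an alternative to reading off the latest `ckpt-N` by hand and editing line 28, TensorFlow can resolve the newest checkpoint in the folder for you. This is just a sketch of that idea, not what `detect_real_time.py` currently does:

```python
import tensorflow as tf

CHECKPOINT_PATH = 'Tensorflow/workspace/models/my_ssd_mobnet'

# Returns e.g. 'Tensorflow/workspace/models/my_ssd_mobnet/ckpt-10' (no .index suffix),
# or None if no checkpoint has been written yet
latest = tf.train.latest_checkpoint(CHECKPOINT_PATH)
print('Latest checkpoint:', latest)

# In detect_real_time.py this could replace the hard-coded name on line 28:
# ckpt.restore(latest).expect_partial()
```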
