-
I do agree with everything you wrote and see your problems. Regarding low resolution and blurred faces: we have a story in the backlog to add a blur plugin that will determine the level of blur.
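A common blur metric that such a plugin might use is the variance of the Laplacian: sharp images have strong edges and therefore high variance, while blurred ones do not. This is an assumption about the approach, not the actual CompreFace implementation, and the threshold value is illustrative. A minimal NumPy-only sketch:

```python
import numpy as np

def laplacian_variance(gray: np.ndarray) -> float:
    """Variance of the 4-neighbour Laplacian of a grayscale image.

    Low values indicate a blurry image. Computed with array shifts
    so no OpenCV dependency is needed.
    """
    img = gray.astype(np.float64)
    lap = (img[:-2, 1:-1] + img[2:, 1:-1]
           + img[1:-1, :-2] + img[1:-1, 2:]
           - 4.0 * img[1:-1, 1:-1])
    return float(lap.var())

def is_too_blurry(gray: np.ndarray, threshold: float = 100.0) -> bool:
    # The cutoff is an assumed, per-camera tuning value,
    # not a CompreFace default.
    return laplacian_variance(gray) < threshold
```

On a high-contrast checkerboard the metric is large; on a flat (fully blurred) patch it is zero, so a flat patch is flagged as too blurry.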
-
I am currently using CompreFace via Double Take and Frigate. My question is not specific to that setup, but I am running into issues that seem to be related to both the source image being tested and the source images used for training.
I looked at the few discussions on training, and I am starting to think the documentation needs to explicitly state what makes a good training image: resolutions, example poses, and the don'ts. Below are some of the challenges I have been finding; a few of them probably could have been avoided with some form of training guide.
Garbage in equals garbage out:
While this may be intuitive, and Double Take has some tools for this, I have found that adding even a single low-res image (or one where the face area is very small in total pixels, or one with artifacting like a pull from surveillance footage) seems to cause a drastic skew in results; i.e. low-res training images cause high-confidence matches on almost all faces. I wonder if CompreFace itself should just reject any face below a specific pixel count from training to begin with, as protection against this.
At this point I am basically avoiding training from any Frigate source data unless the subject is filling the frame, free of artifacts, and well lit.
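Until something like that exists server-side, the pixel-count cutoff could be approximated client-side before uploading training images. A sketch assuming the bounding-box shape CompreFace's detection response uses (`x_min`/`y_min`/`x_max`/`y_max`); the `MIN_FACE_AREA` value is an illustrative tuning assumption, not a CompreFace default:

```python
# Reject candidate training faces whose bounding box is too small.
MIN_FACE_AREA = 80 * 80  # assumed cutoff in pixels; tune per camera

def face_area(box: dict) -> int:
    """Area in pixels of a detection bounding box."""
    return (box["x_max"] - box["x_min"]) * (box["y_max"] - box["y_min"])

def usable_for_training(box: dict, min_area: int = MIN_FACE_AREA) -> bool:
    """True if the detected face is large enough to be worth training on."""
    return face_area(box) >= min_area
```

A 30x40-pixel face (1200 px) would be rejected under this cutoff, while a 200x200 one would pass.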
Guidance on what makes a good training image:
Surveillance images as a source will catch people at all angles: from a high angle above, in profile, etc. Initially I was training with mostly dead-on images taken with my higher-quality phone, then I started adding photos looking up, down, left, and right. But as I try to improve detection I am now sometimes getting ears detected (the face box is just the ear) as faces, with high-confidence matches, etc.
Is there guidance on how training images should be taken?
Family resemblance is sometimes a very small difference in % certainty:
From a distance, CompreFace seems to be all over the map in separating our family of five from each other; I am about the only one who is nearly always identified correctly. I frequently get wrong matches at up to 98% confidence. This is with high-quality training images but sometimes low-quality surveillance images at a distance. I have bumped up the min area in Double Take, which cuts these down a lot, but even then I still get a lot of bad ones. Most seem to happen when the still used has smear or artifacting from capturing a subject moving fast or in poor light. I wonder if there is any way CompreFace itself could filter for "poor detail" or noisy images? Most of the time an unknown result would be better than a match.
Overall it is interesting, particularly with my kids, that I am getting matches between 90-98% on the smeared or blurry images instead of a low-confidence match.
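One workaround, until such filtering exists server-side, is to post-process recognition results and demote low-similarity matches to unknown. A sketch assuming the shape of CompreFace's recognition response (a `subjects` list with `subject` and `similarity` fields); the 0.97 cutoff is an illustrative assumption to be tuned per deployment:

```python
UNKNOWN = "unknown"

def resolve_subject(result: dict, min_similarity: float = 0.97) -> str:
    """Return the best-matching subject name, or 'unknown' if the
    top similarity falls below the cutoff."""
    subjects = result.get("subjects", [])
    if not subjects:
        return UNKNOWN
    best = max(subjects, key=lambda s: s["similarity"])
    return best["subject"] if best["similarity"] >= min_similarity else UNKNOWN
```

With a 0.97 cutoff, a 92%-similarity match on a smeared frame becomes unknown instead of a named (and likely wrong) family member.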
Random NOT A FACE things seem to turn up more since I have been adding training images:
I have det_prob_threshold set to 0.85 (the default was 0.8), but I am thinking of setting it higher, as it likes to find faces in video artifacts. Thankfully, most of those come back as unknown, at least since I stopped using the surveillance video as a training source.
Edit: at 0.85 I still had a cup of coffee detected as a face, but pushing it to 0.9 seems to have gotten rid of that. Much like some of the other values, I am surprised I have to push so close to 100% to get rid of some of these false positives.
I have looked at the documentation, and if I missed anything on guidelines for good training, please let me know.
Also, for reference, most of my security cameras are currently lower-end 1080p units, and most problems happen "at a distance", but not always; sometimes it is just a matter of someone looking down or being in profile that causes a high-confidence false positive.
I was playing around with DeepStack as well and noticed confidence levels had much wider ranges even after training; overall, however, I have found CompreFace more accurate.