make image similarity check less sensitive#258

Closed
jbitton wants to merge 1 commit intofacebookresearch:mainfrom
jbitton:export-D70137163

@jbitton (Contributor) commented Feb 25, 2025

Summary:
as part of my personal side quest to make augly's tests pass again, i am making a change to our tests.

currently, to assess image similarity, we use the `np.allclose` function. while that's less sensitive than an MD5 hash, it's not much better, because visually imperceptible changes can still produce large differences in values between numpy image arrays.
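to make the sensitivity point concrete, here is a minimal sketch. the array contents and the `atol=2` tolerance are illustrative assumptions, not the values augly's tests use. it compares an MD5 check with `np.allclose` on a synthetic image where a single channel value changes.

```python
import hashlib

import numpy as np

# 64x64 RGB image with every channel value set to 100 (illustrative data only).
base = np.full((64, 64, 3), 100, dtype=np.uint8)

# Variant 1: a single channel value is off by one intensity level.
off_by_one = base.copy()
off_by_one[0, 0, 0] += 1

# Variant 2: a single channel value is off by 40 intensity levels.
one_pixel = base.copy()
one_pixel[0, 0, 0] += 40


def md5(arr):
    """Hex MD5 digest of the raw array bytes."""
    return hashlib.md5(arr.tobytes()).hexdigest()


def close(a, b, **kwargs):
    """np.allclose on float copies, so uint8 arithmetic cannot wrap around."""
    return bool(np.allclose(a.astype(np.float64), b.astype(np.float64), **kwargs))


md5_equal = md5(base) == md5(off_by_one)             # hashes differ on any byte change
default_close = close(base, off_by_one)              # default tolerances are far below 1 level
loose_close = close(base, off_by_one, atol=2)        # assumed tolerance of 2 levels
single_pixel_close = close(base, one_pixel, atol=2)  # one large local difference
```

with these inputs the MD5 check and the default `np.allclose` both reject the off-by-one image. the looser tolerance accepts it, but still rejects the image where only one channel value out of 12,288 differs by 40, because `np.allclose` requires every element to be within tolerance.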

thus, to make augly's tests less affected by minor version updates to PIL or other dependencies, we are switching to imagehash.

we're specifically using phash (perceptual hash); you can read about it here: https://www.hackerfactor.com/blog/index.php?/archives/432-Looks-Like-It.html

phash isn't a perfect fit long term, though: it's not sensitive to color, scaling, or aspect ratio changes. to deal with the latter two, i'm keeping the size equality check. for color, i want to do more research into an efficient way to check it. nonetheless, this is still better than what we have right now.

Differential Revision: D70137163

@facebook-github-bot added the CLA Signed label Feb 25, 2025
@facebook-github-bot
Contributor

This pull request was exported from Phabricator. Differential Revision: D70137163


jbitton added a commit to jbitton/AugLy that referenced this pull request Feb 25, 2025
Summary:
Pull Request resolved: facebookresearch#258

jbitton added a commit to jbitton/AugLy that referenced this pull request Feb 26, 2025
facebookresearch#258)

Summary:

as part of my personal side quest to make augly's tests pass again, i am making a change to our tests.

## overlay wrap text fix

seems like we were modifying the original image in place in overlay wrap text, which is not the augly way (we always copy + modify + return a new image). this change fixes that, which also fixes 99% of the image tests.
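for illustration, the copy + modify + return pattern described above might look like this with Pillow. `overlay_text_copy` is a hypothetical stand-in, not augly's actual overlay wrap text implementation.

```python
from PIL import Image, ImageDraw


def overlay_text_copy(image, text, position=(5, 5)):
    """Hypothetical sketch: draw text on a copy so the input image is untouched."""
    result = image.copy()  # copy first
    ImageDraw.Draw(result).text(position, text, fill=(0, 0, 0))  # modify the copy
    return result  # return the new image


original = Image.new("RGB", (120, 40), color=(255, 255, 255))
before = original.tobytes()  # snapshot of the input pixels

augmented = overlay_text_copy(original, "hello")
```

drawing directly on `image` instead of `result` would change the caller's object, so any test that reuses the same input image afterwards would be comparing against already-augmented pixels.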

## imagehash change

currently, to assess image similarity, we use the `np.allclose` function. while that's less sensitive than an MD5 hash, it's not much better, because visually imperceptible changes can still produce large differences in values between numpy image arrays.

thus, to make augly's tests less affected by minor version updates to PIL or other dependencies, we are switching to imagehash.

we're specifically using phash (perceptual hash); you can read about it here: https://www.hackerfactor.com/blog/index.php?/archives/432-Looks-Like-It.html

phash isn't a perfect fit long term, though: it's not sensitive to color, scaling, or aspect ratio changes. to deal with the latter two, i'm keeping the size equality check. for color, i want to do more research into an efficient way to check it. nonetheless, this is still better than what we have right now.

Reviewed By: joelicohk, mayaliliya

Differential Revision: D70137163
@facebook-github-bot
Contributor

This pull request has been merged in 0bfda4c.

Labels

CLA Signed, fb-exported, Merged
