make image similarity check less sensitive#258
Closed
jbitton wants to merge 1 commit into facebookresearch:main from
Conversation
Contributor

This pull request was exported from Phabricator. Differential Revision: D70137163
jbitton added a commit to jbitton/AugLy that referenced this pull request on Feb 25, 2025
Summary: Pull Request resolved: facebookresearch#258

As part of my personal side quest to make AugLy's tests pass again, I am making a change to our tests. Currently, to assess image similarity, we use the `np.allclose` function. While that is better (less sensitive) than an MD5 hash, it is not much better, because imperceptible changes can still produce large differences between the NumPy image arrays. So, to make AugLy's tests less affected by slight version updates in PIL or elsewhere, we are switching to imagehash. Specifically, we are using the perceptual hash (pHash); you can read about it here: https://www.hackerfactor.com/blog/index.php?/archives/432-Looks-Like-It.html

pHash isn't a perfect fit long term, though: it is not sensitive to color, scaling, or aspect-ratio changes. To handle the latter two, I am keeping the size equality check. For color, I want to do more research on an efficient approach. Nonetheless, this is still better than what we have right now.

Differential Revision: D70137163
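The trade-off described above can be sketched with a toy example. This is not AugLy's test code and not the imagehash implementation; `dct_2d`, `toy_phash`, and `hamming` are hypothetical names for a minimal pure-NumPy illustration of why a DCT-based perceptual hash tolerates an imperceptible pixel change that `np.allclose` rejects.

```python
import numpy as np

def dct_2d(a: np.ndarray) -> np.ndarray:
    """Naive 2D type-II DCT via cosine basis matrices (illustration only)."""
    n = a.shape[0]
    j = np.arange(n)
    basis = np.cos(np.pi * (2 * j[None, :] + 1) * j[:, None] / (2 * n))
    return basis @ a @ basis.T

def toy_phash(gray: np.ndarray, hash_size: int = 8) -> np.ndarray:
    """Toy pHash: DCT, keep the low-frequency block, threshold at the median."""
    low = dct_2d(gray.astype(np.float64))[:hash_size, :hash_size]
    return (low > np.median(low)).ravel()   # 64 boolean bits

def hamming(h1: np.ndarray, h2: np.ndarray) -> int:
    return int(np.count_nonzero(h1 != h2))

# A uniform +1 brightness shift is visually imperceptible, yet far outside
# np.allclose's default tolerances (rtol=1e-5, atol=1e-8) for 0..255 pixels.
rng = np.random.default_rng(0)
img = rng.integers(0, 255, (32, 32)).astype(np.float64)
shifted = img + 1.0

print(np.allclose(img, shifted))                    # prints False: strict check fails
print(hamming(toy_phash(img), toy_phash(shifted)))  # small: hash is nearly unchanged
```

With the real library, the same comparison would be on the order of `imagehash.phash(pil_img1) - imagehash.phash(pil_img2)`, since subtracting two `ImageHash` objects yields their Hamming distance; a small threshold on that distance is what makes the test robust to PIL version drift.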
Force-pushed 66eb525 to 7778f95
jbitton added a commit to jbitton/AugLy that referenced this pull request on Feb 26, 2025
Force-pushed 7778f95 to e746ff9
(facebookresearch#258) Summary: As part of my personal side quest to make AugLy's tests pass again, I am making a change to our tests.

## overlay wrap text fix

It seems we were modifying the original image in place for overlay wrap text, which is not the AugLy way (we always copy, modify, and return a new image). This change fixes that, and it also fixes 99% of the image tests.

## imagehash change

Currently, to assess image similarity, we use the `np.allclose` function. While that is better (less sensitive) than an MD5 hash, it is not much better, because imperceptible changes can still produce large differences between the NumPy image arrays. So, to make AugLy's tests less affected by slight version updates in PIL or elsewhere, we are switching to imagehash. Specifically, we are using the perceptual hash (pHash); you can read about it here: https://www.hackerfactor.com/blog/index.php?/archives/432-Looks-Like-It.html

pHash isn't a perfect fit long term, though: it is not sensitive to color, scaling, or aspect-ratio changes. To handle the latter two, I am keeping the size equality check. For color, I want to do more research on an efficient approach. Nonetheless, this is still better than what we have right now.

Reviewed By: joelicohk, mayaliliya

Differential Revision: D70137163
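The "copy + modify + return a new image" convention the commit message describes can be sketched as follows. This is a hypothetical illustration, not AugLy's overlay-wrap-text code: `brighten` is an invented name, and a NumPy array stands in for the PIL image.

```python
import numpy as np

def brighten(image: np.ndarray, delta: int) -> np.ndarray:
    """Never mutate the caller's image: work on a copy and return it.
    astype() already allocates a new array, so the input stays untouched."""
    out = image.astype(np.int16) + delta
    return np.clip(out, 0, 255).astype(np.uint8)

img = np.full((4, 4), 100, dtype=np.uint8)
result = brighten(img, 50)
print(img[0, 0], result[0, 0])   # prints: 100 150 -- the input is unchanged
```

An in-place version (e.g. `image += delta`) would silently corrupt the caller's image, which is exactly the class of bug the overlay-wrap-text fix removed.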
Force-pushed e746ff9 to f2b5f6c
Contributor

This pull request has been merged in 0bfda4c.