Signclip metric #5
Conversation
…ly but I can load in the .npy files
AmitMY left a comment
left initial comments
Having had a chance to read https://github.com/UKPLab/sentence-transformers/blob/master/sentence_transformers/util.py#L31, I am halfway tempted to either copy large chunks wholesale or add it as a dependency and use theirs. I'm going to at least cross-check using this.
If you choose to use their stuff, please add it as a dependency. Do not just copy-paste code :) It would mean we have less code here, less code to maintain, and we are using a robust implementation.
Added sentence-transformers as a dependency and reworked the code to use it. Also updated tests and fixed various pylint complaints. Going to try running pytest again on another machine to see if the outputs look right.
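For illustration, a minimal sketch of the kind of usage this enables (the file names are hypothetical, not the PR's actual code):

```python
# Minimal sketch, not the PR's actual code: score pose embeddings saved
# as .npy files with sentence-transformers' cosine-similarity utility.
import numpy as np
from sentence_transformers import util

hypotheses = np.load("hypothesis_embeddings.npy")  # shape (n, dim), hypothetical file
references = np.load("reference_embeddings.npy")   # shape (m, dim), hypothetical file

# util.cos_sim accepts numpy arrays or tensors and returns an (n, m)
# torch tensor of pairwise cosine similarities.
similarity_matrix = util.cos_sim(hypotheses, references)
print(similarity_matrix.shape)
```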
OK, the pytests run right on the other machine, other than test_score_different_length from test_distance_metric.py. Pylint mostly likes it, though it doesn't appreciate me using the protected functions from sentence-transformers.
Code style suggestions from OpenAI: https://chatgpt.com/share/6750ae24-9b80-800e-82a7-ac888a21a615 I like the following:
All right, ready for the next round, I think. Today I:
```python
if __name__ == "__main__":
    main()
```
did not look at this file yet.
@AmitMY I would appreciate some thoughts on that one and where to go with it. At the moment it basically just does ASL Citizen; I wanted to extend it to others such as SemLex.
So, what this file does is get distances (specifically SignCLIP distances) for asl_citizen?
That is a start, but it would be best if we could do the following:
Given a directory of poses organized by class, for example poses/class/X, iterate over all of the metrics and run them to calculate the k-nearest neighbors for each sample, for classification (see the sketch below).
Then, once we have something like 8 metrics, we can run them all and see which one gives the best classification score (and that would be considered the best metric for form-based comparison of single signs); see https://github.com/sign-language-processing/signwriting-evaluation/blob/main/signwriting_evaluation/evaluation/closest_matches.py#L93-L107
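For concreteness, a minimal sketch of that evaluation loop. The poses/&lt;class&gt;/&lt;file&gt; layout and the metric.score(pose_a, pose_b) interface are assumptions for illustration, not existing code:

```python
# Minimal sketch of the proposed benchmark loop, assuming each metric exposes
# a score(pose_a, pose_b) method that returns a distance (an assumption).
from collections import Counter

def knn_accuracy(poses, labels, metric, k=5):
    """Leave-one-out k-NN classification accuracy under one distance metric."""
    correct = 0
    for i, query in enumerate(poses):
        # distance from the query to every other sample, paired with its label
        scored = [(metric.score(query, pose), labels[j])
                  for j, pose in enumerate(poses) if j != i]
        nearest = sorted(scored, key=lambda pair: pair[0])[:k]
        predicted = Counter(label for _, label in nearest).most_common(1)[0][0]
        correct += int(predicted == labels[i])
    return correct / len(poses)

# Usage idea: load (pose, class_name) pairs from poses/<class>/*.pose, then
# report knn_accuracy(poses, labels, metric) for every metric and compare.
```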
Another example would be a directory of poses, poses-dgs/, where each pose has a .txt file associated with it. Let's assume 1000 sentences in German Sign Language and German.
Then we can perform an all-to-all similarity between the poses and an all-to-all similarity between the texts (using xCOMET, for example) and run a correlation study. Whichever metric correlates best with xCOMET is the best metric for semantic sentence comparison.
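A rough sketch of that correlation step, assuming the two all-to-all similarity matrices have already been computed (the function name is hypothetical):

```python
# Minimal sketch of the proposed correlation study, assuming pose_sim and
# text_sim are all-to-all similarity matrices over the same sentences
# (text_sim e.g. from xCOMET scores). Not existing code in this repo.
import numpy as np
from scipy.stats import spearmanr

def metric_correlation(pose_sim: np.ndarray, text_sim: np.ndarray) -> float:
    """Spearman correlation between the upper-triangular (off-diagonal)
    entries of two all-to-all similarity matrices."""
    rows, cols = np.triu_indices_from(pose_sim, k=1)
    return spearmanr(pose_sim[rows, cols], text_sim[rows, cols]).correlation
```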
What I am trying to say is: we develop a generic evaluation that, for one dataset type, runs all the metrics and correlates them with something,
and then we can perform this on many datasets and inform the reader about the best metrics.
Then, when someone comes and says "I developed a new metric", they can run it on everything, like GLUE basically, and we can see the upsides and downsides.
Oh yes, I quite like the second idea, which corresponds to a recent reading:
https://phillipi.github.io/prh/
This way, even if the poses and the text do not share the same latent space (unlike in SignCLIP), we can still use a similarity kernel to calculate the correlation. Alignment to text offers a generic way of doing semantic comparison, while the metric I am working on with the human study is tailored to machine translation.
So in the end, we can evaluate a metric given either (1) a bunch of grouped poses or (2) a bunch of poses paired with text.
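One way to compare kernels without a shared latent space is the mutual k-nearest-neighbor overlap used in the PRH line of work; a rough sketch (my paraphrase of the idea, not code from this PR):

```python
# Minimal sketch of a kernel-alignment check in the spirit of the PRH page
# linked above: compare the neighborhoods induced by each space's own
# similarity kernel via mutual k-NN overlap. Assumes row i of each matrix
# is the similarity of sample i to all samples (so index i itself is the
# most similar entry and gets skipped).
import numpy as np

def mutual_knn_alignment(pose_sim: np.ndarray, text_sim: np.ndarray, k: int = 10) -> float:
    """Average overlap between each sample's k nearest neighbors in the
    pose-similarity space and in the text-similarity space."""
    n = pose_sim.shape[0]
    overlaps = []
    for i in range(n):
        pose_nn = set(np.argsort(-pose_sim[i])[1:k + 1])  # descending, skip self
        text_nn = set(np.argsort(-text_sim[i])[1:k + 1])
        overlaps.append(len(pose_nn & text_nn) / k)
    return float(np.mean(overlaps))
```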
(Let's take it step by step; we do not need to have them all in this PR!)
… dtype arg, rename set_device, etc
AmitMY left a comment
Next, I will look into evaluate_signclip. It doesn't have to be in the same PR; that's why I reviewed in the order I did.
…ance_matrix_shape_checker, removing redundant args documentation,
AmitMY left a comment
I am mostly happy with this PR. It can still be modified in the future, once we have more metrics, etc. Thanks a lot for all the hard work!