
Code from pose-eval paper #30


Draft: wants to merge 158 commits into main

Conversation

@cleong110 (Contributor) commented Apr 16, 2025

Code to:

  1. Collect multiple datasets into a common DataFrame/CSV format, with GLOSS, POSE_FILE_PATH, SPLIT, and unique VIDEO_ID columns
  2. Parse ASL Citizen, Sem-Lex, and PopSign ASL into that common CSV format
  3. Load splits from all three datasets, e.g. all the train/val splits or just the test sets
  4. Construct metrics automatically by generating combinations of distance measure, keypoint selection, sequence alignment, etc., resulting in dozens of metrics (see the sketch after this list)
  5. Run the "in-Gloss+4x Outgloss" evaluation
  6. Save the results to a specified folder as CSVs
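
As a rough illustration of item 4, the combination idea is essentially a cross-product over options. The option values below are placeholders, not the actual pose_evaluation classes or flags:

# Sketch only: generate metric configurations as a cross-product of options.
# The option values are illustrative placeholders, not the real API.
from itertools import product

distance_measures = ["dtw", "mean_l2"]                 # placeholder names
keypoint_selections = ["hands", "reduced_holistic"]    # placeholder names
sequence_alignments = ["zero_pad_shorter", "none"]     # placeholder names

metric_configs = [
    {"distance": d, "keypoints": k, "alignment": a}
    for d, k, a in product(distance_measures, keypoint_selections, sequence_alignments)
]
print(len(metric_configs), "metric configurations")  # 2 * 2 * 2 = 8 in this toy example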

Example usage:

Clone and set up the environment

# clone the repo, and checkout this branch
# cd into the repo
conda create -n pose_eval_src pip
conda activate pose_eval_src
# which pip should show the pip inside the env
which pip

# install the package in editable mode
pip install -U -e .

Then generate the dataset CSV files

python pose_evaluation/evaluation/dataset_parsing/popsign_to_df.py ~/data/PopSignASL/ --out ~/projects/pose-evaluation/dataset_dfs/popsign_asl.csv
python pose_evaluation/evaluation/dataset_parsing/sem_lex_to_dataframe.py ~/data/Sem-Lex/ --out dataset_dfs/semlex.csv
python pose_evaluation/evaluation/dataset_parsing/asl_citizen_to_dataframe.py ~/data/ASL_Citizen/ --pose-files-path ~/data/ASL_Citizen/poses/ --metadata-path ~/data/ASL_Citizen/splits/ --out dataset_dfs/asl-citizen.csv 

Note that the PopSign ASL parser can optionally "translate" some, but not all, of the glosses if given a path to the ASL Knowledge Graph via --asl-knowledge-graph-path (see #28):

python pose_evaluation/evaluation/dataset_parsing/popsign_to_df.py ~/data/PopSignASL/ --out dataset_dfs/popsign_asl.csv --asl-knowledge-graph-path ~/data/ASLKG/edges_v2_noweights.tsv

Then load them and run metrics

python pose_evaluation/evaluation/load_splits_and_run_metrics.py dataset_dfs/*.csv

# usage instructions
python pose_evaluation/evaluation/load_splits_and_run_metrics.py --help
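
For reference, the generated dataset CSVs can also be loaded directly with pandas and filtered by the SPLIT column described above. A minimal sketch, assuming split labels like "train"/"val"/"test":

# Sketch: load the dataset CSVs and select splits via the SPLIT column.
# The folder path and split label values are assumptions, not guaranteed by the scripts.
from pathlib import Path
import pandas as pd

frames = [pd.read_csv(p) for p in Path("dataset_dfs").glob("*.csv")]
df = pd.concat(frames, ignore_index=True)

test_df = df[df["SPLIT"] == "test"]                    # just the test sets
train_val_df = df[df["SPLIT"].isin(["train", "val"])]  # all the train/val data
print(test_df[["VIDEO_ID", "GLOSS", "POSE_FILE_PATH"]].head())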

Analysis (TODO)

A script that will load all the score CSV files and run analysis.
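
A starting point could be as simple as concatenating the score CSVs. This is only a sketch, since the score-file schema and output folder aren't specified here:

# Sketch for the TODO analysis step: gather all score CSVs and concatenate them.
# The "scores" folder name is an example; column names depend on the score files.
from pathlib import Path
import pandas as pd

score_files = sorted(Path("scores").glob("*.csv"))
scores = pd.concat((pd.read_csv(f) for f in score_files), ignore_index=True)
print(len(score_files), "score files,", len(scores), "rows")
print(scores.describe())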

@cleong110 (Contributor, Author):
Something else I'm realizing as I construct metrics: it would be nice if some of them automatically populated their own preprocessors based on the DistanceMeasure. For example, DTW metrics do not need a sequence-alignment preprocessor such as ZeroPadShorter.
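
Roughly what I have in mind, with placeholder classes standing in for the real DistanceMeasure and preprocessor types:

# Sketch of the idea only; these placeholder classes are not the real
# pose_evaluation API.
class DistanceMeasure:                      # placeholder base class
    aligns_sequences = False                # hypothetical flag: does the measure align sequences itself?

class DTWDistanceMeasure(DistanceMeasure):  # placeholder for a DTW-style measure
    aligns_sequences = True

class ZeroPadShorter:                       # placeholder for the sequence-alignment preprocessor
    pass

def default_preprocessors(measure: DistanceMeasure) -> list:
    """DTW-style measures align sequences themselves, so they skip ZeroPadShorter."""
    preprocessors = []
    if not measure.aligns_sequences:
        preprocessors.append(ZeroPadShorter())
    return preprocessors

print(default_preprocessors(DTWDistanceMeasure()))  # [] -- no alignment preprocessor needed
print(default_preprocessors(DistanceMeasure()))     # [<ZeroPadShorter instance>]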

Similarly, dtaidistance needs a strategy for dealing with NaN (masked) values; otherwise we get NaN trajectory distances, which become NaN distances when aggregated. So when one instantiates a metric, one needs either a masked-value preprocessor or a strategy for dealing with them, e.g. returning a default distance if the trajectory distance is NaN.
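
The fallback strategy could look something like this (a hypothetical helper, not existing code; the default value is arbitrary):

# Sketch of the "default distance if the trajectory distance is nan" strategy.
# This is a hypothetical helper, not existing pose_evaluation code.
import math

def distance_with_fallback(compute_distance, traj_a, traj_b, default=10.0):
    """Return compute_distance(traj_a, traj_b), falling back to `default`
    when masked (NaN) keypoints make the result NaN."""
    d = compute_distance(traj_a, traj_b)
    return default if math.isnan(d) else d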

@cleong110 changed the title from "Datasets to dataframes, run scoring, analyze scores" to "Code from pose-eval paper" on May 28, 2025
@AmitMY (Contributor) commented May 30, 2025

All plotting, LaTeX-generation files, etc. should go in another directory, maybe analysis or something; then I'll merge.

The README should be overhauled to show the metrics, and maybe some charts from the paper.

Then code cleanup.

@cleong110 (Contributor, Author) commented Jun 4, 2025
