Post-Study technical debt #32

@cleong110

Description

After the recent pose-evaluation study, we have a good amount of cleanup, PRs to merge, and technical debt to pay down. To wit:

Things to Merge into main

  • Merge the pose metrics we used, especially the ones from the final round of 48 or 1200.
  • Merge the embedding metrics.
  • Merge any pose processors we used
  • The metric names and metric signatures are long and unwieldy. Perhaps we can add improved automatic metric names as part of the classes themselves, e.g. "DTW + Trim"? Some code for this is in interpret_names.py. (A sketch of the idea follows this list.)
  • Add appropriate tests for all the above
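
As a sketch of what self-describing short names could look like; the class and attribute names here are hypothetical, not the actual pose-evaluation classes:

```python
# Hypothetical sketch: compose a short, human-readable name from a metric's
# components instead of the long generated signatures.
from dataclasses import dataclass, field


@dataclass
class DistanceMetric:
    base_name: str  # e.g. "DTW"
    preprocessors: list[str] = field(default_factory=list)  # e.g. ["Trim"]

    @property
    def short_name(self) -> str:
        # Short display name like "DTW + Trim", composed from the parts.
        return " + ".join([self.base_name, *self.preprocessors])


metric = DistanceMetric("DTW", ["Trim"])
print(metric.short_name)  # DTW + Trim
```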

Evaluation/Trial code

  • Phase out the use of glosses in filenames; some glosses contain spaces, slashes, or other unusual characters.
  • Switch to Parquet generally; CSV files can cause datatype issues, e.g. the gloss "TRUE" turning a column to bool, or commas and spaces in glosses breaking parsing. (A sketch follows this list.)
  • Our "create metrics" script is messy, and we use a "generate all possible metrics and then filter" approach. We may wish to merge or refactor it.
  • The evaluation script "load_splits_and_run_metrics" should potentially be broken up into two: one for full-matrix trials, one for "target + confuser" trials. (See the subcommand sketch below.)
  • The evaluation script's solution for ensuring consistency between metrics involves loading a "{gloss}_in.csv" and "{gloss}_out.csv", but (a) that currently crashes if the files don't already exist, so you need to comment that step out and run all glosses first, and (b) we can't use glosses in filenames or we run into issues like "wash_dishes" being parsed as "wash" and "dishes_in".
  • Much of the code uses absolute paths, e.g. to data files like /opt/home/cleong/data/ASL_Citizen/poses/pose/000017451997373907346-LIBRARY.pose
  • Those absolute paths come from a set of "dataset_parsing" scripts custom-written for ASL Citizen, Sem-Lex, and PopSign ASL, which basically join big lists of files with each dataset's metadata into a big DataFrame, stored as a .csv in the "dataset_dfs" folder. These need an overhaul to make the process simpler and more consistent, perhaps by downloading the .pose files and embeddings from somewhere. (See the dataset-parsing sketch after the next paragraph.)
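
Both the filename and CSV problems go away if score files are keyed on stable IDs and written as Parquet. A minimal sketch, assuming pandas with a Parquet engine (pyarrow) installed; the helper and column names are hypothetical, not the actual pose-evaluation code:

```python
# Hypothetical sketch: key score files on a numeric gloss ID instead of the
# gloss string, and round-trip tables through Parquet so dtypes survive.
import pandas as pd


def score_file_name(gloss_id: int, direction: str) -> str:
    """Build a filename like 'gloss0042_in.parquet' from a numeric ID.

    IDs sidestep glosses such as 'wash_dishes' (underscores collide with the
    '_in'/'_out' suffix parsing) and 'TRUE' (looks like a bool to CSV readers).
    """
    return f"gloss{gloss_id:04d}_{direction}.parquet"


df = pd.DataFrame({"GLOSS": ["TRUE", "wash_dishes"], "SCORE": [0.9, 0.4]})

# Parquet stores column types explicitly; a naive CSV round trip can coerce
# a gloss column containing only TRUE/FALSE values to bool.
df.to_parquet(score_file_name(42, "in"))
restored = pd.read_parquet(score_file_name(42, "in"))
print(restored["GLOSS"].tolist())  # ['TRUE', 'wash_dishes'], still strings
```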
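
For the script split, one option is two subcommands over a shared loader rather than two separate scripts. A sketch with argparse; the subcommand and flag names are made up for illustration:

```python
# Hypothetical sketch: expose the two trial types as subcommands of one CLI.
import argparse


def run_full_matrix(args: argparse.Namespace) -> None:
    # Placeholder for scoring every hyp against every ref in the split.
    print(f"full matrix on split={args.split}")


def run_target_confuser(args: argparse.Namespace) -> None:
    # Placeholder for scoring target + confuser trials only.
    print(f"target + confuser on split={args.split}")


def main() -> None:
    parser = argparse.ArgumentParser(prog="run_metrics")
    sub = parser.add_subparsers(required=True)

    full = sub.add_parser("full-matrix", help="score every hyp against every ref")
    full.add_argument("--split", default="test")
    full.set_defaults(func=run_full_matrix)

    trials = sub.add_parser("target-confuser", help="score target + confuser trials")
    trials.add_argument("--split", default="test")
    trials.set_defaults(func=run_target_confuser)

    args = parser.parse_args()
    args.func(args)


if __name__ == "__main__":
    main()
```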

The dataset parsing/loading for ASL Citizen, Sem-Lex, and PopSign ASL works, but has a number of issues, mostly to do with gloss vocabularies.
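
A rough sketch of what a simpler, consistent parsing step could look like; the directory layout, column names, and the assumption that the filename stem is the video ID are all hypothetical:

```python
# Hypothetical sketch: one shared function that joins a .pose file listing
# with a dataset's metadata on the video ID and saves the result as Parquet.
from pathlib import Path

import pandas as pd


def build_dataset_df(pose_dir: Path, metadata_csv: Path, out_path: Path) -> pd.DataFrame:
    pose_files = sorted(pose_dir.glob("*.pose"))
    # One row per file; the filename stem is assumed to hold the video ID.
    files = pd.DataFrame(
        {
            "VIDEO_ID": [p.stem for p in pose_files],
            "POSE_PATH": [str(p) for p in pose_files],
        }
    )
    metadata = pd.read_csv(metadata_csv, dtype=str)  # keep glosses as strings
    df = files.merge(metadata, on="VIDEO_ID", how="inner")
    df.to_parquet(out_path)  # Parquet instead of a .csv in "dataset_dfs"
    return df
```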

Embeddings:

  • The evaluation code has a bit of a kludge: if a metric is an embedding metric, it goes and loads the appropriate embeddings.
  • Fill in missing embeddings: some of the poses could not be embedded because they are over 300 frames long, so their embeddings are missing. My current workaround is to prune scores instead: copy only the score files covering the 169 glosses to a new folder, delete score files for metrics that don't have all 169 glosses, convert to pyarrow with the load_parquets script, then use my load_pyarrow script, which filters out hyps/refs that aren't present in all metrics. (A sketch of that last filtering step follows.)
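
The copy-and-delete workaround could eventually collapse into one filtering step. A sketch of the final "keep only hyps/refs present in all metrics" part, assuming one score DataFrame per metric with HYP and REF columns (the column names are hypothetical):

```python
# Hypothetical sketch: keep only the (hyp, ref) pairs that every metric
# scored, so metrics with missing embeddings don't skew comparisons.
import pandas as pd


def filter_to_shared_pairs(score_dfs: dict[str, pd.DataFrame]) -> dict[str, pd.DataFrame]:
    # Intersect the (HYP, REF) pairs across all metrics' score tables.
    shared = None
    for df in score_dfs.values():
        pairs = set(zip(df["HYP"], df["REF"]))
        shared = pairs if shared is None else shared & pairs
    # Keep only rows whose pair survived the intersection.
    return {
        name: df[[pair in shared for pair in zip(df["HYP"], df["REF"])]]
        for name, df in score_dfs.items()
    }
```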

Reproducibility issues:

The branch in question: https://github.com/cleong110/pose-evaluation/tree/datasets_to_dataframes/
