Hi,
I have been trying to evaluate several models with the GEM-benchmark metrics. I followed the tutorial on both GitHub and the official website and was able to generate a submission file containing the generations and GEM-ID keys. However, when I generate the output scores, several of the scores I expected are missing.
For example, the requirements.txt file includes the rouge-score package, yet the output scores contain no ROUGE metric. I also tried several times to generate output scores with the --heavy-metric flag, but the heavy metrics are always skipped: the same metrics are returned whether or not I include the flag.
I attached an example of my output scores below:

An example of a generation is shown below:

More information:
- I cloned the repo into my Google Drive, cd'd into the directory, and pip installed both the regular requirements file and the heavy requirements file. I did this several times to make sure everything was installed.
- I also tried selecting the metrics manually with --metric-list, but that did not work either.
- I attempted to pip install gem_metrics, but this did not resolve the issue.
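
To narrow down which metrics are absent, a check along the following lines could be run against the scores file. This is only a minimal sketch: the helper name `missing_metrics`, the metric names, and the example payload are all assumptions for illustration, not the actual layout of the GEM output.

```python
import json


def missing_metrics(scores, expected):
    """Return the expected metric names that match no key in the scores dict.

    Matching is a case-insensitive substring test, so an expected name such as
    "rouge" is considered present if a key like "rouge1" or "rougeL" exists.
    """
    present = {key.lower() for key in scores}
    return [
        name
        for name in expected
        if not any(name.lower() in key for key in present)
    ]


# Hypothetical scores payload standing in for the real output file;
# the keys and placeholder values are assumptions, not real results.
example_scores = json.loads('{"bleu": 0.0, "meteor": 0.0, "predictions_file": "x"}')

# Metric names below are illustrative; replace them with the ones expected.
print(missing_metrics(example_scores, ["bleu", "rouge", "bertscore"]))
# prints ['rouge', 'bertscore']
```

With a real scores file, the payload would instead be read with `json.load` from the path passed as the output file.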
Could someone help uncover what I am doing wrong?
Kind regards