RougeScoreStep is defined in eval.py.
The configuration file, config.jsonnet, uses some more advanced Jsonnet features, such as std.foldl,
to generate the same step configuration for each of the 10 prompts.
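For readers less familiar with folds, the idea behind std.foldl can be sketched in Python with functools.reduce: start from an empty object and merge in the steps for one prompt at a time. This is only an illustration of the pattern, not the contents of config.jsonnet. The prompt names, step names, and step types below are hypothetical placeholders.

```python
from functools import reduce

# Hypothetical prompt names; the real config.jsonnet defines its own list.
PROMPTS = [f"prompt_{i}" for i in range(10)]


def add_prompt_steps(steps: dict, prompt: str) -> dict:
    """Return a copy of `steps` extended with the steps for one prompt."""
    return {
        **steps,
        # Step names, types, and fields are illustrative assumptions.
        f"generation_{prompt}": {"type": "generate", "prompt": prompt},
        f"rouge_{prompt}": {
            "type": "rouge_score",
            "input": {"type": "ref", "ref": f"generation_{prompt}"},
        },
    }


# Same shape as Jsonnet's std.foldl(func, arr, init): the accumulator
# starts as an empty dict and each prompt contributes its own steps.
config = {"steps": reduce(add_prompt_steps, PROMPTS, {})}

print(len(config["steps"]))  # two steps per prompt
```

Writing the per-prompt steps once and folding over the prompt list keeps the configuration short and ensures every prompt is evaluated the same way.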
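You can run the experiment with::

    tango run config.jsonnet -i eval -w /tmp/workspace

Here -i eval imports the eval module so that RougeScoreStep is registered, and -w sets the workspace directory where step results are stored.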