Open
Description
I spent a lot of time trying to get a custom metric working only to realize that it wasn't showing up in the output simply because https://github.com/LocalResearchGroup/llm-foundry/blob/a4bd9fc08f8dac482970d42a94ef2cdda2659a60/llmfoundry/command_utils/eval.py#L473-L474 looks for "Accuracy" in the metric name. Would it be possible to (1) develop a more robust way to recognize metrics, (2) treat everything in logger_keys
as a metric and record its name and value rather than assuming it is an "accuracy", or at least (3) make it more obvious why a metric without "Accuracy" in its name isn't showing up in the output?
I would be interested in developing a solution if we can align on the approach.
Metadata
Metadata
Assignees
Labels
No labels