TODO for the sotabench lib:
- remove the benchmark() function from benchmark.py
- move dependencies to requirements
- write evaluation.json only when a specific ENV variable is set; otherwise pretty-print the results (see the sketch after this list)
- for each benchmark, provide:
  - a benchmark() function
  - a default transform
  - the dataset
  - default parameters
- documentation:
  - dataset examples
  - a default transform example
  - the input fed to the model, and the expected output
  - links to examples of benchmarked models
- a library of transforms (maybe)
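A minimal sketch of how a per-benchmark `benchmark()` entry point could tie these pieces together, assuming a PyTorch-style setup. The env variable name (`SOTABENCH_SAVE_RESULTS`), the CIFAR10 dataset, the default transform/parameters, and the `evaluate()` helper are all illustrative placeholders, not the library's actual API:

```python
import json
import os
from pprint import pprint

import torch
import torchvision.transforms as transforms
from torch.utils.data import DataLoader
from torchvision.datasets import CIFAR10

# Default transform shipped with the benchmark (placeholder values).
DEFAULT_TRANSFORM = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.4914, 0.4822, 0.4465), (0.2023, 0.1994, 0.2010)),
])

# Default parameters shipped with the benchmark (placeholders).
DEFAULT_PARAMS = {"batch_size": 128, "num_workers": 4}


def evaluate(model, loader):
    """Top-1 accuracy over the loader (simple stand-in metric)."""
    model.eval()
    correct = total = 0
    with torch.no_grad():
        for images, targets in loader:
            preds = model(images).argmax(dim=1)
            correct += (preds == targets).sum().item()
            total += targets.numel()
    return {"Top 1 Accuracy": correct / total}


def benchmark(model, transform=DEFAULT_TRANSFORM, **params):
    """Evaluate `model` on this benchmark's dataset and report the results."""
    params = {**DEFAULT_PARAMS, **params}
    dataset = CIFAR10(root="./data", train=False, download=True, transform=transform)
    loader = DataLoader(dataset, batch_size=params["batch_size"],
                        num_workers=params["num_workers"])

    results = evaluate(model, loader)

    # Write evaluation.json only when the (hypothetical) env variable is set;
    # otherwise pretty-print the results for local runs.
    if os.environ.get("SOTABENCH_SAVE_RESULTS"):
        with open("evaluation.json", "w") as f:
            json.dump(results, f, indent=2)
    else:
        pprint(results)
    return results
```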
And additional requests:
- the BenchmarkResult return value should also contain: 1) the dataset used, 2) the transform used, 3) the input parameters used when invoking the function, 4) anything else relevant, so that it is a self-contained record of the results (see the sketch below)
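A rough sketch of what such a self-contained BenchmarkResult record could look like; the field names and `to_dict()` helper are illustrative assumptions, not the library's actual class:

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, Optional


@dataclass
class BenchmarkResult:
    task: str                                                   # e.g. "Image Classification"
    dataset: str                                                # 1) the dataset used
    transform: Optional[Callable] = None                        # 2) the transform used
    parameters: Dict[str, Any] = field(default_factory=dict)    # 3) input parameters
    results: Dict[str, float] = field(default_factory=dict)     # the computed metrics
    extra: Dict[str, Any] = field(default_factory=dict)         # 4) anything else

    def to_dict(self) -> Dict[str, Any]:
        """Serializable view of the record, e.g. for evaluation.json."""
        return {
            "task": self.task,
            "dataset": self.dataset,
            "transform": repr(self.transform),
            "parameters": self.parameters,
            "results": self.results,
            **self.extra,
        }
```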