What's Changed
- Update pearsonr tests by @elronbandel in #1890
- Return source_to_recipe to performance evaluation, now that the 403 is fixed (by bnayahu) by @dafnapension in #1891
- Remove a card whose preprocess_steps do not match the contents of the loaded dataset by @dafnapension in #1893
- Fix an ineffective setting of the max size of loader_cache by @dafnapension in #1892
- Fix compatibility with datasets 4.0 by @elronbandel in #1861
- Improve speed in mmlu global by @elronbandel in #1895
- Remove the need for datasets<4.0.0 by @elronbandel in #1897
- Refresh README by @elronbandel in #1898
- Update README by @elronbandel in #1899
- Update README by @elronbandel in #1900
- Update README by @elronbandel in #1901
- Fix docs and example of how to use benchmark by @elronbandel in #1903
- Refine condition for avoiding the Benchmark wrapper by @bnayahu in #1904
- Complete transition to datasets 4.0.0 in preparation tests by @dafnapension in #1902
- Make sacrebleu faster and more efficient by @elronbandel in #1906
- Implement LogProbEngine on CrossInference and add more Granite Guardian models by @martinscooper in #1905
- Remove IBM GenAI support and move legacy GenAI metrics to use CrossProviderInferenceEngine by @yoavkatz in #1508
- GPT on RITS and minor LLM judge criteria changes by @martinscooper in #1909
- The special installation of networkx can be removed as well by @dafnapension in #1908
- Update version to 1.26.6 by @elronbandel in #1911
Full Changelog: 1.26.5...1.26.6