Commit 0a37981

chore(main): release 0.2.13 (#172)
🤖 I have created a release *beep* *boop*

---

## [0.2.13](v0.2.12...v0.2.13) (2026-02-26)

### Features

* add Global MMLU task ([#174](#174)) ([0d0b227](0d0b227))
* add GoldenSwag task ([#175](#175)) ([a05e032](a05e032))
* add tasks from the OLMES evaluation suite ([#180](#180)) ([54f295d](54f295d))
* adding aggregated results with errors, if error free ration is &lt; 1.0 ([#181](#181)) ([6f3e639](6f3e639))
* BalancedCOPA dataset ([#177](#177)) ([25161aa](25161aa))
* Change to more complete revision of ZeroScrolls dataset ([#171](#171)) ([a4e117e](a4e117e))
* COPA uses appropriate dataset splits ([#176](#176)) ([55ebe44](55ebe44))

### Bug Fixes

* Change to more complete revision of zeroscrolls ([#173](#173)) ([a84286e](a84286e))
* Flores200 data reading issue ([#179](#179)) ([9bf3155](9bf3155))

### Documentation

* updated with info for release-please ([#162](#162)) ([cf38766](cf38766))

---

This PR was generated with [Release Please](https://github.com/googleapis/release-please). See [documentation](https://github.com/googleapis/release-please#release-please).

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Philipp Siedler <p.d.siedler@gmail.com>
1 parent 69b2dbe commit 0a37981

4 files changed: 27 additions, 3 deletions

.release-please-manifest.json

Lines changed: 1 addition & 1 deletion
@@ -1,3 +1,3 @@
 {
-".": "0.2.12"
+".": "0.2.13"
 }

CHANGELOG.md

Lines changed: 24 additions & 0 deletions
@@ -12,6 +12,30 @@

 ### Bug Fixes

+## [0.2.13](https://github.com/Aleph-Alpha-Research/eval-framework/compare/v0.2.12...v0.2.13) (2026-02-26)
+
+
+### Features
+
+* add Global MMLU task ([#174](https://github.com/Aleph-Alpha-Research/eval-framework/issues/174)) ([0d0b227](https://github.com/Aleph-Alpha-Research/eval-framework/commit/0d0b22789b7817e120831cf688f0dd2aca84c1d8))
+* add GoldenSwag task ([#175](https://github.com/Aleph-Alpha-Research/eval-framework/issues/175)) ([a05e032](https://github.com/Aleph-Alpha-Research/eval-framework/commit/a05e0325e09c2ea0e5bf20284fff4428c7d126ab))
+* add tasks from the OLMES evaluation suite ([#180](https://github.com/Aleph-Alpha-Research/eval-framework/issues/180)) ([54f295d](https://github.com/Aleph-Alpha-Research/eval-framework/commit/54f295d7d82e71ba80d34b8f6758efc29bf27dd0))
+* adding aggregated results with errors, if error free ration is &lt; 1.0 ([#181](https://github.com/Aleph-Alpha-Research/eval-framework/issues/181)) ([6f3e639](https://github.com/Aleph-Alpha-Research/eval-framework/commit/6f3e6397f65fa7be45bbcb6ff248cc2f8097f5fb))
+* BalancedCOPA dataset ([#177](https://github.com/Aleph-Alpha-Research/eval-framework/issues/177)) ([25161aa](https://github.com/Aleph-Alpha-Research/eval-framework/commit/25161aaab9acbc549997227cefa181414a368799))
+* Change to more complete revision of ZeroScrolls dataset ([#171](https://github.com/Aleph-Alpha-Research/eval-framework/issues/171)) ([a4e117e](https://github.com/Aleph-Alpha-Research/eval-framework/commit/a4e117eaf4c4fc3ad8bfbffb9b5aaf737ed78dbe))
+* COPA uses appropriate dataset splits ([#176](https://github.com/Aleph-Alpha-Research/eval-framework/issues/176)) ([55ebe44](https://github.com/Aleph-Alpha-Research/eval-framework/commit/55ebe446789e47e834f03bb62d49a3095c692026))
+
+
+### Bug Fixes
+
+* Change to more complete revision of zeroscrolls ([#173](https://github.com/Aleph-Alpha-Research/eval-framework/issues/173)) ([a84286e](https://github.com/Aleph-Alpha-Research/eval-framework/commit/a84286ea0f1d446b548087eb306ffbaeb06bd0e6))
+* Flores200 data reading issue ([#179](https://github.com/Aleph-Alpha-Research/eval-framework/issues/179)) ([9bf3155](https://github.com/Aleph-Alpha-Research/eval-framework/commit/9bf31551cce821fccf229e936aa8beb79046fcc7))
+
+
+### Documentation
+
+* updated with info for release-please ([#162](https://github.com/Aleph-Alpha-Research/eval-framework/issues/162)) ([cf38766](https://github.com/Aleph-Alpha-Research/eval-framework/commit/cf3876635af004102badb935360efbf840087824))
+
 ## [0.2.12](https://github.com/Aleph-Alpha-Research/eval-framework/compare/v0.2.11...v0.2.12) (2026-02-04)


pyproject.toml

Lines changed: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
 [project]
 name = "eval-framework"
-version = "0.2.12"
+version = "0.2.13"
 description = "Evalulation Framework"
 readme = "README.md"
 license = { file = "LICENSE" }

uv.lock

Lines changed: 1 addition & 1 deletion
Generated file; diff not rendered.
