opencompass/datasets/subjective
@@ -33,10 +33,8 @@
     [],
 )
 
-# The judge post-processing for MTBench101 / WildBench depends on fixed formats ([[score]], "choice": "A++", etc.).
-# mock --type choice returns only "A", which cannot be parsed → empty references / ZeroDivisionError.
-# datasets += mtbench101_datasets
-# datasets += wildbench_datasets
+datasets += mtbench101_datasets
+datasets += wildbench_datasets
 
 eval = dict(
     partitioner=dict(type=SubjectiveNaivePartitioner,
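The comment removed in this hunk describes a failure mode: when the judge reply carries no `[[score]]` marker (for example a mock judge that answers only "A"), nothing is parsed and the later averaging divides by zero. The following is a minimal sketch of that mechanism, not OpenCompass's actual post-processor. The names `extract_score` and `mean_score` and the regular expression are assumptions made for illustration.

```python
import re

# Assumed pattern for a judge reply such as "Rating: [[7]]" (hypothetical).
SCORE_RE = re.compile(r"\[\[(\d+(?:\.\d+)?)\]\]")


def extract_score(judge_output):
    """Return the rating as a float, or None if no [[score]] marker exists."""
    match = SCORE_RE.search(judge_output)
    return float(match.group(1)) if match else None


def mean_score(judge_outputs):
    """Average the parsed scores, deliberately without a guard."""
    scores = [s for s in map(extract_score, judge_outputs) if s is not None]
    # Raises ZeroDivisionError when no reply could be parsed.
    return sum(scores) / len(scores)
```

Calling `mean_score(["A", "A"])` reproduces the error the comment mentions, because both replies parse to `None` and the score list ends up empty.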
@@ -188,6 +188,14 @@ def arenahard_postprocess(
         references,
     )
 
+    if battles.empty or 'model_a' not in battles.columns:
+        return {
+            'warning':
+            'no valid arena-hard judgements (expect [[A>B]] etc. in judge output)',
+            'score': 0,
+            'details': output,
+        }
+
     bootstrap_online_elo = compute_mle_elo(battles)
 
     np.random.seed(42)
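The guard added in this hunk returns early with a warning when no judgement could be parsed, instead of passing an empty table to the Elo computation. The same pattern can be sketched in plain Python without pandas. This is an illustration under stated assumptions: `parse_verdict` and `summarize` are hypothetical helpers, the verdict labels are inferred from the `[[A>B]]` hint in the warning text, and the win rate is a stand-in for `compute_mle_elo`, not a reimplementation of it.

```python
import re

# Assumed verdict format, inferred from the "[[A>B]]" hint in the warning.
VERDICT_RE = re.compile(r"\[\[(A(?:>>|>|=|<<|<)B)\]\]")


def parse_verdict(judge_output):
    """Return a verdict label such as 'A>B', or None if none is present."""
    match = VERDICT_RE.search(judge_output)
    return match.group(1) if match else None


def summarize(judge_outputs):
    """Score model A's win rate, returning early when nothing parses."""
    verdicts = [v for v in map(parse_verdict, judge_outputs) if v is not None]
    if not verdicts:
        # Mirrors the guard in the diff: warn and return a zero score.
        return {
            'warning': 'no valid judgements',
            'score': 0,
            'details': list(judge_outputs),
        }
    wins = sum(v in ('A>B', 'A>>B') for v in verdicts)
    return {'score': 100 * wins / len(verdicts), 'details': verdicts}
```

With this shape, a mock judge that answers only "A" produces a result carrying a `warning` key and a score of 0 rather than an exception further down the pipeline.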