Commit 54b4453

PauBadiaM and claude committed
Fix benchmark test: use "class" groupby for source-level eval
The index-alignment fix in _tensor_truth exposed that metrics5 was accidentally passing due to misaligned ground truth.

With groupby="group", every source within each group has homogeneous perturbation (all 1s or all 0s), producing no metrics. Using "class" (which mixes groups A and B) ensures heterogeneous ground truth for meaningful source-level evaluation.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
1 parent a064295 commit 54b4453

1 file changed

Lines changed: 1 addition & 1 deletion

tests/bm/test_benchmark.py

@@ -13,7 +13,7 @@
 [["auc"], None, "expr", False, 0.05, 5, False],
 [["auc", "fscore"], "group", "expr", False, 0.05, 5, False],
 [["auc", "fscore", "qrank"], None, "source", False, 0.05, 2, False],
-[["auc", "fscore", "qrank"], "group", "source", False, 0.05, 1, False],
+[["auc", "fscore", "qrank"], "class", "source", False, 0.05, 1, False],
 [["auc", "fscore", "qrank"], "bm_group", "expr", True, 0.05, 5, False],
 [["auc", "fscore", "qrank"], "source", "expr", True, 0.05, 5, False],
 ],
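
The reasoning in the commit message can be illustrated with a small toy sketch. It is not the project's code and does not reproduce `_tensor_truth`. The rows, the `class` values `X` and `Y`, the sources `s1` and `s2`, and the helper `evaluable_sources` are all hypothetical. It also assumes that a source can only be scored when its ground truth contains both labels, as is the case for AUC.

```python
from collections import defaultdict


def evaluable_sources(rows, groupby):
    """Return, per groupby value, the sources whose ground truth has both labels.

    Hypothetical helper for illustration only; not part of the benchmark code.
    """
    # labels[groupby value][source] -> set of ground-truth labels seen
    labels = defaultdict(lambda: defaultdict(set))
    for row in rows:
        labels[row[groupby]][row["source"]].add(row["truth"])
    return {
        key: sorted(src for src, seen in per_source.items() if len(seen) > 1)
        for key, per_source in labels.items()
    }


# Toy data (assumed): group A is fully perturbed (1), group B is not (0),
# and each class X / Y contains rows from both groups.
rows = [
    {"group": g, "class": c, "source": s, "truth": 1 if g == "A" else 0}
    for g in ("A", "B")
    for c in ("X", "Y")
    for s in ("s1", "s2")
]

by_group = evaluable_sources(rows, "group")
by_class = evaluable_sources(rows, "class")

print(by_group)  # no source has mixed labels within a single group
print(by_class)  # every source has mixed labels within each class
```

In this toy setup, grouping by `group` leaves every source with a single label, so nothing can be scored, while grouping by `class` gives each source both labels.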
