Commit 76c865c
feat(byob): add explicit few-shot dataset support (#993)
## Summary
Adds explicit BYOB few-shot controls for benchmarks where a split-only
rewrite is not enough.
- Add `fewshot_dataset` to `@benchmark` so tasks can provide an exact
few-shot dataset URI/path, including filters, configs, `data_files`, and
other query params.
- Add `fewshot_prefix` to prepend static text before rendered few-shot
examples.
- Fix `--num-fewshot 0` so it overrides non-zero benchmark defaults and
enables true 0-shot validation.
- Save BYOB predictions by default from generated command templates.
- Expose `fewshot_dataset` and `fewshot_prefix` in generated FDF
`config.params.extra.dataset`.
- Update docs and tests for precedence, fallback, prefix rendering, and
explicit 0-shot behavior.
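
As an illustration of the 0-shot fix and prefix rendering described above, here is a minimal sketch. The helper names `resolve_num_fewshot` and `render_fewshot_block` are hypothetical and are not taken from the BYOB source; the separator and the choice to omit the prefix when there are no examples are assumptions as well.

```python
from typing import Optional, Sequence


def resolve_num_fewshot(cli_value: Optional[int], benchmark_default: int) -> int:
    """Pick the effective shot count (hypothetical helper, not the BYOB API)."""
    # A truthiness check such as `cli_value or benchmark_default` would treat
    # an explicit 0 as "not provided" and fall back to the benchmark default.
    # Comparing against None lets `--num-fewshot 0` win.
    if cli_value is not None:
        return cli_value
    return benchmark_default


def render_fewshot_block(
    prefix: str, examples: Sequence[str], separator: str = "\n\n"
) -> str:
    """Prepend static prefix text to already-rendered few-shot examples."""
    # Assumption: with no examples (0-shot) the prefix is omitted as well.
    if not examples:
        return ""
    return prefix + separator.join(examples)
```

The key design point is the `is not None` comparison: it distinguishes "flag not passed" from "flag passed as 0", which a plain truthiness test cannot do.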
## Why
`fewshot_split` only works for simple datasets where changing `split` is
enough. Some datasets also require `filter_field` / `filter_value`,
`data_files`, configs, or other URI parameters. Reconstructing those
generically is fragile and can produce mixed-language or wrong-source
few-shot examples.
This change lets benchmark authors provide the exact few-shot source
when needed while preserving existing `fewshot_split` behavior.
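
The precedence described here could be sketched roughly as follows. This is not the actual BYOB implementation: `resolve_fewshot_source` is a hypothetical helper, the `hf://` URI shape in the usage note is illustrative, and returning `None` when neither option is set is an assumption.

```python
from typing import Optional
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def resolve_fewshot_source(
    dataset: str,
    fewshot_dataset: Optional[str] = None,
    fewshot_split: Optional[str] = None,
) -> Optional[str]:
    """Return the URI to draw few-shot examples from (hypothetical helper)."""
    # 1. An explicit few-shot dataset is used verbatim, so its filters,
    #    configs, and data_files are exactly what the author wrote.
    if fewshot_dataset:
        return fewshot_dataset
    # 2. Otherwise fall back to the split-only rewrite: copy the evaluation
    #    URI and change only the `split` query parameter.
    if fewshot_split:
        parts = urlsplit(dataset)
        query = dict(parse_qsl(parts.query))
        query["split"] = fewshot_split
        return urlunsplit(parts._replace(query=urlencode(query)))
    # 3. Assumption: no dedicated few-shot source is configured.
    return None
```

With an illustrative URI such as `hf://org/name?split=test&filter_field=lang&filter_value=en`, the split-only path carries the evaluation filter over unchanged. That is the fragile case: if the few-shot data needs a different filter or file, only the explicit `fewshot_dataset` path can express it.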
## Test Plan
```bash
cd packages/nemo-evaluator
uv run python -m pytest \
tests/unit_tests/byob/test_byob_decorators.py::TestBenchmarkLogprobFields \
tests/unit_tests/byob/test_byob_eval_logic.py::TestFewshotPrefix \
tests/unit_tests/byob/test_byob_eval_logic.py::TestBuildFewshotExamples \
tests/unit_tests/byob/test_byob_compiler.py::TestBuildFdfHelper::test_fdf_groups_dataset_config_under_extra_dataset \
tests/unit_tests/byob/test_byob_runner.py::TestFewshotOverride
```
Signed-off-by: kanishks <kanishks@nvidia.com>

1 parent 231526c, commit 76c865c
10 files changed
Lines changed: 430 additions & 57 deletions
File tree
- docs/libraries/nemo-evaluator/extending/byob
- packages/nemo-evaluator
- src/nemo_evaluator/contrib/byob
- tests/unit_tests/byob
Lines changed: 3 additions & 1 deletion
Lines changed: 87 additions & 15 deletions
Lines changed: 8 additions & 0 deletions
Lines changed: 16 additions & 3 deletions
Lines changed: 38 additions & 27 deletions
Lines changed: 12 additions & 10 deletions