Open
Description
Describe the bug
Some bigbench tasks fail with a ValueError or KeyError because their evaluation_splits are incorrectly set to ["validation"] or ["test"] for subsets that only have a default or train split.
To Reproduce
from lighteval.pipeline import Pipeline

# pipeline_params, evaluation_tracker and model_config are defined earlier in the notebook (not shown here).
task = "bigbench:mult_data_wrangling|5"
pipeline = Pipeline(
    tasks=task,
    pipeline_parameters=pipeline_params,
    evaluation_tracker=evaluation_tracker,
    model_config=model_config,
)
pipeline.evaluate()
pipeline.save_and_push_results()
pipeline.show_results()
---------------------------------------------------------------------------
KeyError Traceback (most recent call last)
Cell In[3], line 29
18 # model_config = LiteLLMModelConfig(
19 # model_name="openai/gpt-4.1",
20 # base_url="https://api.openai.com/v1",
(...) 24 # )
25 # )
27 task = "bigbench:mult_data_wrangling|5"
---> 29 pipeline = Pipeline(
30 tasks=task,
31 pipeline_parameters=pipeline_params,
32 evaluation_tracker=evaluation_tracker,
33 model_config=model_config,
34 )
36 pipeline.evaluate()
37 pipeline.save_and_push_results()
File ~/Code/Evalhub/backend/lighteval/src/lighteval/pipeline.py:142, in Pipeline.__init__(self, tasks, pipeline_parameters, evaluation_tracker, model_config, model, metric_options)
140 # We init tasks first to fail fast if one is badly defined
141 self._init_random_seeds()
--> 142 self._init_tasks_and_requests(tasks=tasks)
144 self.model_config = model_config
145 self.accelerator, self.parallel_context = self._init_parallelism_manager()
...
88 available_suggested_splits = [
89 split for split in (Split.TRAIN, Split.TEST, Split.VALIDATION) if split in self
90 ]
KeyError: 'default'
Expected behavior
The configuration should point to existing splits in the Hugging Face dataset.
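One way to check a task configuration against the dataset is sketched below. This is only an illustration, not how lighteval validates tasks: `missing_splits` and `check_subset` are hypothetical helper names, and the dataset repository id and subset name are left as parameters because they depend on the task definition. The only library call used is `datasets.get_dataset_split_names`, which needs network access.

```python
from typing import Iterable, List


def missing_splits(configured: Iterable[str], available: Iterable[str]) -> List[str]:
    """Return the configured splits that the dataset does not provide."""
    available_set = set(available)
    return [split for split in configured if split not in available_set]


def check_subset(repo_id: str, subset: str, configured: Iterable[str]) -> List[str]:
    """Fetch the real split names from the Hub and report configured ones that are absent.

    Hypothetical helper: requires the `datasets` package and network access.
    """
    # Imported lazily so the pure helper above works without `datasets` installed.
    from datasets import get_dataset_split_names

    available = get_dataset_split_names(repo_id, config_name=subset)
    return missing_splits(configured, available)


# Offline illustration with made-up split lists (no Hub access needed).
print(missing_splits(["validation"], ["default", "train"]))  # prints ['validation']
```

A non-empty result from `check_subset` for a given subset would mean its evaluation_splits name a split that does not exist on the Hub.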
Version info
- OS: macOS
- Lighteval version: main (local development)