Description
Currently, `evaluation.yaml` exists under the `configs/` directory. To start, we wanted to showcase this recipe as an example, but it is a core part of the finetuning process and therefore should mirror the pattern we've established for other configs, which reside under model-specific directories.
The change for each model directory consists of four steps:

1. Copy `evaluation.yaml` into whichever model directory you are focused on.
2. Update the defaults from `llama2` to the current model's defaults.
3. Update `_recipe_registry.py` to make sure the new YAML file can be found with the following command: `tune run eleuther_eval --config MODEL/evaluation`
4. Put up a PR with output from running the evaluation script. Here's an example for Qwen2: Add evaluation configs under qwen2 dir #1809
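Steps 1 and 2 can be sketched as a copy-and-substitute. The snippet below runs against a scratch directory so it is self-contained; the paths (`recipes/configs/...`) and component names (`torchtune.models.qwen2.qwen2_7b`) are illustrative assumptions, not the exact ones in the repo:

```shell
set -eu
mkdir -p scratch/recipes/configs/qwen2

# Stand-in for the existing top-level eval config.
cat > scratch/recipes/configs/evaluation.yaml <<'EOF'
model:
  _component_: torchtune.models.llama2.llama2_7b
EOF

# Step 1 + step 2: copy the config under the model directory and swap the
# llama2 defaults for the new model's builder in one pass.
sed 's/llama2\.llama2_7b/qwen2.qwen2_7b/' \
  scratch/recipes/configs/evaluation.yaml \
  > scratch/recipes/configs/qwen2/evaluation.yaml

cat scratch/recipes/configs/qwen2/evaluation.yaml
```

In practice you would then hand-check the copied file for any other model-specific defaults (tokenizer, checkpoint paths) rather than relying on a single substitution.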
If multiple sizes of a model exist in the directory, select the most commonly used one. This is certainly up for interpretation, but ~7B params is typically standard. We want to give a good example, but there's no need to proliferate configs for every model SIZE.
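For reference, a per-model `evaluation.yaml` might end up looking roughly like the fragment below, with its defaults pointing at that directory's ~7B builder. The field names follow the general torchtune config pattern but are assumptions, not the exact schema:

```yaml
# Illustrative shape of a per-model evaluation config (names assumed).
model:
  _component_: torchtune.models.qwen2.qwen2_7b  # the directory's ~7B builder

tokenizer:
  _component_: torchtune.models.qwen2.qwen2_tokenizer
  path: /tmp/Qwen2-7B/tokenizer.json  # placeholder path

# EleutherAI eval-harness settings
tasks: ["truthfulqa_mc2"]
limit: null
batch_size: 8
```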
Checklist:
- [ ] Llama2
- [x] Code-Llama2 (Add evaluation file for code_llama2 model #2209, thanks @ReemaAlzaid)
- [ ] Llama3
- [ ] Llama3.1
- [x] Llama3.2 (Llama3.2 3B eval #2186, thanks @ReemaAlzaid)
- [ ] Llama3.2V
- [x] Mistral (1810 Move mistral evaluation #1829, thanks @Yousof-kayal)
- [x] Phi3 (1810 Add evaluation configs under phi3 dir #1822, thanks @Harthi7)
- [x] Gemma (1810 move gemma evaluation #1819, thanks @malinjawi)
- [ ] Gemma2
- [x] Qwen2
- [x] Qwen2.5 (Add eval config for QWEN2_5 model using 0.5B variant #2230, thanks @Ankur-singh)
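Step 3 above amounts to adding the new config path to the eval recipe's entry in `_recipe_registry.py`. A minimal sketch of the pattern, using stand-in dataclasses rather than torchtune's actual definitions (which may differ in detail):

```python
from dataclasses import dataclass, field
from typing import List

# Stand-ins for torchtune's registry types; the real definitions live in
# torchtune/_recipe_registry.py.
@dataclass
class Config:
    name: str       # what users pass to --config
    file_path: str  # YAML location relative to the configs directory

@dataclass
class Recipe:
    name: str
    file_path: str
    configs: List[Config] = field(default_factory=list)

eleuther_eval = Recipe(
    name="eleuther_eval",
    file_path="eleuther_eval.py",
    configs=[
        # The new per-model entry, so that
        # `tune run eleuther_eval --config qwen2/evaluation`
        # resolves (names assumed for illustration):
        Config(name="qwen2/evaluation", file_path="qwen2/evaluation.yaml"),
    ],
)

print([c.name for c in eleuther_eval.configs])
```

The important part is that the `name` matches what users pass after `--config`, and the `file_path` matches where the YAML was copied in step 1.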
After all of these are completed, we will deprecate the `evaluation.yaml` configs in the base `configs/` directory.
Thanks, everyone, for your help! 🎉