Hi,
Thank you for your great work. I am using your Docker image (docker://ghcr.io/allenai/olmo-core:tch270cu128-v2.1.0) to launch a pretraining run for the 7B model (src/scripts/official/OLMo3/OLMo-3-1025-7B-pretrain-1.py). However, the run fails at the downstream evaluation step with an error saying that the tasks do not exist. I checked the output of list_tasks in the olmo_eval package, and none of the tasks I need are listed there.
Could this be a configuration issue?
Below is my code for checking task availability:
from olmo_eval import list_tasks
all_tasks = list_tasks()
print(f"Total tasks available: {len(all_tasks)}\n")
needed_tasks = [
"arc_challenge_test_bpb_5shot",
"arc_challenge_test_mc_5shot_fast",
"arc_easy_test_bpb_5shot",
"arc_easy_test_mc_5shot_fast",
"hellaswag_bpb_5shot",
"mmlu_humanities_test_bpb_5shot",
"mmlu_humanities_test_mc_5shot_fast",
"mmlu_other_test_bpb_5shot",
"mmlu_other_test_mc_5shot_fast",
"mmlu_social_sciences_test_bpb_5shot",
"mmlu_social_sciences_test_mc_5shot_fast",
"mmlu_stem_test_bpb_5shot",
"mmlu_stem_test_mc_5shot_fast",
# Basic Skills
"basic_skills_arithmetic_rc_5shot",
"basic_skills_coding_rc_5shot",
"basic_skills_common_knowledge_rc_5shot",
"basic_skills_logical_reasoning_rc_5shot",
"basic_skills_pattern_rc_5shot",
"basic_skills_string_operations_rc_5shot",
# Gen tasks BPB
"codex_humaneval_gold_bpb_3shot",
"codex_mbpp_gold_bpb_3shot",
"minerva_math_500_gold_bpb_0shot",
"mt_mbpp_cpp_gold_bpb_3shot",
"mt_mbpp_java_gold_bpb_3shot",
"mt_mbpp_rust_gold_bpb_3shot",
# Sanity check for MCQA ability
"copycolors_10way_fast",
]
print("Checking for needed tasks:")
for task in needed_tasks:
    status = "✓ FOUND" if task in all_tasks else "✗ MISSING"
    print(f"  {status}: {task}")
Total tasks available: 167
Checking for needed tasks:
✗ MISSING: arc_challenge_test_bpb_5shot
✗ MISSING: arc_challenge_test_mc_5shot_fast
✗ MISSING: arc_easy_test_bpb_5shot
✗ MISSING: arc_easy_test_mc_5shot_fast
✗ MISSING: hellaswag_bpb_5shot
✗ MISSING: mmlu_humanities_test_bpb_5shot
✗ MISSING: mmlu_humanities_test_mc_5shot_fast
✗ MISSING: mmlu_other_test_bpb_5shot
✗ MISSING: mmlu_other_test_mc_5shot_fast
✗ MISSING: mmlu_social_sciences_test_bpb_5shot
✗ MISSING: mmlu_social_sciences_test_mc_5shot_fast
✗ MISSING: mmlu_stem_test_bpb_5shot
✗ MISSING: mmlu_stem_test_mc_5shot_fast
✗ MISSING: basic_skills_arithmetic_rc_5shot
✗ MISSING: basic_skills_coding_rc_5shot
✗ MISSING: basic_skills_common_knowledge_rc_5shot
✗ MISSING: basic_skills_logical_reasoning_rc_5shot
✗ MISSING: basic_skills_pattern_rc_5shot
✗ MISSING: basic_skills_string_operations_rc_5shot
✗ MISSING: codex_humaneval_gold_bpb_3shot
✗ MISSING: codex_mbpp_gold_bpb_3shot
✗ MISSING: minerva_math_500_gold_bpb_0shot
✗ MISSING: mt_mbpp_cpp_gold_bpb_3shot
✗ MISSING: mt_mbpp_java_gold_bpb_3shot
✗ MISSING: mt_mbpp_rust_gold_bpb_3shot
✗ MISSING: copycolors_10way_fast
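Since 167 tasks are registered but none of mine match, one further check that might narrow this down is to look for near-miss names and to print the installed package version. The following is only a sketch: the helper names suggest_close_tasks and installed_version are my own, and the distribution name "ai2-olmo-eval" is an assumption about how olmo_eval is packaged in the image, so it may need adjusting.

```python
import difflib
from importlib import metadata


def suggest_close_tasks(needed, available, n=3, cutoff=0.6):
    """Map each missing task name to its closest registered names (hypothetical helper)."""
    available = list(available)
    return {
        task: difflib.get_close_matches(task, available, n=n, cutoff=cutoff)
        for task in needed
        if task not in available
    }


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if it is not installed."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Assumption: the olmo_eval module is shipped by the "ai2-olmo-eval" distribution.
    print("olmo_eval version:", installed_version("ai2-olmo-eval"))
    try:
        from olmo_eval import list_tasks
    except ImportError:
        print("olmo_eval is not importable in this environment")
    else:
        # Two names from the list above; extend with the full needed_tasks list.
        needed = ["arc_challenge_test_bpb_5shot", "copycolors_10way_fast"]
        for task, close in suggest_close_tasks(needed, list_tasks()).items():
            print(f"{task}: closest registered names -> {close}")
```

If the closest names differ only by a suffix or shot count, the script and the installed olmo_eval are probably from different versions; if nothing close shows up at all, the tasks may simply not exist in the version bundled with the image.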