I tried running the Qwen2.5-Math models instead of the Qwen2.5 models, but the evaluation scripts fail for Qwen-Math with the following error:
```
ValueError: User-specified max_model_len (32768) is greater than the derived max_model_len (max_position_embeddings=4096 or model_max_length=None in model's config.json). This may lead to incorrect model outputs or CUDA errors. To allow overriding this maximum, set the env var VLLM_ALLOW_LONG_MAX_MODEL_LEN=1
```
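From the message, it looks like the Qwen2.5-Math models ship with `max_position_embeddings=4096` in their `config.json`, so the 32768-token context requested by the script exceeds what the model supports. The error itself suggests one workaround, which I assume (untested) would be applied like this, although vLLM warns it may lead to incorrect outputs or CUDA errors:

```shell
# Workaround named in the error message itself: let vLLM exceed the
# model's declared max_position_embeddings. Per vLLM's own warning, this
# may produce incorrect outputs or CUDA errors past 4096 tokens.
export VLLM_ALLOW_LONG_MAX_MODEL_LEN=1
```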
Here is the script I used:

```shell
MODEL=Qwen/Qwen2.5-Math-1.5B-Instruct
# MODEL=Qwen/Qwen2.5-Math-1.5B   # the base model fails the same way
NUM_GPUS=1
MODEL_ARGS="pretrained=$MODEL,dtype=float16,data_parallel_size=$NUM_GPUS,max_model_length=32768,gpu_memory_utilisation=0.8"
TASK=math_500
OUTPUT_DIR=data/evals/$MODEL

CUDA_VISIBLE_DEVICES=1 lighteval vllm $MODEL_ARGS "custom|$TASK|0|0" \
    --custom-tasks src/open_r1/evaluate.py \
    --use-chat-template \
    --system-prompt="Please reason step by step, and put your final answer within \boxed{}." \
    --output-dir $OUTPUT_DIR
```
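Alternatively, I assume I could cap the requested context at the model's native limit, e.g. (sketch, all other arguments unchanged):

```shell
# Sketch: request only the 4096 tokens the model's config.json declares,
# instead of 32768, so vLLM's length check passes without the override.
MODEL_ARGS="pretrained=$MODEL,dtype=float16,data_parallel_size=$NUM_GPUS,max_model_length=4096,gpu_memory_utilisation=0.8"
```

But I'm not sure which of these is the intended setup for the Qwen-Math models.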
Hope you can help me out.