Description
File "/home/zhangkailin/.conda/envs/verl3/lib/python3.12/site-packages/ray/_private/worker.py", line 968, in get_objects
[2026-03-11 17:47:54] raise value.as_instanceof_cause()
[2026-03-11 17:47:54] ray.exceptions.RayTaskError(NotImplementedError): ray::TaskRunner.run() (pid=2991687, ip=10.21.175.69, actor_id=aa12428bb212c26e78160ab001000000, repr=<main_ppo.TaskRunner object at 0x7f89aca2aba0>)
[2026-03-11 17:47:54] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[2026-03-11 17:47:54] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[2026-03-11 17:47:54] File "/appdata/zhangkailin/verl/verl/trainer/main_ppo.py", line 317, in run
[2026-03-11 17:47:54] trainer.fit()
[2026-03-11 17:47:54] File "/appdata/zhangkailin/verl/verl/trainer/ppo/ray_trainer.py", line 947, in fit
[2026-03-11 17:47:54] val_metrics = self._validate()
[2026-03-11 17:47:54] ^^^^^^^^^^^^^^^^
[2026-03-11 17:47:54] File "/appdata/zhangkailin/verl/verl/trainer/ppo/ray_trainer.py", line 598, in _validate
[2026-03-11 17:47:54] result = self.val_reward_fn(test_batch, return_dict=True)
[2026-03-11 17:47:54] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[2026-03-11 17:47:54] File "/appdata/zhangkailin/verl/verl/workers/reward_manager/naive.py", line 89, in __call__
[2026-03-11 17:47:54] score = self.compute_score(
[2026-03-11 17:47:54] ^^^^^^^^^^^^^^^^^^^
[2026-03-11 17:47:54] File "/appdata/zhangkailin/verl/verl/utils/reward_score/__init__.py", line 107, in default_compute_score
[2026-03-11 17:47:54] raise NotImplementedError(f"Reward function is not implemented for {data_source=}")
NotImplementedError: Reward function is not implemented for data_source='openai/gsm8k'
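
The traceback ends in `default_compute_score`, which has no branch for `data_source='openai/gsm8k'` in this checkout. One possible workaround is to supply a scoring function of your own. The sketch below is only an illustration, not verl's implementation: the signature `(data_source, solution_str, ground_truth, extra_info=None)`, the helper `extract_answer`, and the `#### <number>` answer convention are all assumptions. How to register the function with the trainer (for example through a custom reward function setting in the config) should be checked against your verl version.

```python
import re

# Assumption: the final answer follows a "####" marker, as in GSM8K reference solutions.
_ANSWER_RE = re.compile(r"####\s*(-?[\d,]*\.?\d+)")


def extract_answer(text):
    """Hypothetical helper: return the last '#### <number>' value in text, or None."""
    matches = _ANSWER_RE.findall(text)
    if not matches:
        return None
    # Drop thousands separators so "1,234" compares equal to "1234".
    return matches[-1].replace(",", "")


def compute_score(data_source, solution_str, ground_truth, extra_info=None):
    """Return 1.0 for an exact match with ground_truth, otherwise 0.0.

    The signature is an assumption about what the reward manager passes in.
    """
    if data_source != "openai/gsm8k":
        # Mirror the error from the traceback for data sources this sketch does not cover.
        raise NotImplementedError(
            f"Reward function is not implemented for {data_source=}"
        )
    answer = extract_answer(solution_str)
    if answer is None:
        return 0.0
    return 1.0 if answer == str(ground_truth).strip() else 0.0
```

Exact string matching is deliberately strict here: a response that never emits the `####` marker scores 0.0 rather than being guessed at.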