1 parent 767d6b7 commit 43a9ce2
README.md
@@ -122,6 +122,7 @@ python utils/split_dataset.py \
 Generate step-by-step outputs of PRM

 ```bash
+# example of processbench
 python prm_evaluation/prm_evaluate.py \
     --reward_name_or_path "GenPRM/GenPRM-7B"\
     --data_path "_data/split_input/ProcessBench"\
@@ -137,8 +138,8 @@ Execute policy refinement based on GenPRM's split output
 python prm_evaluation/policy_refine.py \
     --model_path "Qwen/Qwen2.5-7B-Instruct" \
-    --data_path "_output/split_output/ProcessBench"\
-    --split_out "_output/split_refine/ProcessBench"
+    --data_path "_output/split_output/..."\
+    --split_out "_output/split_refine/..."
 ```