Commit cc80f47

Update README.md
1 parent 435403b commit cc80f47

1 file changed: +12 -0 lines changed

README.md

Lines changed: 12 additions & 0 deletions
@@ -140,6 +140,18 @@ bash reward_generation/mt_score_generate.sh \
 --loop 1
 ```

+Generate reasoning data
+
+```bash
+# example of math
+python rationale_generation/process.py \
+--model_path "Qwen/QwQ-32B" \
+--data_path _output/monte_carlo_processed/math_train_Qwen2.5-Math-7B-Instruct_monte_carlo \
+--save_path _output/reasoning_output/math_train_QwQ_reasoning \
+--num_gpu_per 1 \
+--majority_of_N 1
+```
+
 ### Critique-refinement

 Execute policy refinement based on GenPRM's split output
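
The reasoning-data command added in this commit could also be assembled programmatically, for example when launching it for several datasets. The following is a minimal sketch, not part of the repository: the helper name `build_reasoning_command` and its keyword arguments are hypothetical, and it uses only the script path, flags, and values shown in the diff above. It makes no assumption about what `--num_gpu_per` or `--majority_of_N` do internally.

```python
import shlex


def build_reasoning_command(data_path, save_path,
                            model_path="Qwen/QwQ-32B",
                            num_gpu_per=1, majority_of_n=1):
    """Return the argv list for rationale_generation/process.py.

    Hypothetical helper: only the flags that appear in the README
    diff are emitted; the defaults mirror the README's math example.
    """
    return [
        "python", "rationale_generation/process.py",
        "--model_path", model_path,
        "--data_path", data_path,
        "--save_path", save_path,
        "--num_gpu_per", str(num_gpu_per),
        "--majority_of_N", str(majority_of_n),
    ]


# Paths copied verbatim from the README's math example.
cmd = build_reasoning_command(
    data_path="_output/monte_carlo_processed/"
              "math_train_Qwen2.5-Math-7B-Instruct_monte_carlo",
    save_path="_output/reasoning_output/math_train_QwQ_reasoning",
)

# Print a shell-safe rendering of the command for inspection (dry run).
print(shlex.join(cmd))
```

To actually launch the job from the repository root, the list could be passed to `subprocess.run(cmd, check=True)`; building it as a list avoids shell-quoting problems in the paths.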
