docs/source_en/Instruction/GRPO/DeveloperGuide/reward_function.md
2 lines changed: 2 additions & 0 deletions
@@ -80,6 +80,8 @@ Swift supports using both synchronous and asynchronous reward functions simultan
 - Synchronous reward functions are executed sequentially
 - Asynchronous reward functions are executed in parallel using `asyncio.gather`
 
+The [plugin](https://github.com/modelscope/ms-swift/blob/main/examples/train/grpo/plugin/plugin.py) file provides an example of a generative reward model (async_genrm) that calls the `swift deploy` service.
+
 ## Built-in Reward Functions
 Swift includes five rule-based reward functions (code can be found in swift/plugin/orm.py).
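
For readers of this diff, the execution model described in the context lines (synchronous reward functions run one after another, asynchronous ones awaited together with `asyncio.gather`) can be illustrated with a minimal standalone sketch. This is not Swift's implementation and does not use the Swift plugin API: every function name below (`length_reward`, `judge_reward`, `exclaim_reward`, `compute_rewards`, `run_demo`) is hypothetical, and the `asyncio.sleep` calls only stand in for a remote call such as a request to a deployed reward model.

```python
import asyncio


def length_reward(completions):
    # Hypothetical synchronous reward: shorter completions score higher.
    return [1.0 / (1 + len(c)) for c in completions]


async def judge_reward(completions):
    # Hypothetical asynchronous reward; the sleep stands in for a network call.
    await asyncio.sleep(0.1)
    return [1.0 if "yes" in c else 0.0 for c in completions]


async def exclaim_reward(completions):
    # Second hypothetical asynchronous reward, also simulating I/O latency.
    await asyncio.sleep(0.1)
    return [float(c.count("!")) for c in completions]


async def compute_rewards(completions, sync_funcs, async_funcs):
    # Synchronous functions are called one after another.
    sync_results = [f(completions) for f in sync_funcs]
    # Asynchronous functions are awaited concurrently; gather returns
    # results in the same order as the awaitables it was given.
    async_results = await asyncio.gather(*(f(completions) for f in async_funcs))
    return sync_results + list(async_results)


def run_demo():
    completions = ["yes!", "no"]
    return asyncio.run(
        compute_rewards(completions, [length_reward], [judge_reward, exclaim_reward])
    )
```

Calling `run_demo()` returns one list of per-completion scores for each reward function, synchronous results first. Because the two simulated calls overlap while they wait, the asynchronous part takes roughly as long as the slowest call rather than the sum of both.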