@@ -155,7 +155,8 @@ See [baselines.md](assets/baselines.md).
 - **RL-with-Cold-Start**: Advancing Multimodal Reasoning via Reinforcement Learning with Cold Start. [![[code]](https://img.shields.io/github/stars/waltonfuture/RL-with-Cold-Start)](https://github.com/waltonfuture/RL-with-Cold-Start) [![[arxiv]](https://img.shields.io/badge/arxiv-2505.22334-blue)](https://arxiv.org/pdf/2505.22334)
 - **ViGoRL**: Grounded Reinforcement Learning for Visual Reasoning. [![[code]](https://img.shields.io/github/stars/Gabesarch/grounded-rl)](https://github.com/Gabesarch/grounded-rl) [![[arxiv]](https://img.shields.io/badge/arxiv-2505.23678-blue)](https://arxiv.org/abs/2505.23678)
 - **Revisual-R1**: Advancing Multimodal Reasoning: From Optimized Cold Start to Staged Reinforcement Learning. [![[code]](https://img.shields.io/github/stars/CSfufu/Revisual-R1)](https://github.com/CSfufu/Revisual-R1) [![[arxiv]](https://img.shields.io/badge/arxiv-2506.04207-blue)](https://arxiv.org/abs/2506.04207)
-
+- **SophiaVL-R1**: Reinforcing MLLMs Reasoning with Thinking Reward. [![[code]](https://img.shields.io/github/stars/kxfan2002/SophiaVL-R1)](https://github.com/kxfan2002/SophiaVL-R1) [![[arxiv]](https://img.shields.io/badge/arxiv-2505.17018-blue)](https://arxiv.org/abs/2505.17018)
+
 ## TODO

 - Support LoRA (high priority).