GSPO vs GRPO/DAPO #3335

rhinosaur0 · 2025-09-04T02:40:32Z

GSPO theoretically makes much more sense than GRPO/DAPO: it weights every token in a sequence equally, rather than weighting tokens differently according to their individual log-probs. If so, how was the "noisy" GRPO still able to achieve good results, and why didn't GSPO achieve much better results?
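To make the distinction concrete, here is a minimal sketch of the two importance weights as I understand them. The function names and the log-prob values are made up for illustration, and clipping and advantages are left out. GRPO/DAPO-style objectives use one ratio per token, while GSPO uses a single length-normalized ratio for the whole sequence (the geometric mean of the token ratios).

```python
import math

def grpo_token_ratios(new_logps, old_logps):
    """Per-token importance ratios pi_new / pi_old (GRPO/DAPO style)."""
    return [math.exp(n - o) for n, o in zip(new_logps, old_logps)]

def gspo_sequence_ratio(new_logps, old_logps):
    """Length-normalized sequence-level ratio (GSPO style):
    exp of the mean per-token log-ratio, shared by every token."""
    diffs = [n - o for n, o in zip(new_logps, old_logps)]
    return math.exp(sum(diffs) / len(diffs))

# Hypothetical per-token log-probs for one 4-token response.
old_logps = [-1.0, -2.0, -0.5, -1.5]
new_logps = [-0.8, -2.6, -0.5, -1.3]

token_ratios = grpo_token_ratios(new_logps, old_logps)
seq_ratio = gspo_sequence_ratio(new_logps, old_logps)

print("token-level ratios:", [round(r, 4) for r in token_ratios])
print("sequence-level ratio:", round(seq_ratio, 4))
```

In this toy case the token-level ratios vary noticeably from token to token, while the sequence-level ratio stays close to 1, which is the "noise" I mean.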

I'm curious to hear some explanations!

Replies: 0 comments
