Collaborator

@mydatascience mydatascience commented Oct 21, 2025

Description

Refactors how GRPO is run with Tunix, MaxText, vLLM, and Pathways (TMVP). New files are added under src/MaxText/rl, and the config file rl.yml is added in src/MaxText/configs. The main entry point is src/MaxText/rl/train_rl.py.
Work was started by @mydatascience and completed by @A9isha.

Tests

I have tested the code with Llama3.1-8b and Llama3.1-70b on v7x-128 and v5p-64.

Checklist

Before submitting this PR, please make sure (put X in square brackets):

  • I have performed a self-review of my code. For an optional AI review, add the gemini-review label.
  • I have necessary comments in my code, particularly in hard-to-understand areas.
  • I have run end-to-end tests and provided workload links above if applicable.
  • I have made or will make corresponding changes to the doc if needed, including adding new documentation pages to the relevant Table of Contents (toctree directive) as explained in our documentation.

@github-actions

🤖 Hi @A9isha, I've received your request, and I'm working on it now! You can track my progress in the logs for more details.

Collaborator

@xuefgu xuefgu left a comment

Thanks for the PR @A9isha !

In addition to the comments - can you please clarify what tests you have performed, i.e. the hardware, model, and more importantly with what configs (since that's the main change here)?

@A9isha A9isha force-pushed the universal_grpo branch 2 times, most recently from d521428 to 13609f1 Compare November 5, 2025 23:25
Collaborator

@A9isha A9isha left a comment

Dummy approval since an owner approval is needed :D

@copybara-service copybara-service bot merged commit e3ddb1a into main Nov 6, 2025
45 checks passed
@copybara-service copybara-service bot deleted the universal_grpo branch November 6, 2025 01:03
hengtaoguo pushed a commit that referenced this pull request Nov 6, 2025