[Diloco] add diloco related utils #2552
Conversation
🤖 Hi @khatwanimohit, I've received your request, and I'm working on it now! You can track my progress in the logs for more details.
📋 Review Summary
This pull request introduces Distributed Low-Communication (DiLoCo) training, a technique for efficient distributed training of large models. The implementation looks solid and is accompanied by a comprehensive unit test. The changes to configuration and utility functions are appropriate for integrating this new feature.
🔍 General Feedback
- The core logic in src/MaxText/diloco.py is well-structured and follows the principles outlined in the referenced papers.
- The addition of a detailed unit test in tests/diloco_test.py is excellent and greatly helps in verifying the correctness of the implementation.
- One potential issue was identified in the sharding logic, for which a suggestion has been provided.
Overall, this is a great contribution that adds a valuable feature to MaxText.
📋 Review Summary
This Pull Request introduces Distributed Low-Communication (DiLoCo) training utilities and integrates them into the MaxText configuration. The implementation appears sound, and the accompanying unit tests provide good coverage for the core functionality.
🔍 General Feedback
- The addition of drjax and other dependency updates are appropriate for the new feature.
- The configuration changes in base.yml and pyconfig.py correctly expose and handle the new DiLoCo parameters.
- The new diloco.py module is well-structured and implements the DiLoCo algorithm effectively.
- The diloco_test.py provides a thorough simulation of the DiLoCo training process, with clear explanations of expected values.
Description
All the credit for this PR goes to @jonb377 and @ZacharyGarrett!
This PR introduces Distributed Low-Communication (DiLoCo) training, a technique to reduce communication overhead in distributed model training. It achieves this by synchronizing model parameters periodically rather than at every step, which improves efficiency for large models.
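For readers new to the technique, here is a minimal sketch of the inner/outer update pattern DiLoCo describes, written against optax. The function names, hyperparameters, and single-replica structure are illustrative assumptions, not the API of the diloco.py module this PR adds.

```python
import jax
import optax

# Illustrative hyperparameters in the spirit of the DiLoCo paper,
# not this PR's defaults.
inner_opt = optax.adamw(learning_rate=1e-3)
outer_opt = optax.sgd(learning_rate=0.7, momentum=0.9, nesterov=True)

def inner_phase(params, inner_state, batches, loss_fn):
  """Low-communication phase: local AdamW steps with no cross-replica sync."""
  for batch in batches:
    grads = jax.grad(loss_fn)(params, batch)
    updates, inner_state = inner_opt.update(grads, inner_state, params)
    params = optax.apply_updates(params, updates)
  return params, inner_state

def outer_step(global_params, local_params, outer_state):
  """Synchronization: the parameter delta acts as an 'outer gradient'."""
  # With multiple replicas, local_params would first be averaged across
  # replicas (e.g. via jax.lax.pmean); a single replica is assumed here.
  outer_grad = jax.tree_util.tree_map(
      lambda g, l: g - l, global_params, local_params)
  updates, outer_state = outer_opt.update(outer_grad, outer_state, global_params)
  return optax.apply_updates(global_params, updates), outer_state
```

Because replicas exchange a single parameter delta per synchronization round instead of gradients at every step, communication volume drops roughly by a factor of the synchronization period.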
Notice 1: Once all tests pass, the "pull ready" label will automatically be assigned.
This label is used for administrative purposes. Please do not add it manually.
Notice 2: For external contributions, our settings currently require an approval from a MaxText maintainer to trigger CI tests.
Tests
A unit test with a simple model; more tests with the trainer will come in an upcoming PR.
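As a sketch of the kind of property such a test can verify (hypothetical, not the actual tests/diloco_test.py): with a single replica, no outer momentum, and an outer learning rate of 1.0, one DiLoCo round should land exactly on the locally trained parameters.

```python
import jax.numpy as jnp
import optax

def test_outer_step_with_lr_one_recovers_local_params():
  # Hypothetical test; names and values are illustrative.
  global_params = {"w": jnp.array(1.0)}
  local_params = {"w": jnp.array(0.4)}  # stand-in for the inner-phase result
  outer_opt = optax.sgd(learning_rate=1.0)  # no momentum: a pure replacement step
  outer_state = outer_opt.init(global_params)
  # The outer gradient is (old global params) - (new local params).
  outer_grad = {"w": global_params["w"] - local_params["w"]}
  updates, _ = outer_opt.update(outer_grad, outer_state, global_params)
  new_params = optax.apply_updates(global_params, updates)
  assert jnp.allclose(new_params["w"], local_params["w"])
```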
Checklist
Before submitting this PR, please make sure (put X in square brackets):
- [ ] gemini-review label.