Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Uniform uniform distribute_buffer_sizes for calculating buff size #94

Closed
wants to merge 3 commits into from

Conversation

namgyu-youn
Copy link
Contributor

@namgyu-youn namgyu-youn commented Mar 24, 2025

In shampoo_ddp_distributor.py & shampoo_hsdp_distributor.py & shampoo_hybrid_shard_distributor.py, codes for same logics are implemented.

So i uniformed them into shampoo_utils.py, and cleaned some duplicate code.

- Same logics are implemented in following files. So it is uniformed into shampoo_utils
- Files : shampoo_ddp_distributor.py, shampoo_hsdp_distributor.py, shampoo_hybrid_shard_distributor.py
@facebook-github-bot facebook-github-bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Mar 24, 2025
@@ -151,3 +153,69 @@ def __exit__(
exc_tb: TracebackType | None,
) -> None:
self._exit_method()


def distribute_buffer_sizes(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We require 100% test coverage, and this function is not tested if we moved to here. https://github.com/facebookresearch/optimizers/blob/main/distributed_shampoo/utils/tests/shampoo_utils_test.py is the file you could add your tests.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Then may I add some test module in shampoo_utils_test.py for testing distribute_buffer_sizes?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Then may I add some test module in shampoo_utils_test.py for testing distribute_buffer_sizes?

Yes, because distribute_buffer_sizes() is a public function now used for others; as a result, we want to test it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your review. I fixed buffer_size_rank, and builded testing module in shampoo_utuls.py. Could you review it??

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Codes look good to me, and could you fix the type errors listed in https://github.com/facebookresearch/optimizers/actions/runs/14051552340/job/39345812119?pr=94? You might want to rebase your repo first.

Copy link
Contributor Author

@namgyu-youn namgyu-youn Mar 25, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

type-check.yml can't independently work because PATH is not configured.

@namgyu-youn, could you elaborate on this?

I tested GitHub Actions using act. The commander is: act -j mypy -w .github/workflows/type-check.yml

But it failed because act failed to find the local path of ⁣uv. So I added some PATH for forced configuration in type-check.yml, and it worked successfully.

But I am not certain about this change because:

  • Contributors could configure the local path themselves.
  • Specified PATH only works in ubuntu (not macOS or Windows)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see, thanks. This appears to be a compatibility issue with act since the workflow does work remotely. I don't know act enough to know how to best resolve this, but I think it shouldn't be necessary to explicitly specify PATH.

The recommended workflow is to use the make commands locally before committing, i.e. in this case make type-check (this assumes you installed via a variation of pip install -e ".[dev]", see here).

Copy link
Contributor Author

@namgyu-youn namgyu-youn Mar 25, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see, thanks. This appears to be a compatibility issue with act since the workflow does work remotely. I don't know act enough to know how to best resolve this, but I think it shouldn't be necessary to explicitly specify PATH.

The recommended workflow is to use the make commands locally before committing, i.e. in this case make type-check (this assumes you installed via a variation of pip install -e ".[dev]", see here).

In my thought, two tools are useful in the following case:

  • make : It is specialized for maintainers (auto-build). (e.g., docker image release)
  • act : It is specialized for contributors (local test). (e.g., pre-commit, pytes )

So I want to carefully suggest this update would be helpful. But as you mentioned, CONTRIBUTING.md explicitly requires make, so I would follow your decision.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you want you can create a separate PR to ensure compatibility with act by refactoring the workflows since I won't prioritise this anytime soon.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. I will open PR after this. @tsunghsienlee could you approve and merge #96 ? I opened rebased PR for this.

- Update buffer_size_ranks for distribute_buffer_size in utils
- Build test module for distribute_buffer_size
@namgyu-youn namgyu-youn closed this by deleting the head repository Mar 25, 2025
@namgyu-youn
Copy link
Contributor Author

@tsunghsienlee I opened a rebased PR for this. Please check #96

facebook-github-bot pushed a commit that referenced this pull request Mar 25, 2025
Summary:
This is the rebased PR for #94 (already approved)
- close : #93

Pull Request resolved: #96

Reviewed By: anana10c

Differential Revision: D71830614

Pulled By: tsunghsienlee

fbshipit-source-id: 4637536d526a69e5e4631f6493658cc979be3ae4
namgyu-youn added a commit to namgyu-youn/optimizers that referenced this pull request Mar 25, 2025
- When testing type-check.yaml alone using act, it failed because act failed to find local path of uv.
- Fixed PATH helps uv finding local path.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Uniform common utility function : `_distribute_buffer_sizes
4 participants