Nova builds CUDA wheels on GPU runner #5635

Open
huydhn opened this issue Sep 10, 2024 · 0 comments


huydhn commented Sep 10, 2024

A timeout while building the FBGEMM CUDA wheel (https://github.com/pytorch/FBGEMM/actions/runs/10772363019/job/29869844126) revealed that Nova builds CUDA wheels on GPU runners (https://github.com/pytorch/test-infra/blob/main/tools/scripts/generate_binary_build_matrix.py#L120). This isn't an efficient use of those runners, but it was set up this way because domain builds were quick and didn't need a separate test job. That assumption doesn't hold for FBGEMM.

Solution: Check whether it's worth the effort to split the Nova build job into separate build and validation parts. Only validation requires a GPU; the build itself can run on a bigger CPU runner.
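
To make the proposed split concrete, here is a minimal sketch of how a matrix generator could assign a separate runner to each half of the job. It is not the actual Nova or `generate_binary_build_matrix.py` code: the runner labels, the `JobPlan` type, and the `plan_runners` function are all hypothetical names made up for illustration.

```python
from dataclasses import dataclass

# Hypothetical runner labels, used only for illustration.
# The real labels would come from the test-infra runner configuration.
CPU_BUILD_RUNNER = "example-large-cpu-runner"
GPU_VALIDATION_RUNNER = "example-gpu-runner"


@dataclass(frozen=True)
class JobPlan:
    """Runners chosen for the two halves of a wheel job."""

    build_runner: str
    validation_runner: str


def plan_runners(gpu_arch_type: str) -> JobPlan:
    """Always build on a CPU runner; use a GPU runner only to validate CUDA wheels."""
    needs_gpu = gpu_arch_type == "cuda"
    return JobPlan(
        build_runner=CPU_BUILD_RUNNER,
        validation_runner=GPU_VALIDATION_RUNNER if needs_gpu else CPU_BUILD_RUNNER,
    )


if __name__ == "__main__":
    for arch in ("cpu", "cuda"):
        print(arch, plan_runners(arch))
```

With this shape, a long CUDA compile such as FBGEMM's would occupy only the CPU runner, and the GPU runner would be held just for the validation step.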

(We will also need this if we want to use Nova in PyTorch in the future. I'm creating this issue to track it.)

cc @atalman @malfet @spcyppt

Labels
None yet
Projects
Status: Cold Storage