MAGMA: update to v2.9.0 #11237
Small note on the merge conflict: since it's just whitespace, I don't want to trigger another 12h CI run for that alone, so I'll just wait for reviewers.
Why is ccache not working?
# This flag reduces the size of the compiled binaries; if
# they become over 2GB (e.g. due to targeting too many
# compute_XX), linking fails.
# See: https://github.com/NixOS/nixpkgs/pull/220402
export NVCC_PREPEND_FLAGS+=' -Xfatbin=-compress-all'
Oooh, that sounds very interesting!
Yeah, the builds in 5614799 were failing after I expanded the gencode targets in 72a7f0c to better support the compute capabilities in several popular GPU cards (see e.g. here).
Then I found out that NixOS uses that flag for all CUDA-related builds, especially for MAGMA itself, and indeed it works pretty well 😃
Even though the number of compiled targets went up, which would normally have increased the library size, the decompressed artifact sizes were actually lower than before.
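For anyone following along, here is a minimal sketch of how an expanded gencode list and the compression flag fit together. The `archs` list is a made-up example, not the set actually used in 72a7f0c, and the variable names `archs` and `gencode_flags` are only illustrative. It assumes nothing beyond nvcc's `-gencode` option and the `NVCC_PREPEND_FLAGS` environment variable.

```shell
# Hypothetical compute capabilities; the real list in the PR may differ.
archs="70 80 90"

# Build one -gencode entry per target. Each extra entry adds another
# device-code image to the fat binary, which is what drives the size up.
gencode_flags=""
for arch in $archs; do
    gencode_flags="$gencode_flags -gencode arch=compute_${arch},code=sm_${arch}"
done
gencode_flags="${gencode_flags# }"   # drop the leading space

# nvcc reads NVCC_PREPEND_FLAGS from the environment and places its
# contents ahead of the command-line arguments, so the compression flag
# applies to every nvcc call in the build.
export NVCC_PREPEND_FLAGS="${NVCC_PREPEND_FLAGS:-} -Xfatbin=-compress-all"

echo "$gencode_flags"
echo "$NVCC_PREPEND_FLAGS"
```

Setting the flag through the environment rather than per target means the build system's own nvcc command lines do not need to be patched.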
Looks good to me, and it's great to see more and more packages being able to do cross-compilation of CUDA code (even if it's not as straightforward as I wished)!
You mean resolving the merge conflict in
No, I was referring to the fact that the rebuilds take over one hour even though we use ccache to speed them up. For reference, a rebuild of llvm with a warm cache should take ~5-10 minutes here (that is strictly the build time; the auditor then takes a lot longer), and that's a pretty large project. I'm surprised something else takes over one hour to rebuild.
Oh, I see; I don't have a clue either. In any case, thanks for merging! 🙏
I've done a little restructuring in addition to the version update. The changes enable aarch64 builds and optimal compatibility across a wide range of datacenter and consumer GPUs.