v1.1.0
The highlight of this release is that Aluminum now has a logo.
There were some other, slightly less interesting, changes, too. Most notably, Aluminum now fully supports multi-threaded communication. There are also significant improvements to support on HIP/ROCm platforms and extensive internal cleanups.
- Aluminum has a logo now!
- Support the `AL_MPI_SERIALIZED` compile-time flag, which runs blocking MPI calls on the progress engine for situations where all MPI calls need to come from the same thread.
- Support `AL_THREAD_MULTIPLE` for safe multi-threaded communication in Aluminum.
- Significant improvements in benchmarking/testing infrastructure.
- Removed support for custom MPI allreduce algorithms. Aluminum now uses the native MPI implementations.
- Added an `al_info` binary to provide basic info on Aluminum.
- Better progress engine binding on HIP/ROCm systems.
- The host-transfer backend uses stream memory operations on HIP/ROCm systems when available.
- Aluminum no longer relies on hipify for HIP/ROCm systems.
- Aluminum's CMake exports components to identify backend support at build time.
- Significant internal code reorganization and cleanup for the CUDA code and the progress engine.
- Various bugfixes and other minor improvements.
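
For readers unfamiliar with the idea behind the `AL_MPI_SERIALIZED` item above, here is a minimal, generic sketch of funneling blocking calls through a single worker thread so that every call originates from the same thread. This is not Aluminum's implementation and uses no Aluminum or MPI APIs. The names `SerializedExecutor`, `run_blocking`, and `worker_id` are hypothetical and exist only for illustration.

```cpp
#include <condition_variable>
#include <functional>
#include <future>
#include <memory>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>

// Hypothetical stand-in for a progress engine (not Aluminum code): a single
// worker thread runs every submitted task, so all "communication calls"
// are issued from that one thread regardless of which thread requested them.
class SerializedExecutor {
public:
  SerializedExecutor() : worker_([this] { run(); }) {}

  ~SerializedExecutor() {
    {
      std::lock_guard<std::mutex> lock(mutex_);
      done_ = true;
    }
    cv_.notify_one();
    worker_.join();
  }

  // Enqueue a task and block the calling thread until the worker has run it.
  template <typename F>
  auto run_blocking(F&& f) -> decltype(f()) {
    using R = decltype(f());
    // shared_ptr keeps the task alive until the worker has finished with it.
    auto task = std::make_shared<std::packaged_task<R()>>(std::forward<F>(f));
    std::future<R> result = task->get_future();
    {
      std::lock_guard<std::mutex> lock(mutex_);
      tasks_.emplace([task] { (*task)(); });
    }
    cv_.notify_one();
    return result.get();
  }

  std::thread::id worker_id() const { return worker_.get_id(); }

private:
  void run() {
    for (;;) {
      std::function<void()> task;
      {
        std::unique_lock<std::mutex> lock(mutex_);
        cv_.wait(lock, [this] { return done_ || !tasks_.empty(); });
        if (tasks_.empty()) return;  // shutting down and nothing left to run
        task = std::move(tasks_.front());
        tasks_.pop();
      }
      task();  // run outside the lock so other threads can keep enqueueing
    }
  }

  std::mutex mutex_;
  std::condition_variable cv_;
  std::queue<std::function<void()>> tasks_;
  bool done_ = false;
  std::thread worker_;  // declared last so the members above exist before it starts
};
```

Any thread can call `run_blocking`, but the callable itself always executes on the worker. The cost of this design is that callers serialize behind one thread, which is the trade-off to expect whenever all calls must come from the same thread.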