v1.5.0-rc1
Pre-release
New Features and Enhancements
Core
- Added support for ucc_mem_map and ucc_mem_unmap {PR #1070}
- Enhanced error logs in context creation {PR #1135}
- Enhanced error log in collective init {PR #1104}
- Added ucc net devices config {PR #1141}
CL/HIER
TL/UCP
- Fixed allreduce knomial data consistency {PR #1145}
- Fixed ag oneshot {PR #1134}
- Added Allgather linear implementation {PR #1122}
- Fall back when memh is not passed {PR #1136}
TL/MLX5
- Added HCA-assisted copy & CUDA scratch design {PR #1154}
- Added logging for mcast FORCE/TRY modes {PR #1156}
- Fixed segfault in multicast team creation {PR #1150}
- Recover from IPoIB issue in mcast init {PR #1140}
- Added configuration to set IB QP SL {PR #1057}
- Added ctx global status check {PR #1113}
- Added CUDA support for zcopy mcast {PR #1118}
TL/CUDA
- Added NVLink SHARP (NVLS) Allreduce {PR #1148}
- Added Topology Cache {PR #1137}
- Added NVLink SHARP (NVLS) Reduce Scatter {PR #1144}
EC/ROCM
- Include stdbool.h for new versions of ROCm {PR #1146}
TOPO
- Node leaders ordered by team {PR #1129}
Build and Test
- Fixed Coverity issues {PR #1152}
- Updated CUDA arch {PR #1143}
- Changed to CUDA 12.9 {PR #1155}
- Added buffers for onesided tests {PR #1100}
- Added perftest generator {PR #1147}
- Added missing progress calls in UCC_PERFTEST {PR #1151}
- Updated versions in CI {PR #1115}
- Bumped version to v1.5 {PR #1121}
Documentation
- Updated component image 1.4.4 {PR #1153}