This repository was archived by the owner on Mar 21, 2024. It is now read-only.

Commit 792ac3d

Merge pull request #358 from allisonvacanti/changelog_1_14_0
Add 1.14.0 changelog.
2 parents 772eae8 + 4d83d4a commit 792ac3d

2 files changed: 56 additions, 0 deletions

CHANGELOG.md

Lines changed: 54 additions & 0 deletions
@@ -1,3 +1,57 @@
# CUB 1.14.0 (NVIDIA HPC SDK 21.9)

## Summary

CUB 1.14.0 is a major release accompanying the NVIDIA HPC SDK 21.9.

This release provides the often-requested merge sort algorithm, ported from the
`thrust::sort` implementation. Merge sort provides more flexibility than the
existing radix sort by supporting arbitrary data types and comparators, though
radix sorting is still faster for supported inputs. This functionality is
provided through the new `cub::DeviceMergeSort` and `cub::BlockMergeSort`
algorithms.
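
To make the comparator flexibility concrete, the sketch below sorts integers in descending order with `cub::DeviceMergeSort::SortKeys`. It is an illustration rather than code from this release: the `Descending` functor and the input values are hypothetical, CUDA error checking is omitted for brevity, and it assumes CUB 1.14.0 or newer is on the include path and the file is compiled with `nvcc`. As with other CUB device algorithms, the first call only queries the temporary storage size and the second call does the work.

```cuda
#include <cub/cub.cuh>
#include <cuda_runtime.h>

#include <cstddef>
#include <cstdio>
#include <vector>

// Hypothetical comparator: orders keys from largest to smallest.
struct Descending
{
  template <typename T>
  __device__ bool operator()(const T &a, const T &b) const
  {
    return a > b;
  }
};

int main()
{
  std::vector<int> h_keys = {3, 1, 4, 1, 5, 9, 2, 6};
  const int num_items     = static_cast<int>(h_keys.size());
  const std::size_t bytes = num_items * sizeof(int);

  int *d_keys = nullptr;
  cudaMalloc(&d_keys, bytes);
  cudaMemcpy(d_keys, h_keys.data(), bytes, cudaMemcpyHostToDevice);

  // Pass 1: a null temp-storage pointer only reports the required size.
  void *d_temp_storage           = nullptr;
  std::size_t temp_storage_bytes = 0;
  cub::DeviceMergeSort::SortKeys(d_temp_storage, temp_storage_bytes,
                                 d_keys, num_items, Descending{});

  // Pass 2: allocate the scratch space and sort the keys in place.
  cudaMalloc(&d_temp_storage, temp_storage_bytes);
  cub::DeviceMergeSort::SortKeys(d_temp_storage, temp_storage_bytes,
                                 d_keys, num_items, Descending{});

  cudaMemcpy(h_keys.data(), d_keys, bytes, cudaMemcpyDeviceToHost);
  for (int key : h_keys)
  {
    std::printf("%d ", key);
  }
  std::printf("\n");

  cudaFree(d_temp_storage);
  cudaFree(d_keys);
  return 0;
}
```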

The namespace wrapping mechanism has been overhauled for 1.14. The existing
macros (`CUB_NS_PREFIX`/`CUB_NS_POSTFIX`) can now be replaced by a single macro,
`CUB_WRAPPED_NAMESPACE`, which is set to the name of the desired wrapped
namespace. Defining a similar `THRUST_CUB_WRAPPED_NAMESPACE` macro will embed
both `thrust::` and `cub::` symbols in the same external namespace. The
prefix/postfix macros are still supported, but now require a new
`CUB_NS_QUALIFIER` macro to be defined, which provides the fully qualified CUB
namespace (e.g. `::foo::cub`). See `cub/util_namespace.cuh` for details.
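
The fragment below sketches the three configurations described in this paragraph. The namespace name `my_project` is a placeholder chosen for illustration, and the definitions given for the legacy prefix/postfix macros are an assumption modeled on the `::foo::cub` example; `cub/util_namespace.cuh` remains the authoritative description. Whichever option is used, the macros presumably need to be defined before the first CUB (or Thrust) header is included, and consistently across every translation unit.

```cuda
// Option 1 (new in 1.14): wrap only CUB.
// "my_project" is a hypothetical namespace name.
#define CUB_WRAPPED_NAMESPACE my_project

// Option 2: wrap both Thrust and CUB in the same outer namespace.
// Define this instead of option 1 when Thrust is also in use.
// #define THRUST_CUB_WRAPPED_NAMESPACE my_project

// Option 3 (legacy macros): still supported, but CUB_NS_QUALIFIER
// must now be defined alongside them.
// #define CUB_NS_PREFIX    namespace my_project {
// #define CUB_NS_POSTFIX   }
// #define CUB_NS_QUALIFIER ::my_project::cub

#include <cub/cub.cuh>

// With option 1 active, CUB symbols are declared in ::my_project::cub
// rather than ::cub. A namespace alias keeps call sites short.
namespace wrapped_cub = ::my_project::cub;

using MergeSort = wrapped_cub::DeviceMergeSort;
```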

## Breaking Changes

- NVIDIA/cub#350: When the `CUB_NS_[PRE|POST]FIX` macros are set,
  `CUB_NS_QUALIFIER` must also be defined to the fully qualified CUB namespace
  (e.g. `#define CUB_NS_QUALIFIER ::foo::cub`). Note that this is handled
  automatically when using the new `[THRUST_]CUB_WRAPPED_NAMESPACE` mechanism.

## New Features

- NVIDIA/cub#322: Ported the merge sort algorithm from Thrust:
  `cub::BlockMergeSort` and `cub::DeviceMergeSort` are now available.
- NVIDIA/cub#326: Simplify the namespace wrapper macros, and detect when
  Thrust's symbols are in a wrapped namespace.

## Bug Fixes

- NVIDIA/cub#160, NVIDIA/cub#163, NVIDIA/cub#352: Fixed several bugs in
  `cub::DeviceSpmv` and added basic tests for this algorithm. Thanks to James
  Wyles and Seunghwa Kang for their contributions.
- NVIDIA/cub#328: Fixed error handling bug and incorrect debugging output in
  `cub::CachingDeviceAllocator`. Thanks to Felix Kallenborn for this
  contribution.
- NVIDIA/cub#335: Fixed a compile error affecting clang and NVRTC. Thanks to
  Jiading Guo for this contribution.
- NVIDIA/cub#351: Fixed some errors in the `cub::DeviceHistogram` documentation.

## Enhancements

- NVIDIA/cub#348: Add an example that demonstrates how to use dynamic shared
  memory with a CUB block algorithm. Thanks to Matthias Jouanneaux for this
  contribution.
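
For readers unfamiliar with the pattern named in the enhancement above, here is a minimal sketch of a CUB block algorithm backed by dynamic shared memory. It is not the example added in NVIDIA/cub#348: the kernel name, the choice of `cub::BlockReduce`, and the all-ones input are assumptions made for illustration, and error checking is omitted. The idea is to size the launch's dynamic shared memory to the algorithm's `TempStorage` type and reinterpret the raw buffer inside the kernel.

```cuda
#include <cub/cub.cuh>
#include <cuda_runtime.h>

#include <cstddef>
#include <cstdio>

// Hypothetical kernel: each block sums BLOCK_THREADS integers.
template <int BLOCK_THREADS>
__global__ void BlockSumKernel(const int *d_in, int *d_out)
{
  using BlockReduceT = cub::BlockReduce<int, BLOCK_THREADS>;
  using TempStorageT = typename BlockReduceT::TempStorage;

  // Dynamic shared memory; its size is supplied at launch time.
  extern __shared__ char smem[];
  TempStorageT &temp_storage = *reinterpret_cast<TempStorageT *>(smem);

  const int thread_value = d_in[blockIdx.x * BLOCK_THREADS + threadIdx.x];
  const int block_sum    = BlockReduceT(temp_storage).Sum(thread_value);

  // The aggregate is only valid in thread 0 of the block.
  if (threadIdx.x == 0)
  {
    d_out[blockIdx.x] = block_sum;
  }
}

int main()
{
  constexpr int block_threads = 128;
  using BlockReduceT          = cub::BlockReduce<int, block_threads>;

  int h_in[block_threads];
  for (int i = 0; i < block_threads; ++i)
  {
    h_in[i] = 1;
  }

  int *d_in  = nullptr;
  int *d_out = nullptr;
  cudaMalloc(&d_in, sizeof(h_in));
  cudaMalloc(&d_out, sizeof(int));
  cudaMemcpy(d_in, h_in, sizeof(h_in), cudaMemcpyHostToDevice);

  // Request exactly as much dynamic shared memory as the algorithm needs.
  const std::size_t smem_bytes = sizeof(BlockReduceT::TempStorage);
  BlockSumKernel<block_threads><<<1, block_threads, smem_bytes>>>(d_in, d_out);

  int h_out = 0;
  cudaMemcpy(&h_out, d_out, sizeof(int), cudaMemcpyDeviceToHost);
  std::printf("block sum = %d\n", h_out);

  cudaFree(d_in);
  cudaFree(d_out);
  return 0;
}
```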

# CUB 1.13.1 (CUDA Toolkit 11.5)

CUB 1.13.1 is a minor release accompanying the CUDA Toolkit 11.5.

README.md

Lines changed: 2 additions & 0 deletions
@@ -100,6 +100,8 @@ See the [changelog](CHANGELOG.md) for details about specific releases.

| CUB Release               | Included In                             |
| ------------------------- | --------------------------------------- |
| 1.14.0                    | NVIDIA HPC SDK 21.9                     |
| 1.13.1                    | CUDA Toolkit 11.5                       |
| 1.13.0                    | NVIDIA HPC SDK 21.7                     |
| 1.12.1                    | CUDA Toolkit 11.4                       |
| 1.12.0                    | NVIDIA HPC SDK 21.3                     |
