* v0.7.2
[Enhancement] Fixed issues in the code that caused warnings with the MSVC and clang compilers.
[Enhancement] Fixed errors in `get_heuristics_list` where, for certain heuristics modes on older cuDNN versions, the returned heuristics list could be incorrect.
[Bug fix] Fixed several test cases that failed on unsupported GPUs so that they now exit gracefully.
[Samples] Added a sample showcasing forward fp8 convolution on NVIDIA Hopper GPUs, including post-convolution bookkeeping operations such as scaling and absolute-maximum reduction (a host-side sketch of these bookkeeping steps follows this release's notes).
[Samples] Added a sample that converts an fp16 tensor to fp8 and performs a transpose and absolute-maximum reduction.
[Samples] Added a sample demonstrating the max-pooling operation, including the tensor index dump needed to speed up the backward pass.
[Samples] Added a sample to showcase the backward pooling operation.
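As a rough illustration of the bookkeeping the fp8 sample refers to, the sketch below computes an absolute-maximum reduction over a tensor and derives a scale factor from it. It is plain C++, not the repository sample, and does not use the cudnn-frontend API; the 448.0f limit assumes the e4m3 fp8 format.

```cpp
// Conceptual sketch of post-convolution bookkeeping: absolute-maximum
// reduction followed by scaling into the fp8 (e4m3) range. Not the
// cudnn-frontend sample; the values and the 448.0f limit are illustrative.
#include <algorithm>
#include <cmath>
#include <cstdio>
#include <vector>

int main() {
    std::vector<float> conv_output = {0.5f, -3.2f, 910.0f, -77.4f};

    // Absolute-maximum reduction over the convolution output.
    float amax = 0.0f;
    for (float v : conv_output) amax = std::max(amax, std::fabs(v));

    // Scale so the largest magnitude maps to the assumed fp8 (e4m3) maximum of 448.
    float scale = (amax > 0.0f) ? 448.0f / amax : 1.0f;
    for (float &v : conv_output) v *= scale;

    std::printf("amax = %f, scale = %f\n", amax, scale);
    return 0;
}
```

In the actual sample these steps are presumably expressed as operations in the cudnn operation graph, fused after the convolution, rather than as host-side loops.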
* v0.7.3 release
[Enhancement] Added the `CUDNN_FRONTEND_VERSION` macro to track the cuDNN frontend version (see the sketch after this release's notes).
[Enhancement] Added the `inline` keyword to the `get_plan` functions to enable inclusion in multiple compilation units.
[Bug fix] Replaced `CUDNN` with `CUDNN_VERSION` as the correct macro name.
Co-authored-by: Anerudhan Gopal <[email protected]>
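As an illustration (not part of the release itself), here is a minimal sketch of how the new `CUDNN_FRONTEND_VERSION` macro and the cuDNN `CUDNN_VERSION` macro might be inspected, with `CUDNN_VERSION` also used as a compile-time guard. The `8600` threshold is an assumed example value, and no particular numeric encoding of `CUDNN_FRONTEND_VERSION` is relied on.

```cpp
// Minimal sketch: print both version macros and use CUDNN_VERSION as a
// compile-time guard. CUDNN_FRONTEND_VERSION is provided by cudnn-frontend
// v0.7.3 and later; its exact numeric encoding is not assumed here.
#include <iostream>

#include <cudnn.h>           // defines CUDNN_VERSION
#include <cudnn_frontend.h>  // defines CUDNN_FRONTEND_VERSION (v0.7.3+)

int main() {
    std::cout << "cudnn frontend version macro: " << CUDNN_FRONTEND_VERSION << "\n";
    std::cout << "cudnn library version macro:  " << CUDNN_VERSION << "\n";

#if CUDNN_VERSION >= 8600  // 8600 (cuDNN 8.6.0) is an assumed example threshold
    std::cout << "cuDNN 8.6 or newer detected.\n";
#else
    std::cout << "Older cuDNN detected; some newer features may be unavailable.\n";
#endif
    return 0;
}
```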