Open
Description
i.e. sm_90a
In general, PTX code generated for one target architecture can be run on future architectures (i.e., it is forward compatible). However, CUDA 12.0 introduced the concept of "architecture-accelerated features" whose PTX does not have forward compatibility guarantees. Several Hopper PTX instructions fall under this category of architecture-accelerated features, and thus require a sm_90a target architecture (note the "a" appended). For more details on this and other architecture-accelerated instructions, please refer to the CUDA Documentation.