Is it possible to support NVFP4 on Blackwell? #1543
edisonchan
started this conversation in
General
Replies: 1 comment
https://docs.nvidia.com/deeplearning/cudnn/frontend/latest/operations/BlockScaling.html
https://docs.nvidia.com/cuda/cuda-math-api/cuda_math_api/group__CUDA__MATH__FP4__MISC.html
"The NVFP4 recipe quantizes across 16 FP32 elements along the rows to produce 16 FP4 output values (E2M1) and 1 FP8 scaling factor (E4M3)."
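To make the quoted recipe concrete, here is a rough pure-Python sketch of quantizing one block of 16 values to E2M1 with a single E4M3 scale. It is an illustration only, not the cuDNN or CUDA math API: the function names (`round_to_e4m3`, `quantize_block`, `dequantize_block`) are hypothetical, the scale choice (block absolute maximum divided by 6, the largest E2M1 magnitude) is an assumption, ties are resolved naively toward the smaller magnitude, and any extra per-tensor scaling is not modeled.

```python
import math

# Magnitudes representable in FP4 E2M1 (a sign bit is stored separately).
E2M1_VALUES = (0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0)
BLOCK = 16  # elements that share one scaling factor


def round_to_e4m3(x):
    """Round a non-negative float to the nearest E4M3 value (max 448)."""
    if x <= 0.0:
        return 0.0
    x = min(x, 448.0)
    # Exponent of x, clamped to the smallest normal exponent (-6) so that
    # smaller inputs fall onto the subnormal grid.
    exp = max(math.frexp(x)[1] - 1, -6)
    step = 2.0 ** (exp - 3)  # 3 mantissa bits -> 8 steps per binade
    return min(round(x / step) * step, 448.0)


def quantize_block(block):
    """Return (values, scale): 16 signed E2M1 values and one E4M3 scale."""
    assert len(block) == BLOCK
    amax = max(abs(v) for v in block)
    # Assumed scale choice: map the block maximum onto 6.0.
    scale = round_to_e4m3(amax / 6.0)
    if scale == 0.0:
        return [0.0] * BLOCK, 0.0
    values = []
    for v in block:
        target = abs(v) / scale
        # Nearest representable magnitude; ties go to the smaller one here.
        mag = min(E2M1_VALUES, key=lambda q: abs(target - q))
        values.append(math.copysign(mag, v))
    return values, scale


def dequantize_block(values, scale):
    """Reconstruct approximate FP32 values from the E2M1 values and scale."""
    return [v * scale for v in values]
```

Calling `quantize_block` on a 16-element list and passing the result to `dequantize_block` shows how much precision is lost per block; real kernels would additionally pack two 4-bit codes per byte.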