Open
Description
Recently, we have been hitting this issue frequently when following this notebook:
https://github.com/huggingface/huggingface-llama-recipes/blob/main/local_inference/fp8-405B.ipynb
Illegal memory access errors are easy to trigger with a shallower model (fewer layers), such as an 8B model, quantized with FBGEMM.