Commit aff7c10
Add fast CUDA MMQ GGUF kernels (huggingface#3465)
* Add fast CUDA MMQ GGUF kernels
* Adjust tolerance
1 parent b503458 commit aff7c10
21 files changed
Lines changed: 9197 additions & 5 deletions
File tree
- candle-core
  - src/quantized
  - tests
- candle-kernels
  - src
    - mmq_gguf