Commit e7dee22
[Quant] update fp8 quant kernel (vllm-project#147)
* update fp8 quant kernel
Signed-off-by: Zhu, Zufang <zufang.zhu@intel.com>
* ensure vectorization is applicable
Signed-off-by: Zhu, Zufang <zufang.zhu@intel.com>
* add unit test for static FP8 quant
Signed-off-by: Zhu, Zufang <zufang.zhu@intel.com>
* remove useless val
Signed-off-by: Zhu, Zufang <zufang.zhu@intel.com>
* remove wrong comments
Signed-off-by: Zhu, Zufang <zufang.zhu@intel.com>
* update ut
Signed-off-by: Zhu, Zufang <zufang.zhu@intel.com>
* thanks copilot
Signed-off-by: Zhu, Zufang <zufang.zhu@intel.com>
---------
Signed-off-by: Zhu, Zufang <zufang.zhu@intel.com>
1 parent 0ff40c6 commit e7dee22
9 files changed
Lines changed: 603 additions & 84 deletions
File tree
- csrc
- quantization/fp8
- tests
- ops