Commit dbd13c8
GGUF Q6K WA for GPU (openvinotoolkit#2135)
Based on openvinotoolkit#2120
Main changes:
1. Q6_K is not supported on some GPU platforms because its weight
compression uses group size 16, so F16 is used as a workaround
2. Reformat gguf.cpp
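
A minimal sketch of the workaround in item 1, not the code from this commit. The names WeightFormat, select_q6k_format and dequantize_group16 are hypothetical, and the device-name check is an assumption about how the GPU path might be detected. The dequantization follows the usual Q6_K convention (6-bit values offset by 32, one int8 scale per group of 16, one super-block scale d) and works on already-unpacked values; the float results would then be stored as F16.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical names for illustration; not the identifiers used in gguf.cpp.
enum class WeightFormat { Q6K_Compressed, F16 };

// Choose how Q6_K weights are materialised for a target device.
// Assumption: device names starting with "GPU" need the F16 fallback.
inline WeightFormat select_q6k_format(const std::string& device) {
    const bool is_gpu = device.rfind("GPU", 0) == 0;  // prefix match
    return is_gpu ? WeightFormat::F16 : WeightFormat::Q6K_Compressed;
}

// Dequantise already-unpacked 6-bit values (0..63) using one int8 scale
// per group of 16 weights and a single super-block scale d.
inline std::vector<float> dequantize_group16(const std::vector<std::uint8_t>& q,
                                             const std::vector<std::int8_t>& scales,
                                             float d) {
    constexpr std::size_t kGroupSize = 16;
    std::vector<float> out(q.size());
    for (std::size_t i = 0; i < q.size(); ++i) {
        const int centered = static_cast<int>(q[i]) - 32;  // Q6_K offset
        out[i] = d * static_cast<float>(scales[i / kGroupSize]) *
                 static_cast<float>(centered);
    }
    return out;
}
```

With this shape, the loader would call select_q6k_format once per model and only run dequantize_group16 when the result is WeightFormat::F16, which trades memory for compatibility on the affected GPUs.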
Tickets:
CVS-166026
CVS-166027
CVS-166739
1 parent 447eb09, commit dbd13c8
2 files changed, +251 -228 lines changed