[llama32_1b] full-int4 ELF2 for decode (-48% latency, -68% weight mem… #6497
