
Feature Request: vlut.cpp #1095

@HaisongDing


Prerequisites

  • I am running the latest code. Mention the version if possible as well.
  • I carefully followed the README.md.
  • I searched using keywords relevant to my issue to make sure that I am creating a new issue that is not already open (or closed).
  • I reviewed the Discussions, and have a new and useful enhancement to share.

Feature Description

This paper claims a 3x speedup over llama.cpp's TQ2_0 on CPU for BitNet models, using a LUT-based inference engine. Could you take a look and see whether it can be integrated into the ik_llama.cpp engine?

Paper: https://arxiv.org/pdf/2512.06443
Code: https://github.com/Cipherxzc/vlut.cpp

Motivation

The paper claims a 3x speedup over llama.cpp's TQ2_0 on CPU for BitNet models, using a LUT-based inference engine.

Possible Implementation

Code: https://github.com/Cipherxzc/vlut.cpp
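
For anyone unfamiliar with the general idea, here is a minimal sketch of a lookup-table dot product for ternary weights. It is only a generic illustration, not the vlut.cpp kernel or the paper's algorithm, which I have not reproduced here. The group size of 4 and all function names are my own assumptions.

```python
from itertools import product

GROUP = 4  # assumed group size: 3**4 = 81 table entries per activation group

# Every ternary weight pattern for one group, in a fixed order.
PATTERNS = list(product((-1, 0, 1), repeat=GROUP))
PATTERN_INDEX = {p: i for i, p in enumerate(PATTERNS)}


def encode_weights(row):
    """Replace each group of GROUP ternary weights with its pattern index."""
    assert len(row) % GROUP == 0
    return [PATTERN_INDEX[tuple(row[i:i + GROUP])]
            for i in range(0, len(row), GROUP)]


def build_tables(x):
    """For each activation group, precompute its dot product with every pattern."""
    assert len(x) % GROUP == 0
    tables = []
    for i in range(0, len(x), GROUP):
        g = x[i:i + GROUP]
        tables.append([sum(w * a for w, a in zip(p, g)) for p in PATTERNS])
    return tables


def lut_matvec(encoded_rows, tables):
    """Each output is a sum of table lookups; no multiplications at this stage."""
    return [sum(t[idx] for t, idx in zip(tables, row)) for row in encoded_rows]


def reference_matvec(rows, x):
    """Plain multiply-accumulate, used only to check the lookup version."""
    return [sum(w * a for w, a in zip(r, x)) for r in rows]


# Hypothetical toy data: two ternary weight rows and one activation vector.
W = [
    [1, 0, -1, 1, 0, 0, 1, -1],
    [-1, -1, 0, 1, 1, 0, 0, 1],
]
x = [3, -2, 5, 1, 4, 0, -1, 2]

encoded = [encode_weights(r) for r in W]  # done once, offline
tables = build_tables(x)                  # done once per activation vector
y = lut_matvec(encoded, tables)
assert y == reference_matvec(W, x)
```

The tables are built once per activation vector and reused for every weight row, so the precomputation only pays off when many rows share the same activations. Whether this translates into the speedup reported in the paper depends on the actual kernel design, which this sketch does not attempt to model.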

Labels: enhancement
