Description
A new SOTA bitnet model, Bonsai 0.5B, has come out. It seems to outperform larger bitnet models such as Falcon 1B/3B and TriLM 700M, and it looks like they are going to release a whole line of bitnet models, which is really exciting.
Support is needed for these models. They adopt channel-wise scaling factors rather than the tensor-level ones used by existing bitnet models. Maybe a separate kernel could be built to apply the scales outside of the matmul kernels? That would probably yield similar inference speeds. Note that the Hugging Face implementation does include a custom Q-linear layer that applies the scales.
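To illustrate the idea, here is a minimal numpy sketch (not the Bonsai or HF implementation; names and shapes are illustrative assumptions) showing that per-output-channel scales can be applied as a broadcasted elementwise multiply after the ternary matmul, and that this is mathematically equivalent to folding the scales into the weights:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal((4, 64)).astype(np.float32)                 # activations
w_ternary = rng.integers(-1, 2, size=(64, 128)).astype(np.float32)  # {-1, 0, 1} weights

# Tensor-level scaling: one scalar for the whole weight tensor.
tensor_scale = np.float32(0.02)
y_tensor = (x @ w_ternary) * tensor_scale

# Channel-wise scaling: one scale per output channel,
# applied outside the matmul as a separate elementwise step.
channel_scales = rng.uniform(0.01, 0.03, size=(128,)).astype(np.float32)
y_channel = (x @ w_ternary) * channel_scales  # broadcasts over the batch dim

# Equivalent to pre-scaling the weight columns, so the matmul
# kernel itself can stay scale-free.
y_folded = x @ (w_ternary * channel_scales)
assert np.allclose(y_channel, y_folded, atol=1e-5)
```

Since the scale application is just a cheap elementwise pass over the output, keeping it in a separate kernel should add little overhead relative to the matmul itself.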
HF: https://huggingface.co/deepgrove/Bonsai
Seems super promising.
pinging @Eddie-Wang1120 + other kernel writers
Other posts and information:
https://www.reddit.com/r/LocalLLaMA/comments/1jgkqio/new_bitnet_model_from_deepgrove/
https://x.com/deepgrove_ai/status/1903103798735761518