Adding Mamba SSM Arm SVE Kernel #664

Open
hrushitfujitsu wants to merge 1 commit into huggingface:main from MonakaResearch:kernel/mamba

hrushitfujitsu commented Apr 20, 2026

Summary

With reference to huggingface/transformers#38185, and since, to our knowledge, kernels submitted to this repo are automatically built and uploaded to the Hub, we would like to contribute this kernel to kernels-community.
The kernel was successfully built and tested on G3E. This change does not affect accuracy.

Implementation

The new kernel vectorizes the selective scan computation using ARM SVE intrinsics. The implementation is intended to:

  • improve throughput on SVE-capable ARM CPUs
  • keep numerical behavior aligned with the existing implementation
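For readers unfamiliar with the computation, here is a minimal pure-Python sketch of the selective scan recurrence as it is commonly written for Mamba (h_t = exp(delta_t * A) * h_{t-1} + delta_t * B_t * x_t, y_t = C_t · h_t + D * x_t). It is not the SVE kernel from this PR. The function names, the single-channel layout, and the `lanes` parameter are assumptions made for illustration. The second function walks the state dimension in fixed-size chunks with a partial final chunk, which is roughly how a predicated vector loop would cover it, and the comparison at the end is the kind of check that "numerical behavior aligned" implies.

```python
import math


def selective_scan_ref(x, delta, A, B, C, D):
    """Scalar reference for one channel.

    x, delta: lists of length L; A: list of length N;
    B, C: L x N nested lists; D: float skip weight.
    """
    n = len(A)
    h = [0.0] * n  # hidden state, one entry per state dimension
    y = []
    for t in range(len(x)):
        acc = 0.0
        for i in range(n):
            # Discretized state update followed by the output projection.
            h[i] = math.exp(delta[t] * A[i]) * h[i] + delta[t] * B[t][i] * x[t]
            acc += C[t][i] * h[i]
        y.append(acc + D * x[t])
    return y


def selective_scan_chunked(x, delta, A, B, C, D, lanes=4):
    """Same recurrence, with the state loop processed in chunks of `lanes`.

    `lanes` is a hypothetical stand-in for the hardware vector length.
    `active` plays the role of a loop predicate for the final partial chunk.
    """
    n = len(A)
    h = [0.0] * n
    y = []
    for t in range(len(x)):
        partial = [0.0] * lanes  # per-lane accumulators, reduced once per step
        for base in range(0, n, lanes):
            active = min(lanes, n - base)
            for lane in range(active):
                i = base + lane
                h[i] = math.exp(delta[t] * A[i]) * h[i] + delta[t] * B[t][i] * x[t]
                partial[lane] += C[t][i] * h[i]
        y.append(sum(partial) + D * x[t])
    return y


# Deterministic toy inputs: L = 6 steps, N = 10 states (not a multiple of lanes).
L_STEPS, N_STATE = 6, 10
x = [0.1 * (t + 1) for t in range(L_STEPS)]
delta = [0.05 + 0.01 * t for t in range(L_STEPS)]
A = [-float(i + 1) for i in range(N_STATE)]  # negative, so the state decays
B = [[math.sin(t + i) for i in range(N_STATE)] for t in range(L_STEPS)]
C = [[math.cos(t - i) for i in range(N_STATE)] for t in range(L_STEPS)]
D = 0.5

y_ref = selective_scan_ref(x, delta, A, B, C, D)
y_vec = selective_scan_chunked(x, delta, A, B, C, D, lanes=4)
max_abs_diff = max(abs(a - b) for a, b in zip(y_ref, y_vec))
print("max abs difference:", max_abs_diff)
```

The chunked version sums the per-state products in a different order than the scalar loop, so the two results can differ in the last few bits. A tolerance-based comparison is therefore more appropriate than exact equality when validating a vectorized kernel against a reference.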

Performance Check

We also integrated this kernel into a local checkout of the transformers repo (https://github.com/huggingface/transformers). In the measurements below it is roughly 3.0x to 3.7x faster than the current implementation.
Task: 32 input tokens, 1 generated token

Batch size   SVE (s)   OSS (s)
32           16.18     53.42
64           32.24     96.36
128          62.04     219.32
256          119.86    407.42
512          235.53    843.59
1024         466.49    1711.95

The table above shows the overall generation time in seconds. This benchmark was also run on G3E (64 cores).
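The per-batch speedup can be derived directly from the reported timings. The short script below does only that arithmetic: the numbers are copied from the table, and the names `timings` and `speedups` are illustrative.

```python
# (SVE seconds, OSS seconds) per batch size, copied from the table above.
timings = {
    32: (16.18, 53.42),
    64: (32.24, 96.36),
    128: (62.04, 219.32),
    256: (119.86, 407.42),
    512: (235.53, 843.59),
    1024: (466.49, 1711.95),
}


def speedups(table):
    """Return OSS time divided by SVE time for each batch size."""
    return {batch: oss / sve for batch, (sve, oss) in table.items()}


ratios = speedups(timings)
for batch, ratio in ratios.items():
    print(f"batch {batch:>4}: {ratio:.2f}x")
```

On these numbers the ratio ranges from about 2.99x (batch size 64) to about 3.67x (batch size 1024).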
Co-authored by @hrushitfujitsu and @abhijain1204fujitsu

@danieldk
Member

Thanks a lot for making a kernel and contributing to the kernels ecosystem 🤗! This kernel looks like awesome work.

The scope of the kernels-community repo is very narrow. It is for two types of kernels:

  • Kernels developed by Hugging Face and partners.
  • Kernels developed by third parties, that have not been 'kernelized' yet, but are used by Hugging Face projects such as diffusers and transformers.

We encourage kernel developers to make kernels available through their own GitHub repositories and upload them to their own organization or user on the Hugging Face Hub. In that way, you also get all the credits for your work as a kernel developer.

If this kernel should be integrated into the mamba-ssm kernel through a partnership with Hugging Face, it would be best to discuss it first in a shared Slack channel (I am not sure whether we have one; I couldn't find it).
