This repository provides the hardware plugin that enables vLLM on RBLN NPUs, including ATOM and REBEL.
Built on vLLM’s Plugin System, it integrates with the vLLM ecosystem and provides high-throughput, low-latency LLM serving on RBLN hardware. The plugin supports a wide range of popular LLMs, and its coverage continues to expand toward the full set of features available in vLLM, including advanced attention mechanisms.
- rebel-compiler
- optimum-rbln
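One way to see whether `rebel-compiler` and `optimum-rbln` are already present is to query the installed package metadata. This is a minimal sketch that assumes the pip distribution names match the names listed above; `installed_versions` is a hypothetical helper, not something this repository provides.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Dict, Iterable, Optional

# Assumption: these are the pip distribution names of the packages listed above.
PACKAGES = ("rebel-compiler", "optimum-rbln")


def installed_versions(names: Iterable[str] = PACKAGES) -> Dict[str, Optional[str]]:
    """Return the installed version of each package, or None if it is not installed."""
    found: Dict[str, Optional[str]] = {}
    for name in names:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


if __name__ == "__main__":
    for name, ver in installed_versions().items():
        print(f"{name}: {ver if ver is not None else 'not installed'}")
```

The same helper can be called with `("vllm-rbln",)` after installation to confirm that the plugin itself is visible to the interpreter.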
You can install this project using pip or from source.
Using pip:

    pip install vllm-rbln

From source:

    git clone https://github.com/rebellions-sw/vllm-rbln.git
    cd vllm-rbln
    pip install -e .

We welcome all contributions! Whether it's reporting issues, proposing enhancements, or improving docs, your input helps make the project better.
See our CONTRIBUTING.md for more information.
This project is licensed under the Apache License 2.0.
See the LICENSE file for more information.
- Join discussions and get answers in our Developer Community
- Contact maintainers at [email protected]