
# AMD-Hybrid-Models

License: Apache 2.0 · arXiv:2503.11132 · arXiv:2505.17272

## 🔍 Overview: Efficient Hybrid Language Models on AMD GPUs

Official Repository for X-EcoMLA and Zebra-Llama

Welcome! This repo hosts two complementary projects focused on memory-efficient, high-performance large language models (LLMs). LLMs often face major memory bottlenecks during inference due to large key-value (KV) caches (a back-of-envelope sketch of the cost follows the table below). This repository introduces two solutions:

| Folder | Description |
|--------|-------------|
| `x-eco-mla/` | Implements **X-EcoMLA**: a method for upcycling pre-trained attention into Multi-head Latent Attention (MLA) for extreme KV cache compression. |
| `zebra-llama/` | Implements **Zebra-Llama**: a family of hybrid MLA + Mamba2 models with minimal retraining and maximum efficiency. |
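To see why the KV cache dominates inference memory, here is a rough back-of-envelope calculation. The model shape (32 layers, 8 KV heads, head dim 128, fp16) is an assumed Llama-3-8B-like configuration for illustration, not a number taken from this repo:

```python
# Back-of-envelope KV-cache size for an assumed Llama-3-8B-like config:
# 32 layers, 8 KV heads (GQA), head_dim 128, fp16 (2 bytes per element).
layers, kv_heads, head_dim, bytes_per_elem = 32, 8, 128, 2
seq_len, batch = 8192, 8

# K and V each store (seq_len, kv_heads, head_dim) per layer per sequence.
cache_bytes = 2 * layers * kv_heads * head_dim * bytes_per_elem * seq_len * batch
print(f"KV cache: {cache_bytes / 2**30:.1f} GiB")  # -> 8.0 GiB
```

At 8K context and batch size 8, the cache alone already consumes about 8 GiB, which is why both projects attack KV cache size directly.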
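Below is a minimal sketch of the latent-KV idea that MLA-style methods such as X-EcoMLA build on: cache one small latent vector per token and reconstruct keys and values from it at attention time. This is an illustrative single-head module with assumed dimensions (`d_model`, `d_latent`), not the repo's actual implementation; causal masking and RoPE are omitted for brevity:

```python
import torch
import torch.nn as nn

class LatentKVAttention(nn.Module):
    """Single-head attention that caches a small latent c_kv instead of
    full K/V tensors -- an illustrative sketch of the MLA-style idea."""

    def __init__(self, d_model=4096, d_latent=512):
        super().__init__()
        self.q_proj = nn.Linear(d_model, d_model, bias=False)
        # Down-project hidden states to a shared latent; only this is cached.
        self.kv_down = nn.Linear(d_model, d_latent, bias=False)
        # Up-project the cached latent back into keys and values on the fly.
        self.k_up = nn.Linear(d_latent, d_model, bias=False)
        self.v_up = nn.Linear(d_latent, d_model, bias=False)
        self.o_proj = nn.Linear(d_model, d_model, bias=False)

    def forward(self, x, latent_cache=None):
        # x: (batch, seq, d_model)
        q = self.q_proj(x)
        c_kv = self.kv_down(x)                       # (batch, seq, d_latent)
        if latent_cache is not None:
            c_kv = torch.cat([latent_cache, c_kv], dim=1)
        k, v = self.k_up(c_kv), self.v_up(c_kv)      # reconstructed K and V
        attn = torch.softmax(q @ k.transpose(-1, -2) / q.shape[-1] ** 0.5, dim=-1)
        return self.o_proj(attn @ v), c_kv           # cache c_kv, not K and V
```

Caching `c_kv` (`d_latent` floats per token) instead of full K and V (`2 * d_model` floats per token) is where the compression comes from; with the assumed sizes above, that is a 16x smaller cache.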

## Citation

If you find this repository useful in your research or applications, please cite our papers:

@article{li2025x_ecomla,
  title={{X-EcoMLA}: Upcycling Pre-Trained Attention into {MLA} for Efficient and Extreme {KV} Compression},
  author={Li, Guihong and Rezagholizadeh, Mehdi and Yang, Mingyu and Appia, Vikram and Barsoum, Emad},
  journal={arXiv preprint arXiv:2503.11132},
  year={2025},
  url={https://arxiv.org/abs/2503.11132}
}

@article{yang2025zebra,
  title={{Zebra-Llama}: Towards Extremely Efficient Hybrid Models},
  author={Yang, Mingyu and Rezagholizadeh, Mehdi and Li, Guihong and Appia, Vikram and Barsoum, Emad},
  journal={arXiv preprint arXiv:2505.17272},
  year={2025},
  url={https://arxiv.org/abs/2505.17272}
}

## 🤝 Contributing

We welcome contributions! Please open an issue to ask questions or to discuss major changes.

## License

This project is licensed under the Apache License 2.0. See the LICENSE file for details.
