| Author | Petr Babkin |
| Advisor | Oleg Bakhteev, PhD |
The Mixture-of-Experts (MoE) layer, a sparsely activated neural architecture controlled by a routing mechanism, has recently achieved remarkable success across large-scale deep learning tasks. In parallel, Neural Architecture Search (NAS) has emerged as a powerful methodology for automatically discovering high-performing neural network architectures. However, the application of NAS methods to MoE architectures remains an underexplored research area. In this work, we propose an architecture search framework for MoE models that explicitly leverages the underlying cluster structure of the data. We evaluate the proposed approach on computer vision benchmarks and demonstrate that it outperforms baseline MoE architectures trained on the same datasets in terms of both accuracy and computational efficiency.
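For readers unfamiliar with sparse activation, the sketch below shows a generic top-k MoE layer: a learned router scores each expert per input, and only the k highest-scoring experts are evaluated. This is a minimal NumPy illustration of the general mechanism, not the architecture searched in this work; all names and shapes here are illustrative assumptions.

```python
# Minimal sketch of a sparsely activated Mixture-of-Experts layer.
# Illustrative only: experts are single linear maps, routing is top-k.
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

class MoELayer:
    def __init__(self, dim, n_experts, top_k=2, seed=0):
        rng = np.random.default_rng(seed)
        self.top_k = top_k
        # Each "expert" is a single linear map here for brevity.
        self.experts = [rng.standard_normal((dim, dim)) / np.sqrt(dim)
                        for _ in range(n_experts)]
        # Router: a linear gate producing one logit per expert.
        self.gate = rng.standard_normal((dim, n_experts)) / np.sqrt(dim)

    def __call__(self, x):
        logits = x @ self.gate                       # (batch, n_experts)
        probs = softmax(logits)
        top = np.argsort(-probs, axis=-1)[:, :self.top_k]
        out = np.zeros_like(x)
        for i, chosen in enumerate(top):
            w = probs[i, chosen]
            w = w / w.sum()                          # renormalize over chosen experts
            for weight, e in zip(w, chosen):         # only k experts are computed
                out[i] += weight * (x[i] @ self.experts[e])
        return out

x = np.random.default_rng(1).standard_normal((4, 8))
layer = MoELayer(dim=8, n_experts=4, top_k=2)
y = layer(x)
print(y.shape)  # (4, 8)
```

Because only `top_k` of the `n_experts` linear maps run per input, compute stays roughly constant as the expert count grows, which is the property that makes MoE attractive at scale.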
If you find our work helpful, please cite us:
```bibtex
@article{babkin2025structure,
  title={Title},
  author={Petr Babkin and Oleg Bakhteev},
  year={2025}
}
```

Our project is MIT licensed. See LICENSE for details.