YOLO-Master-v0.0

Pre-release

@isLinXu isLinXu released this 31 Dec 13:00
· 108 commits to main since this release

YOLO-Master: MOE-Accelerated with Specialized Transformers for Enhanced Real-time Detection.

We are excited to announce the first official release of YOLO-Master, a novel YOLO architecture integrated with Mixture-of-Experts (MoE). This release brings significant improvements in accuracy-latency trade-offs, specifically targeting real-time object detection and segmentation tasks.

"Adaptive Intelligence for Every Scene" — YOLO-Master introduces instance-conditional computation, dynamically allocating resources where they are needed most.

🚀 Key Highlights

  • MoE Architecture Integration: Native support for Mixture-of-Experts with dynamic routing.
  • SOTA Performance: Achieves 42.4% mAP on COCO at 1.62ms latency (N-scale), a higher mAP than YOLOv10-N, YOLOv11-N, and YOLOv12-N at comparable latency.
  • Segmentation Breakthrough: +2.8% mask mAP gain over YOLOv12-seg-N.
  • Hardware-Aware Optimization: Optimized for GPU (Batched Compute) and Mobile (Ghost Experts).

🛠 New Features in v0.0

1. Advanced MoE Modules

  • ModularRouterExpertMoE (Recommended): A highly stable, plug-and-play MoE block featuring:
    • Shared Experts: Ensures baseline performance and prevents training collapse.
    • Z-Loss Integration: Stabilizes router logits for smoother convergence.
  • UltraOptimizedMoE: Designed for extreme speed. Its Batched Expert Computation eliminates Python loops, delivering a 3-5x inference speedup on GPUs.
  • GhostExpert: Parameter-efficient experts based on GhostNet, reducing memory bandwidth pressure for mobile/edge deployment.
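The shared-expert path and the router z-loss mentioned above can be illustrated with a small, dependency-free sketch. It is not the YOLO-Master implementation: `moe_forward`, `router_z_loss`, and the scalar toy experts are hypothetical names chosen for illustration, and the z-loss is assumed to be the commonly used mean squared log-sum-exp of the router logits.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def router_z_loss(batch_logits):
    """Mean squared log-sum-exp of router logits (assumed z-loss form)."""
    total = 0.0
    for logits in batch_logits:
        m = max(logits)
        lse = m + math.log(sum(math.exp(v - m) for v in logits))
        total += lse ** 2
    return total / len(batch_logits)

def moe_forward(x, logits, routed_experts, shared_expert, top_k=2):
    """Shared expert output plus a gate-weighted sum of the top-k routed experts."""
    probs = softmax(logits)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:top_k]
    norm = sum(probs[i] for i in top)  # renormalize gates over the selected experts
    out = shared_expert(x)             # always-on path, independent of the router
    for i in top:
        out += (probs[i] / norm) * routed_experts[i](x)
    return out, top

# Toy scalar "experts" standing in for real network blocks (hypothetical).
experts = [lambda x: 1.0 * x, lambda x: 2.0 * x, lambda x: 3.0 * x, lambda x: 4.0 * x]
shared = lambda x: 0.5 * x

y, chosen = moe_forward(1.0, [0.0, 2.0, 0.0, 2.0], experts, shared, top_k=2)
print("selected experts:", chosen, "output:", y)
print("z-loss for uniform zero logits:", router_z_loss([[0.0, 0.0, 0.0, 0.0]]))
```

Because the shared expert contributes regardless of the gate values, the block still produces a usable output even when the router is poorly trained, which is one plausible reading of how it guards against training collapse.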

2. Intelligent Routing

  • EfficientSpatialRouter: Reduces routing FLOPs by >90% via spatial pre-pooling.
  • DynamicRouting: Adaptive computational resource allocation based on scene complexity.
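To see why spatial pre-pooling can cut routing cost by more than 90%, consider a back-of-the-envelope count. The sketch below assumes the router is a 1x1 convolution producing one logit per expert at every spatial position, and that pooling happens before it; the feature-map size, channel count, expert count, and stride are illustrative values, not figures from this release. The cost of the pooling itself (additions only) is ignored.

```python
def router_macs(height, width, channels, num_experts, pool_stride=1):
    """Multiply-accumulates of a 1x1-conv router applied after pooling by pool_stride."""
    pooled_h = height // pool_stride
    pooled_w = width // pool_stride
    return pooled_h * pooled_w * channels * num_experts

# Hypothetical 80x80 feature map with 64 channels routed to 4 experts.
dense = router_macs(80, 80, 64, 4)
pooled = router_macs(80, 80, 64, 4, pool_stride=4)
reduction = 1.0 - pooled / dense

print(f"dense={dense} MACs, pooled={pooled} MACs, reduction={reduction:.2%}")
```

Under these assumptions the router cost scales with the inverse square of the pooling stride, so a stride of 4 already removes 93.75% of the routing multiply-accumulates.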

3. Stability & Ease of Use

  • Training Stability: Addresses common MoE training instability with Shared Expert paths and specialized router initialization (std=0.01).
  • Deployment Ready: Full support for ONNX and TensorRT export.
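The effect of a small router initialization can be checked numerically. The following sketch assumes a linear router whose weights are drawn from a zero-mean Gaussian; `init_router` and `route_probs` are hypothetical helpers, and only the std=0.01 value comes from the notes above. With such small weights the initial logits are close to zero, so every expert starts with a roughly equal share of the routing probability.

```python
import math
import random

def init_router(in_features, num_experts, std=0.01, seed=0):
    """Router weight matrix with entries drawn from N(0, std^2)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, std) for _ in range(in_features)] for _ in range(num_experts)]

def route_probs(weights, features):
    """Softmax over the linear router logits for one feature vector."""
    logits = [sum(w * f for w, f in zip(row, features)) for row in weights]
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical 64-dimensional input feature with values in [-1, 1].
rng = random.Random(1)
features = [rng.uniform(-1.0, 1.0) for _ in range(64)]

small = route_probs(init_router(64, 4, std=0.01), features)
large = route_probs(init_router(64, 4, std=1.0), features)

print("std=0.01:", [round(p, 3) for p in small])
print("std=1.0: ", [round(p, 3) for p in large])
```

A larger standard deviation scales the same logits up and makes the initial routing more peaked, which is a common explanation for why some experts receive little gradient early in training.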

📊 Benchmarks (COCO val2017)

Model            Size (px)   mAP (box, %)   Latency (ms)   Comparison
YOLOv10-N        640         38.5           1.84           -
YOLOv11-N        640         39.4           1.50           -
YOLOv12-N        640         40.6           1.64           -
YOLO-Master-N    640         42.4           1.62           SOTA 🏆
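The differences between YOLO-Master-N and each baseline follow directly from the table. The short script below only restates the published numbers and computes the deltas; a positive latency delta means YOLO-Master-N is slower than that baseline.

```python
# (mAP box in %, latency in ms) at 640 px, copied from the benchmark table above.
benchmarks = {
    "YOLOv10-N": (38.5, 1.84),
    "YOLOv11-N": (39.4, 1.50),
    "YOLOv12-N": (40.6, 1.64),
    "YOLO-Master-N": (42.4, 1.62),
}

ref_map, ref_latency = benchmarks["YOLO-Master-N"]
deltas = {}
for name, (map_box, latency) in benchmarks.items():
    if name == "YOLO-Master-N":
        continue
    # Round to the precision reported in the table.
    deltas[name] = (round(ref_map - map_box, 1), round(ref_latency - latency, 2))
    print(f"vs {name}: {deltas[name][0]:+.1f} mAP, {deltas[name][1]:+.2f} ms")
```

The gain ranges from +1.8 mAP over YOLOv12-N to +3.9 mAP over YOLOv10-N. YOLO-Master-N is faster than YOLOv10-N and YOLOv12-N but 0.12 ms slower than YOLOv11-N.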

🔗 Resources

📥 Quick Start

Installation

git clone https://github.com/isLinXu/YOLO-Master.git
cd YOLO-Master
pip install -r requirements.txt
pip install -e .

Training (Single GPU)

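from ultralytics import YOLO

# Build the N-scale detection model from its YAML definition
model = YOLO('cfg/models/master/v0/det/yolo-master-n.yaml')
# Train on COCO for 100 epochs at 640x640 input resolution
results = model.train(data='coco.yaml', epochs=100, imgsz=640)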

🤝 Contributors

Special thanks to the research team at Tencent Youtu Lab and Singapore Management University.