
MLLM Auto-Labeling for Images & Videos

🌐 View Project Website | 📚 Documentation | 💬 Issues

Leverage Multimodal Large Language Models (MLLMs) to automatically label your image and video datasets.

Generate high-quality bounding box annotations using state-of-the-art vision-language models such as GPT-4V, Claude 3.5 Sonnet, and Qwen-VL. Cut annotation time by 80% or more while maintaining 85-95% accuracy.
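
Under the hood, auto-labeling amounts to sending an image to an MLLM with a prompt that asks for structured detections, then parsing the reply. Below is a minimal standalone sketch using the Anthropic Python SDK; the helper name, prompt, JSON schema, and model ID are illustrative, and the repository's scripts may prompt and parse differently:

import anthropic, base64, json

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

def label_image(path, classes):
    # Hypothetical helper: ask the model for bounding boxes as a JSON array.
    with open(path, "rb") as f:
        image_b64 = base64.standard_b64encode(f.read()).decode()
    prompt = (
        "Detect every instance of these classes: " + ", ".join(classes) + ". "
        'Reply with a JSON array only, each item {"label": str, "box": [x1, y1, x2, y2]} '
        "in pixel coordinates."
    )
    resp = client.messages.create(
        model="claude-3-5-sonnet-20241022",   # illustrative model ID
        max_tokens=1024,
        messages=[{
            "role": "user",
            "content": [
                {"type": "image",
                 "source": {"type": "base64", "media_type": "image/jpeg", "data": image_b64}},
                {"type": "text", "text": prompt},
            ],
        }],
    )
    # If the model wraps the JSON in extra prose, this parse needs to be more defensive.
    return json.loads(resp.content[0].text)

print(label_image("your-image.jpg", ["car", "pedestrian", "traffic sign"]))

With Qwen-VL the flow is the same idea; only the client, model name, and response parsing change.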

Integrate with Label Studio for human review and collaborative annotation workflows.
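
For the Label Studio handoff, model output can be written out as pre-annotations ("predictions") that reviewers accept or correct. A rough sketch of that conversion, assuming pixel-space boxes and a rectangle-labeling config; the from_name/to_name values, model version, and file paths below are illustrative:

import json

def to_label_studio_task(image_url, detections, img_w, img_h):
    # Wrap pixel-space boxes as one Label Studio task with a "predictions" block.
    # Label Studio expects x/y/width/height as percentages of the image size, and
    # from_name/to_name must match the names in your labeling config template.
    results = []
    for det in detections:
        x1, y1, x2, y2 = det["box"]
        results.append({
            "from_name": "label",
            "to_name": "image",
            "type": "rectanglelabels",
            "value": {
                "x": 100 * x1 / img_w,
                "y": 100 * y1 / img_h,
                "width": 100 * (x2 - x1) / img_w,
                "height": 100 * (y2 - y1) / img_h,
                "rectanglelabels": [det["label"]],
            },
        })
    return {"data": {"image": image_url},
            "predictions": [{"model_version": "qwen-vl", "result": results}]}

detections = [{"label": "car", "box": [120, 80, 640, 420]}]   # example model output
task = to_label_studio_task("your-image.jpg", detections, 1920, 1080)
with open("labels/tasks.json", "w") as f:
    json.dump([task], f, indent=2)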


📁 Project Structure

video-autolabeling-pipeline/
├── README.md              # Main documentation
├── LICENSE                # Open source license
├── requirements.txt       # Python dependencies
├── docs/                  # 📚 Documentation
│   ├── 快速开始.md        # Quick start guide (Chinese)
│   ├── QWEN_GUIDE.md      # Qwen-VL detailed guide
│   ├── AUTO_LABELING_GUIDE.md  # VLM auto-labeling guide
│   └── YOLO_GUIDE.md      # YOLO local labeling guide
├── scripts/               # 🔧 Core scripts
│   ├── image_auto_labeling.py     # Image auto-labeling
│   ├── video_auto_labeling.py     # Video auto-labeling
│   ├── yolo_auto_labeling.py      # YOLO labeling
│   ├── quick_yolo_label.sh        # YOLO quick labeling script
│   ├── visualize_result.py        # Visualize labeling results
│   ├── test_qwen_api.py           # Test Qwen API
│   └── start_label_studio.sh      # Start Label Studio
├── templates/             # 📋 Labeling templates
├── data/                  # 📹 Data files (examples)
└── labels/                # 🏷️ Labeling results (output)

🎓 Getting Started

First-time user? → See QUICKSTART.md for a complete tutorial (about 10 minutes of setup)


🚀 Quick Start

Test with an image (fastest):

# 1. Set API Key (choose one)
export DASHSCOPE_API_KEY="your-qwen-key"       # Qwen (recommended for users in China)
export ANTHROPIC_API_KEY="your-claude-key"     # Claude (recommended for international users)

# 2. Label an image
python3 scripts/image_auto_labeling.py your-image.jpg --provider qwen --visualize

# View the labeled result with bounding boxes instantly!

💡 For detailed steps, see QUICKSTART.md or 快速开始.md (Chinese)
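
If you want to inspect results outside the provided --visualize flag and scripts/visualize_result.py, drawing the boxes yourself takes a few lines of OpenCV. A rough sketch, assuming labels were saved as a JSON list of pixel-space boxes (the actual schema written by the scripts may differ, so adjust the keys to match):

import json
import cv2   # pip install opencv-python

def draw_labels(image_path, label_path, out_path):
    # Draw each box and class name onto the image and save an annotated copy.
    img = cv2.imread(image_path)
    with open(label_path) as f:
        detections = json.load(f)   # assumed format: [{"label": "car", "box": [x1, y1, x2, y2]}, ...]
    for det in detections:
        x1, y1, x2, y2 = map(int, det["box"])
        cv2.rectangle(img, (x1, y1), (x2, y2), (0, 255, 0), 2)
        cv2.putText(img, det["label"], (x1, max(y1 - 5, 10)),
                    cv2.FONT_HERSHEY_SIMPLEX, 0.6, (0, 255, 0), 2)
    cv2.imwrite(out_path, img)

draw_labels("your-image.jpg", "labels/your-image.json", "labels/your-image_labeled.jpg")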


✨ Key Features

  • 🤖 MLLM Auto-Labeling: Leverage GPT-4V, Claude 3.5 Sonnet, and Qwen-VL to auto-generate labels
  • 📹 Video Frame Labeling: Smart frame-sampling strategies for efficient video annotation (see the sketch after this list)
  • 🖼️ Image Object Detection: Single-shot bounding box generation for images
  • Save 80%+ Time: AI generates initial labels; humans only review and refine
  • 🎯 High Accuracy: Claude 90-95%, Qwen 85-90%, YOLO 80-85%
  • 🌐 China-Friendly: Qwen-VL support, no VPN required
  • 🔧 Production Ready: Label Studio integration, batch processing, visualization tools
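
The "Video Frame Labeling" item above comes down to choosing which frames are worth sending to the model. A minimal sketch of one such strategy, uniform sampling with OpenCV; the paths are illustrative, and the repository's video_auto_labeling.py may use a smarter scheme:

import os
import cv2   # pip install opencv-python

def sample_frames(video_path, num_frames=16):
    # Grab `num_frames` evenly spaced frames so only a fraction of the video
    # is labeled. Uniform sampling is just one possible strategy.
    cap = cv2.VideoCapture(video_path)
    total = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
    step = max(total // num_frames, 1)
    frames = []
    for idx in range(0, total, step):
        cap.set(cv2.CAP_PROP_POS_FRAMES, idx)
        ok, frame = cap.read()
        if ok:
            frames.append((idx, frame))
    cap.release()
    return frames

os.makedirs("data/frames", exist_ok=True)
for idx, frame in sample_frames("data/example.mp4", num_frames=8):   # hypothetical input video
    cv2.imwrite(f"data/frames/frame_{idx:06d}.jpg", frame)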

📚 Documentation

For Beginners:

  • QUICKSTART.md: Complete setup tutorial
  • docs/快速开始.md: Quick start guide (Chinese)

Advanced Guides:

  • docs/QWEN_GUIDE.md: Qwen-VL detailed guide
  • docs/AUTO_LABELING_GUIDE.md: VLM auto-labeling guide
  • docs/YOLO_GUIDE.md: YOLO local labeling guide


🎯 Use Cases

  • 🚗 Autonomous Driving: Vehicle, pedestrian, traffic sign detection
  • 🏭 Industrial QA: Defect detection, product classification
  • 🏥 Medical Imaging: Lesion annotation, organ segmentation
  • 📦 E-commerce: Product recognition, shelf monitoring
  • 🎥 Video Analytics: Action recognition, object tracking

💡 Contributing

Issues and pull requests are welcome! If you have questions or suggestions, please open an issue on GitHub.

📄 License

This project is open source; see the LICENSE file for the license terms.
