From eedb3a337b46a2bc794c08bdc80fdddb6d16c92b Mon Sep 17 00:00:00 2001
From: ZhangShaolei <2512857469@qq.com>
Date: Fri, 10 Jan 2025 13:09:27 +0800
Subject: [PATCH] Add LLaVA-Mini

LLaVA-Mini is a unified large multimodal model that supports efficient
understanding of images, high-resolution images, and videos.

Paper: https://arxiv.org/abs/2501.03895
Code & Demo: https://github.com/ictnlp/LLaVA-Mini
---
 README.md | 1 +
 1 file changed, 1 insertion(+)

diff --git a/README.md b/README.md
index 2b6d237b..ae3f0ea2 100644
--- a/README.md
+++ b/README.md
@@ -94,6 +94,7 @@ A speech-to-speech dialogue model with both low-latency and high intelligence wh
 ## Multimodal Instruction Tuning
 | Title | Venue | Date | Code | Demo |
 |:--------|:--------:|:--------:|:--------:|:--------:|
+| ![Star](https://img.shields.io/github/stars/ICTNLP/LLaVA-Mini.svg?style=social&label=Star)<br>[**LLaVA-Mini: Efficient Image and Video Large Multimodal Models with One Vision Token**](https://arxiv.org/pdf/2501.03895)<br> | arXiv | 2025-01-03 | [Github](https://github.com/ictnlp/LLaVA-Mini) | Local Demo |
 | ![Star](https://img.shields.io/github/stars/VITA-MLLM/VITA.svg?style=social&label=Star)<br>[**VITA-1.5: Towards GPT-4o Level Real-Time Vision and Speech Interaction**](https://arxiv.org/pdf/2501.01957)<br> | arXiv | 2025-01-03 | [Github](https://github.com/VITA-MLLM/VITA) | - |
 | ![Star](https://img.shields.io/github/stars/QwenLM/Qwen2-VL.svg?style=social&label=Star)<br>[**QVQ: To See the World with Wisdom**](https://qwenlm.github.io/blog/qvq-72b-preview/)<br> | Qwen | 2024-12-25 | [Github](https://github.com/QwenLM/Qwen2-VL) | [Demo](https://qwenlm.github.io/blog/qvq-72b-preview/) |
 | ![Star](https://img.shields.io/github/stars/deepseek-ai/DeepSeek-VL2.svg?style=social&label=Star)<br>[**DeepSeek-VL2: Mixture-of-Experts Vision-Language Models for Advanced Multimodal Understanding**](https://arxiv.org/pdf/2412.10302)<br> | arXiv | 2024-12-13 | [Github](https://github.com/deepseek-ai/DeepSeek-VL2) | - |