@@ -51,7 +51,7 @@ The results of the ImageNet linear probe are as follows:
 More details about MLCD-Embodied can be found in the [MLCD-Embodied.md](mlcd_vl/README.md) file.
 
 
-### 1. General Ability Evaluation: Comparison with LLaVA OneVision-7B and GPT-4
+### Comparison with LLaVA OneVision-7B and GPT-4
 
 | Dataset | Split | MLCD-Embodied-7B | LLaVA OneVision-7B | GPT-4v | GPT-4o |
 | :------------- | :---: | :-----------------: | :----------------: | :------: | :----: |
@@ -68,48 +68,6 @@ More details about MLCD-Embodied can be found in the [MLCD-Embodied.md](mlcd_vl/
 | MME | test | 578/1603 | 418/1580 | 517/1409 | - |
 
 
-
-### 2. Usage
-
-
-```bash
-git clone https://github.com/deepglint/unicom
-cd unicom
-pip install --upgrade pip
-pip install -e ".[train]"
-pip install flash-attn --no-build-isolation
-
-CUDA_VISIBLE_DEVICES=0 python infer.py --model_dir DeepGlint-AI/MLCD-Embodied-7B
-
-# example:
-# >> Enter 'exit' to end the conversation, 'reset' to clear the chat history.
-# >> Enter image file paths (comma-separated): ./_static/images/logo.png
-# >> User: <image>What kind of animal is it in this picture?
-# >> Assistant: The image features a stylized representation of a cat, characterized by its vibrant and abstract depiction.
-# >> User: What color is this cat?
-# >> Assistant: The cat in the image is primarily white with blue, orange and pink accents, creating a visually appealing and unique appearance.
-# >> User: <image>请你介绍一下这个图片
-# >> Assistant: 这是一幅充满创意的猫头艺术作品。它采用了多色渐变和抽象风格,将猫的头部描绘成一个充满活力和色彩的视觉冲击。猫的眼睛用金色渲染,显得非常有神采,
-# 而粉色的鼻子则增添了一丝可爱感。整体设计融合了现代艺术与传统猫头图案,创造出一种既独特又引人入胜的视觉效果。。
-```
-### 3. Eval
-```
-pip install lmms-eval==0.2.0
-PYTHONPATH=./ CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 python -m accelerate.commands.launch \
-    --main_process_port=12444 \
-    --num_processes=8 \
-    -m lmms_eval \
-    --model llava \
-    --model_args pretrained=DeepGlint-AI/MLCD-Embodied-7B,conv_template=qwen_1_5 \
-    --tasks mme \
-    --batch_size 1 \
-    --log_samples \
-    --log_samples_suffix mlcd \
-    --output_path ./eval_log/
-```
-
-
-
 ## Multi-Label Cluster Discrimination (MLCD)
 <a name="multi-label-cluster-discrimination-mlcd"></a>
 [![Arxiv](https://img.shields.io/badge/arXiv-2407.17331-red)](https://arxiv.org/abs/2407.17331) [![Hugging Face](https://img.shields.io/badge/Hugging%20Face-Model-yellow)](https://huggingface.co/DeepGlint-AI/mlcd-vit-large-patch14-336)