Commit 6a4c850

Update README.md (#144)
1 parent 44abf16 commit 6a4c850

File tree

1 file changed

+77
-17
lines changed

mlcd_vl/downstream/README.md

Lines changed: 77 additions & 17 deletions
@@ -1,4 +1,17 @@
1-
[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/multi-label-cluster-discrimination-for-visual/referring-expression-segmentation-on-refcocog)](https://paperswithcode.com/sota/referring-expression-segmentation-on-refcocog?p=multi-label-cluster-discrimination-for-visual)
1+
<a href="https://arxiv.org/pdf/2407.17331"><img src="https://img.shields.io/badge/arXiv-2407.17331-b31b1b" alt="arXiv"></a>
2+
<a href='https://huggingface.co/DeepGlint-AI/MLCD-Seg'><img src='https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-Model-green'></a>
3+
</div>
4+
5+
6+
7+
## Example:
8+
9+
![output](https://github.com/user-attachments/assets/85c023a1-3e0c-4ea5-a764-1eb9ee0fbddf)
10+
11+
12+
## RefCOCO Segmentation Evaluation Results:
13+
14+
[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/multi-label-cluster-discrimination-for-visual/referring-expression-segmentation-on-refcocog)](https://paperswithcode.com/sota/referring-expression-segmentation-on-refcocog?p=multi-label-cluster-discrimination-for-visual)
215
[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/multi-label-cluster-discrimination-for-visual/referring-expression-segmentation-on-refcoco-5)](https://paperswithcode.com/sota/referring-expression-segmentation-on-refcoco-5?p=multi-label-cluster-discrimination-for-visual)
316
[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/multi-label-cluster-discrimination-for-visual/referring-expression-segmentation-on-refcoco-3)](https://paperswithcode.com/sota/referring-expression-segmentation-on-refcoco-3?p=multi-label-cluster-discrimination-for-visual)
417
[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/multi-label-cluster-discrimination-for-visual/referring-expression-segmentation-on-refcocog-1)](https://paperswithcode.com/sota/referring-expression-segmentation-on-refcocog-1?p=multi-label-cluster-discrimination-for-visual)
@@ -9,13 +22,6 @@
922
[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/multi-label-cluster-discrimination-for-visual/referring-expression-segmentation-on-refcoco)](https://paperswithcode.com/sota/referring-expression-segmentation-on-refcoco?p=multi-label-cluster-discrimination-for-visual)
1023

1124

12-
# MLCD-Seg
13-
[![Hugging Face](https://img.shields.io/badge/Hugging%20Face-MLCD_SEG_Model-yellow)](https://huggingface.co/DeepGlint-AI/MLCD-Seg-7B)
14-
15-
This repository studies the end-to-end application of multimodal large models to downstream tasks. At present, the segmentation component achieves strong results on referring expression segmentation.
16-
17-
18-
## RefCOCO Segmentation Evaluation:
1925

2026
| Dataset | Split | MLCD-seg-7B | EVF-SAM | GLaMM | VisionLLM v2| LISA |
2127
| :-- | :-: | :-: | :-: | :-: | :-: | :-: |
@@ -28,20 +34,74 @@ This repository is dedicated to researching the application of multimodal large
2834
| RefCOCOg | val | **79.7** | 78.2 | 74.2 | 73.3 | 67.9 |
2935
| RefCOCOg | test | **80.5** | 78.3 | 74.9 | 74.8 | 70.6 |
3036

31-
---
32-
## Evaluation
33-
Install the evaluation tool and execute the evaluation script:
37+
## How to use:
38+
39+
For basic inference, refer to the sample below:
40+
```python
import torch
from PIL import Image
from transformers import AutoModel, AutoTokenizer

model_path = "DeepGlint-AI/MLCD-Seg"  # or use your local path
mlcd_seg = AutoModel.from_pretrained(
    model_path,
    torch_dtype=torch.float16,
    trust_remote_code=True
).cuda()
tokenizer = AutoTokenizer.from_pretrained(model_path, use_fast=False)
# Load the example image (replace with the path to your own image)
seg_img = Image.open("asserts/example.jpg").convert('RGB')
seg_prompt = "Could you provide a segmentation mask for the right giraffe in this image?"
# Run segmentation with the model loaded above
pred_mask = mlcd_seg.seg(seg_img, seg_prompt, tokenizer, force_seg=False)
```
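To inspect a prediction visually, the mask can be blended over the input image. The README does not document the return type of `seg`, so the sketch below assumes the mask can be converted to a 2-D array with the same height and width as the image, where nonzero values mark the object. `overlay_mask` is a hypothetical helper, not part of MLCD-Seg, and it is demonstrated on synthetic arrays rather than a real prediction.

```python
import numpy as np


def overlay_mask(image, mask, color=(255, 0, 0), alpha=0.5):
    """Blend `color` into `image` wherever `mask` is nonzero.

    image: (H, W, 3) uint8 array; mask: (H, W) array.
    Returns a new (H, W, 3) uint8 array; the inputs are not modified.
    """
    image = np.asarray(image, dtype=np.float32)
    mask = np.asarray(mask).astype(bool)
    if image.shape[:2] != mask.shape:
        raise ValueError("mask and image must share height and width")
    out = image.copy()
    tint = np.asarray(color, dtype=np.float32)
    # Alpha-blend only the masked pixels; all others keep their value.
    out[mask] = (1.0 - alpha) * image[mask] + alpha * tint
    return out.round().astype(np.uint8)


# Synthetic stand-ins for the image and the predicted mask (assumption).
demo_image = np.zeros((4, 4, 3), dtype=np.uint8)
demo_mask = np.zeros((4, 4), dtype=np.uint8)
demo_mask[1:3, 1:3] = 1
blended = overlay_mask(demo_image, demo_mask)
```

With a real prediction, the image would first be converted with `np.array(seg_img)`. If the mask turns out to be a PyTorch tensor, it would likely need `.cpu().numpy()` before being passed in; that is an assumption to verify against the model's actual output.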
53+
54+
To evaluate on a benchmark dataset (e.g. RefCOCO), call `seg` with `force_seg=True`:
55+
```python
import torch
from PIL import Image
from transformers import AutoModel, AutoTokenizer

model_path = "DeepGlint-AI/MLCD-Seg"  # or use your local path
mlcd_seg = AutoModel.from_pretrained(
    model_path,
    torch_dtype=torch.float16,
    trust_remote_code=True
).cuda()
tokenizer = AutoTokenizer.from_pretrained(model_path, use_fast=False)
# Load the example image (replace with the path to your own image)
seg_img = Image.open("asserts/example.jpg").convert('RGB')
seg_prompt = "Could you provide a segmentation mask for the right giraffe in this image?"
# Run segmentation with the model loaded above
pred_mask = mlcd_seg.seg(seg_img, seg_prompt, tokenizer, force_seg=True)
```
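Scoring predictions against ground-truth masks might look like the sketch below. This README does not state which metric the results table reports or how the repository's evaluation script computes it. Referring-segmentation results are often reported as cumulative IoU (total intersection divided by total union over the whole dataset), so the sketch assumes that metric and binary masks. The function names are hypothetical and not taken from this repository.

```python
import numpy as np


def intersection_and_union(pred, target):
    """Return pixel counts of intersection and union for two binary masks."""
    pred = np.asarray(pred).astype(bool)
    target = np.asarray(target).astype(bool)
    inter = int(np.logical_and(pred, target).sum())
    union = int(np.logical_or(pred, target).sum())
    return inter, union


def cumulative_iou(pairs):
    """Cumulative IoU: summed intersections over summed unions."""
    total_inter = 0
    total_union = 0
    for pred, target in pairs:
        inter, union = intersection_and_union(pred, target)
        total_inter += inter
        total_union += union
    # Guard against an empty dataset or all-empty masks.
    return total_inter / total_union if total_union else 0.0


# Two toy (prediction, ground truth) pairs standing in for a dataset.
pairs = [
    (np.array([[1, 1], [0, 0]]), np.array([[1, 0], [0, 0]])),
    (np.array([[1, 1], [1, 1]]), np.array([[1, 1], [1, 1]])),
]
score = cumulative_iou(pairs)
```

Because cumulative IoU pools pixels across samples, large objects weigh more than small ones. A per-sample average IoU is a common alternative and can give noticeably different numbers on the same predictions.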
68+
69+
## Installation
70+
3471
```bash
35-
bash ./eval/scripts/eval_refcoco.sh
72+
# Create and activate a conda environment, then install dependencies
73+
conda create -n mlcd_seg python=3.10
74+
conda activate mlcd_seg
75+
76+
pip install -r requirements.txt
3677
```
37-
---
78+
79+
80+
## Docker
81+
```bash
# PyTorch Docker

# Build the Docker image
docker build -t mlcd_seg .

# Run the Docker container with GPU support
docker run -it --rm --gpus all mlcd_seg bash
```
91+
3892

3993
## Citations
4094
```
4195
@misc{mlcdseg_wukun,
42-
author = {Wu, Kun and Xie, Yin and Zhou, Xinyu and An, Xiang, and Deng, Jiankang},
43-
title = {MLCD-seg-7B},
44-
year = {2024},
45-
url = {https://github.com/deepglint/unicom/tree/main/downstream},
96+
author = {Wu, Kun and Xie, Yin and Jie, Yu and Zhou, Xinyu and An, Xiang and Feng, Ziyong and Deng, Jiankang},
97+
title = {MLCD-Seg},
98+
year = {2025},
99+
url = {https://github.com/deepglint/MLCD_SEG},
100+
}
101+
@inproceedings{anxiang_2024_mlcd,
102+
title={Multi-label Cluster Discrimination for Visual Representation Learning},
103+
author={An, Xiang and Yang, Kaicheng and Dai, Xiangzi and Feng, Ziyong and Deng, Jiankang},
104+
booktitle={ECCV},
105+
year={2024}
46106
}
47107
```
