Adding OWLViT/OWLV2 as options for the visual grounding part

## 🚀 Feature

Currently, the project uses `GroundingDINO` as its visual grounding model, which is the best-performing model on some zero-shot object detection benchmarks:
![current benchmarks for zero-shot object detection](https://github.com/luca-medeiros/lang-segment-anything/assets/51631078/2caca5d0-4ef6-4789-b812-6d8f3d212313)
We could give users the flexibility to choose between different visual grounding models, such as:
- [OFA](https://github.com/OFA-Sys/OFA)
- [OWLViT](https://huggingface.co/docs/transformers/en/model_doc/owlvit)
- [OWLV2](https://huggingface.co/docs/transformers/en/model_doc/owlv2)

## Motivation & Examples

Tell us why the feature is useful.
Since this project is about text-guided segmentation, the ability to choose the model used for the visual grounding step seems like a natural addition.

Describe what the feature would look like, if it is implemented.
Best demonstrated using **code examples** in addition to words.

```python
from PIL import Image
from lang_sam import LangSAM

# Proposed API: select the visual grounding model at construction time.
# The default would be 'groundingdino'; other options would be
# 'ofa', 'owlvit', and 'owlv2'.
model = LangSAM(model="groundingdino")

# Load the image and describe the object to segment.
image_pil = Image.open("./assets/car.jpeg").convert("RGB")
text_prompt = "wheel"

# The prediction call stays the same regardless of the selected model.
masks, boxes, phrases, logits = model.predict(image_pil, text_prompt)
```
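One way the selection could work internally is a small name-to-backend registry, so that each grounding model sits behind the same `detect` interface. The following is only a sketch under assumptions: `Detection`, `GroundingBackend`, `register_backend`, `build_backend`, and the stub classes are hypothetical names made up for illustration, not part of `lang_sam`, and the stubs return dummy boxes instead of running a real model.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Protocol, Tuple

# Hypothetical box format: (x_min, y_min, x_max, y_max).
Box = Tuple[float, float, float, float]


@dataclass
class Detection:
    """One grounded region: where it is, what phrase matched, how confident."""
    box: Box
    phrase: str
    score: float


class GroundingBackend(Protocol):
    """Interface every visual grounding backend would implement (assumed)."""

    def detect(self, image, text_prompt: str) -> List[Detection]:
        ...


# Maps a model name to a zero-argument factory that builds the backend.
_BACKENDS: Dict[str, Callable[[], GroundingBackend]] = {}


def register_backend(name: str):
    """Class decorator that records a backend under the given name."""
    def decorator(factory):
        _BACKENDS[name] = factory
        return factory
    return decorator


def build_backend(name: str = "groundingdino") -> GroundingBackend:
    """Instantiate the backend registered under `name`, or fail clearly."""
    try:
        factory = _BACKENDS[name]
    except KeyError:
        choices = ", ".join(sorted(_BACKENDS))
        raise ValueError(
            f"Unknown grounding model {name!r}; choose from: {choices}"
        ) from None
    return factory()


@register_backend("groundingdino")
class StubGroundingDINO:
    """Placeholder: a real backend would wrap the existing GroundingDINO code."""

    def detect(self, image, text_prompt: str) -> List[Detection]:
        return [Detection((0.0, 0.0, 1.0, 1.0), text_prompt, 0.0)]


@register_backend("owlv2")
class StubOWLv2:
    """Placeholder: a real backend would wrap an OWLv2 model."""

    def detect(self, image, text_prompt: str) -> List[Detection]:
        return [Detection((0.0, 0.0, 1.0, 1.0), text_prompt, 0.0)]
```

With something like this, `LangSAM.__init__` could call `build_backend(model)` and pass the resulting boxes on to SAM unchanged, so adding another grounding model would only require registering one more class.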

## Note

We only consider adding new features if they are relevant to this library.
Consider if this new feature deserves to be here or should be a new library.

