CLIP (Contrastive Language-Image Pre-Training) is a neural network trained on a variety of (image, text) pairs. It can be instructed in natural language to predict the most relevant text snippet, given an image, without directly optimizing for the task, similarly to the zero-shot capabilities of GPT-2 and 3. We found CLIP matches the performance of the original ResNet50 on ImageNet “zero-shot” without using any of the original 1.28M labeled examples, overcoming several major challenges in computer vision.
First, install PyTorch and torchvision, as well as a few small additional dependencies, and then install the CLIP repo as a Python package. On a CUDA GPU machine (CUDA 12.1), the following will do the trick:
$ pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121
$ pip3 install ftfy regex tqdm
$ pip3 install git+https://github.com/openai/CLIP.git

Replace cu121 in the --index-url above with the CUDA version that matches your machine, or use the CPU-only wheel index (https://download.pytorch.org/whl/cpu) when installing on a machine without a GPU.
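Once installed, a minimal zero-shot prediction with the CLIP package looks like the sketch below; the image path "CLIP.png" and the candidate captions are placeholders, so substitute your own.

```python
import torch
import clip
from PIL import Image

device = "cuda" if torch.cuda.is_available() else "cpu"
model, preprocess = clip.load("ViT-B/32", device=device)

# Placeholder image and candidate captions; substitute your own.
image = preprocess(Image.open("CLIP.png")).unsqueeze(0).to(device)
text = clip.tokenize(["a diagram", "a dog", "a cat"]).to(device)

with torch.no_grad():
    # The model returns similarity logits between the image and each caption.
    logits_per_image, logits_per_text = model(image, text)
    probs = logits_per_image.softmax(dim=-1).cpu().numpy()

print("Label probs:", probs)  # highest probability = best-matching caption
```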
Using the PyBullet Python bindings is highly recommended for improved support for robotics, reinforcement learning and VR. Use pip install pybullet and check out the PyBullet Quickstart Guide.
Installation is simple:
pip3 install pybullet --upgrade --user
python3 -m pybullet_envs.examples.enjoy_TF_AntBulletEnv_v0_2017may
python3 -m pybullet_envs.examples.enjoy_TF_HumanoidFlagrunHarderBulletEnv_v1_2017jul
python3 -m pybullet_envs.deep_mimic.testrl --arg_file run_humanoid3d_backflip_args.txt
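Beyond the bundled demos above, a minimal PyBullet session follows the same pattern: connect to a physics server, load assets from pybullet_data, and step the simulation. A short sketch (plane.urdf and r2d2.urdf ship with pybullet_data):

```python
import pybullet as p
import pybullet_data

# Connect with a GUI; use p.DIRECT for headless runs.
p.connect(p.GUI)
p.setAdditionalSearchPath(pybullet_data.getDataPath())
p.setGravity(0, 0, -9.81)

p.loadURDF("plane.urdf")
robot = p.loadURDF("r2d2.urdf", basePosition=[0, 0, 1])

# The default timestep is 1/240 s, so 240 steps simulate one second.
for _ in range(240):
    p.stepSimulation()

p.disconnect()
```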
If you use PyBullet in your research, please cite it like this:
@MISC{coumans2021,
  author       = {Erwin Coumans and Yunfei Bai},
  title        = {PyBullet, a Python module for physics simulation for games, robotics and machine learning},
  howpublished = {\url{http://pybullet.org}},
  year         = {2016--2021}
}
$ git clone https://github.com/sttxd1/CLIP_for_Visual_Servoing_ECE276C.git -b main
$ cd CLIP_for_Visual_Servoing_ECE276C
$ python3 main.py

When it runs, the script will prompt you for two inputs:

Enter the object type (e.g., 'red cube', 'blue ball'):  example input: red cube
Enter 0 for static, 1 for dynamic:  example input: 0
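To make the pipeline concrete: the controller renders the simulated camera view, scores it against the user's text prompt with CLIP, and uses that score as visual feedback. The sketch below is illustrative only, not the code in main.py; the helper name clip_score and the use of pybullet's p.getCameraImage output are assumptions.

```python
import torch
import clip
from PIL import Image

device = "cuda" if torch.cuda.is_available() else "cpu"
model, preprocess = clip.load("ViT-B/32", device=device)

def clip_score(rgb, prompt):
    """Cosine similarity between a rendered frame and a text prompt.

    `rgb` is an HxWx3 uint8 array, e.g. the RGB channels of a frame
    returned by pybullet's p.getCameraImage.
    (Hypothetical helper for illustration, not code from main.py.)
    """
    image = preprocess(Image.fromarray(rgb)).unsqueeze(0).to(device)
    text = clip.tokenize([prompt]).to(device)
    with torch.no_grad():
        img_feat = model.encode_image(image)
        txt_feat = model.encode_text(text)
    img_feat = img_feat / img_feat.norm(dim=-1, keepdim=True)
    txt_feat = txt_feat / txt_feat.norm(dim=-1, keepdim=True)
    return (img_feat @ txt_feat.T).item()  # higher = better match to the prompt
```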
