
CLIP Webcam Demo

A simple real-time webcam demo using OpenAI's CLIP model for zero-shot classification.

🖼️ Captures live frames from your webcam
🧾 Takes a list of class names you provide (e.g. "cat", "mug", "face")
🤖 Overlays the predicted class probabilities on each frame

CLIP Output (example screenshot)


📦 Requirements

  • Python 3.8+
  • torch
  • transformers
  • Pillow
  • opencv-python

Install dependencies with:

pip install -r requirements.txt
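The repository's `requirements.txt` is not reproduced here; based on the dependency list above, a minimal version might look like:

```text
torch
transformers
Pillow
opencv-python
```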

🚀 Run the Demo

python webcam_clip.py --classes "cat" "mug" "person"

Press q to quit the webcam window.
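The `--classes` flag and the `q`-to-quit behavior could be wired up as in the sketch below. This is an illustrative reconstruction, not the actual contents of `webcam_clip.py`; the function names are hypothetical.

```python
# Hypothetical sketch of the CLI and quit-key handling in webcam_clip.py.
import argparse


def build_parser():
    parser = argparse.ArgumentParser(description="CLIP webcam zero-shot demo")
    parser.add_argument(
        "--classes", nargs="+", required=True,
        help='class names to score, e.g. --classes "cat" "mug" "person"',
    )
    return parser


def run(classes):
    # OpenCV is imported lazily so the parser can be used without it installed.
    import cv2

    cap = cv2.VideoCapture(0)  # open the default webcam
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        # ... classify `frame` against `classes` and draw probabilities ...
        cv2.imshow("CLIP Webcam Demo", frame)
        if cv2.waitKey(1) & 0xFF == ord("q"):  # press q to quit
            break
    cap.release()
    cv2.destroyAllWindows()


if __name__ == "__main__":
    run(build_parser().parse_args().classes)
```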


🧠 How It Works

  1. Loads a pretrained CLIP model (ViT-B/32).
  2. Encodes input class names to CLIP text features.
  3. Captures webcam frames, encodes them to image features.
  4. Computes similarity between image and text features.
  5. Displays the frame with class probabilities on-screen.
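The five steps above can be sketched with the Hugging Face `transformers` CLIP API. This is a minimal illustration, not the repository's actual code; the prompt template and function names are assumptions.

```python
# Sketch of zero-shot classification of one webcam frame with CLIP.
import math


def softmax(xs):
    # Numerically stable softmax over a list of similarity logits.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def classify_frame(frame_rgb, class_names):
    # Heavy imports kept inside the function so the module imports cheaply.
    import torch
    from PIL import Image
    from transformers import CLIPModel, CLIPProcessor

    # Step 1: load the pretrained ViT-B/32 CLIP model.
    model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
    processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

    # Step 2: turn class names into text prompts for the text encoder.
    prompts = [f"a photo of a {name}" for name in class_names]

    # Step 3: wrap the RGB frame as a PIL image for the image encoder.
    image = Image.fromarray(frame_rgb)
    inputs = processor(text=prompts, images=image,
                       return_tensors="pt", padding=True)

    # Step 4: the model computes image-text similarity logits.
    with torch.no_grad():
        outputs = model(**inputs)
    logits = outputs.logits_per_image[0].tolist()

    # Step 5: convert similarities to class probabilities for display.
    return softmax(logits)
```

In practice the model and processor would be loaded once, outside the per-frame loop, since `from_pretrained` is far too slow to call on every frame.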

🔍 Example

python webcam_clip.py --classes "banana" "keyboard" "mouse"

You’ll see the probabilities for each class live on-screen.


Created for teaching purposes by Mario Koddenbrock
