This project aims to develop a robust, real-time hand gesture control system. Starting from a rapid prototype that converts MediaPipe models to OpenVINO IR, this repository will evolve to support multiple gestures, smooth tracking, and seamless application control without a mouse or keyboard.
.
├── models/                          # OpenVINO IR models (XML + BIN files)
│   ├── hand_detector.{xml,bin}
│   ├── hand_landmarks_detector.{xml,bin}
│   ├── gesture_embedder.{xml,bin}
│   └── canned_gesture_classifier.{xml,bin}
├── gesturePipeline.ipynb            # Initial implementation notebook
├── requirement.txt                  # Python dependencies
└── README.md                        # Project documentation
| Model | Description |
|---|---|
| hand_detector | Detects presence and bounding boxes of hands |
| hand_landmarks_detector | Locates 21 hand landmarks |
| gesture_embedder | Encodes landmark vectors into a feature space |
| canned_gesture_classifier | Classifies gestures (e.g., open palm, fist) |
All models are in OpenVINO IR format (`.xml` and `.bin`), optimized for inference on Intel hardware.
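As a quick sanity check, each IR can be loaded with the OpenVINO Runtime Python API. A minimal sketch, assuming a recent OpenVINO release (2023.1+, where the top-level `openvino` import is available); the device name `"CPU"` is an arbitrary choice:

```python
import openvino as ov

core = ov.Core()

# Read one IR model; the matching .bin weights file is found automatically.
model = core.read_model("models/hand_detector.xml")
compiled = core.compile_model(model, device_name="CPU")  # or "GPU" / "AUTO"

# Print input/output tensor shapes, useful when wiring the four stages together.
for port in compiled.inputs:
    print("input: ", port.any_name, tuple(port.shape))
for port in compiled.outputs:
    print("output:", port.any_name, tuple(port.shape))
```

The other three models follow the same pattern with their respective paths.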
Install dependencies:
pip install -r requirement.txt
To run the prototype pipeline:
jupyter notebook gesturePipeline.ipynb
- ✅ OpenVINO models successfully converted
- ✅ Basic pipeline structure built in notebook
- ⚠️ Can currently detect up to 5 gestures
- ⚠️ Detection is unstable and lasts only a few seconds
- 🚫 Gesture-to-action mapping not implemented yet
- Reliable gesture classification (fist, thumbs up, etc.)
- Kalman filter for landmark smoothing (sketched after this list)
- Gesture-to-key mapping using PyAutoGUI (also sketched below)
- GUI for custom gesture profiles (Tkinter)
- Application-specific control modes (Media, Slides, Web)
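For the planned landmark smoothing, one possible shape of the Kalman filter step is a per-coordinate constant-velocity filter over the flattened landmark vector. A minimal sketch, not the final implementation; the noise parameters are illustrative and would need tuning on real landmark data:

```python
import numpy as np

class LandmarkSmoother:
    """Per-coordinate Kalman filter with a constant-velocity model."""

    def __init__(self, num_coords=63, process_var=1e-4, meas_var=1e-2):
        # 21 landmarks x 3 coordinates = 63 scalar tracks by default.
        self.x = np.zeros((num_coords, 2))            # state: [position, velocity]
        self.P = np.tile(np.eye(2), (num_coords, 1, 1))
        self.F = np.array([[1.0, 1.0], [0.0, 1.0]])   # unit time step
        self.Q = process_var * np.eye(2)              # process noise
        self.R = meas_var                             # measurement noise

    def update(self, z):
        """z: flat array of raw landmark coordinates, shape (num_coords,)."""
        for i, meas in enumerate(np.asarray(z, dtype=float)):
            # Predict step.
            self.x[i] = self.F @ self.x[i]
            self.P[i] = self.F @ self.P[i] @ self.F.T + self.Q
            # Update step; we observe position only (H = [1, 0]).
            S = self.P[i][0, 0] + self.R
            K = self.P[i][:, 0] / S                   # Kalman gain
            self.x[i] = self.x[i] + K * (meas - self.x[i][0])
            self.P[i] = self.P[i] - np.outer(K, self.P[i][0])
        return self.x[:, 0]                           # smoothed positions
```

Per frame, the raw landmark array would be flattened and passed to `update()`, and the smoothed positions fed to the embedder instead of the raw ones.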
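The gesture-to-key mapping could be as simple as a lookup table dispatched through PyAutoGUI. A sketch under assumptions: the label strings mirror MediaPipe's canned gesture names but should be confirmed against this classifier's actual outputs, and the key bindings are placeholders:

```python
import pyautogui

# Illustrative label-to-key table; labels and bindings are assumptions.
GESTURE_KEYS = {
    "Open_Palm":   "space",       # e.g., play/pause in most media players
    "Thumb_Up":    "volumeup",
    "Thumb_Down":  "volumedown",
    "Closed_Fist": "esc",
}

def dispatch(gesture_label: str) -> None:
    """Send a single key tap for a recognized gesture, if one is mapped."""
    key = GESTURE_KEYS.get(gesture_label)
    if key:
        pyautogui.press(key)
```

In practice the dispatcher would also need debouncing (fire only when the gesture changes, or after a cooldown) so a held gesture does not emit a keystroke on every frame.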