This project implements a real-time hand gesture recognition system on FPGA using a custom lightweight CNN.
It covers the full pipeline:
- Model training (TensorFlow)
- Quantization (INT8)
- FPGA acceleration (Vitis HLS)
- Real-time inference (PYNQ + DMA)

Key features:
- Depthwise CNN (MobileNet-style)
- INT8 quantized inference
- Custom FPGA accelerator (no prebuilt IP)
- AXI Stream + DMA pipeline
- Real-time webcam input
- Debug dashboard (Flask)

Data flow:

Camera → Preprocessing → DMA → FPGA CNN → DMA → CPU → Output
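
The DMA round trip in this data flow could look roughly like the sketch below. It follows the usual PYNQ send/receive pattern; the flat INT8 input layout, the one-INT8-score-per-class output, and the bitstream and DMA instance names in the comments are assumptions for illustration, not details taken from this project.

```python
def run_inference(dma, allocate, frame, num_classes=6):
    """Send one preprocessed frame through the accelerator and return raw scores.

    dma      -- a PYNQ AXI DMA object (exposes .sendchannel and .recvchannel)
    allocate -- pynq.allocate, passed in so this function can be imported
                off-board as well
    frame    -- flat sequence of 64*64*3 INT8 values (assumed input layout)
    """
    # Physically contiguous buffers that the DMA engine can read and write.
    in_buf = allocate(shape=(len(frame),), dtype="int8")
    out_buf = allocate(shape=(num_classes,), dtype="int8")
    in_buf[:] = frame

    # Start both channels, then block until each transfer completes.
    dma.sendchannel.transfer(in_buf)
    dma.recvchannel.transfer(out_buf)
    dma.sendchannel.wait()
    dma.recvchannel.wait()

    return [int(v) for v in out_buf]

# On the board (file and instance names below are placeholders):
#   from pynq import Overlay, allocate
#   overlay = Overlay("gesture_cnn.bit")
#   scores = run_inference(overlay.axi_dma_0, allocate, frame)
```

In a live loop it is usually cheaper to allocate the two buffers once and reuse them for every frame rather than allocating per call.
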

Model architecture:
- Input: 64×64 RGB
- 4 Depthwise Conv Blocks
- BatchNorm + ReLU
- Fully Connected Layers
- Output: 6 gesture classes
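
To illustrate why a MobileNet-style design suits a small FPGA, the sketch below compares multiply-accumulate (MAC) counts for a standard convolution and a depthwise separable one. It assumes each block is a 3×3 depthwise convolution followed by a 1×1 pointwise convolution; the feature-map size and channel widths in the example are hypothetical, not this model's actual dimensions.

```python
def standard_conv_macs(h, w, c_in, c_out, k=3):
    """MACs for a standard k x k convolution (stride 1, 'same' padding)."""
    return h * w * c_in * c_out * k * k

def depthwise_separable_macs(h, w, c_in, c_out, k=3):
    """MACs for a k x k depthwise conv followed by a 1 x 1 pointwise conv."""
    depthwise = h * w * c_in * k * k      # one k x k filter per input channel
    pointwise = h * w * c_in * c_out      # 1 x 1 conv mixes the channels
    return depthwise + pointwise

# Hypothetical block: 32x32 feature map, 16 -> 32 channels.
std = standard_conv_macs(32, 32, 16, 32)
sep = depthwise_separable_macs(32, 32, 16, 32)
print(std, sep, round(std / sep, 2))  # 4718592 671744 7.02
```

The saving approaches a factor of k² (9 for 3×3 kernels) as the number of output channels grows.
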

FPGA implementation:
- Vitis HLS (C++)
- BRAM for feature maps
- ROM for weights
- INT8 fixed-point arithmetic
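
As a software reference for the INT8 arithmetic, the sketch below models one output value: INT8 products summed in a wide accumulator, then rescaled back to INT8 with an integer multiplier and a rounding right shift. The multiplier/shift representation of the scale and the optional fused ReLU are common choices assumed here for illustration; the accelerator's actual scheme may differ.

```python
def saturate_int8(x):
    """Clamp an integer to the signed 8-bit range."""
    return max(-128, min(127, x))

def int8_dot_requant(activations, weights, bias, multiplier, shift, relu=False):
    """INT8 dot product with a wide accumulator, rescaled back to INT8.

    The real-valued rescale factor is approximated as multiplier / 2**shift,
    so only integer multiplies and shifts are needed. shift must be >= 1.
    """
    acc = bias
    for a, w in zip(activations, weights):
        acc += a * w                      # products accumulate in wide precision

    # Add half an LSB before shifting so the result is rounded, not truncated.
    scaled = (acc * multiplier + (1 << (shift - 1))) >> shift
    if relu:
        scaled = max(0, scaled)
    return saturate_int8(scaled)

# acc = 100 + 30 - 80 - 150 = -100; scale = 77 / 1024, so -7.52 rounds to -8.
print(int8_dot_requant([10, -20, 30], [3, 4, -5], 100, 77, 10))  # -8
```

A model like this is handy for checking the HLS output bit for bit against a known-good result.
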

Workflow:
- Train model in TensorFlow
- Quantize weights to INT8
- Export weights to C headers
- Compile the weight headers into the HLS accelerator (stored as ROM)
- Run inference using DMA on PYNQ
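
The quantize-and-export steps can be sketched as follows, assuming symmetric per-tensor quantization (one scale per weight array). The array name `conv1_w` and the header layout are hypothetical; the project's real export script may format things differently.

```python
def quantize_symmetric(weights):
    """Map float weights to INT8 with a single scale (symmetric, per tensor)."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale

def to_c_header(name, q_weights, scale):
    """Render quantized weights as a C header that an HLS design can include."""
    guard = name.upper() + "_H"
    length = name.upper() + "_LEN"
    values = ", ".join(str(v) for v in q_weights)
    return (
        f"#ifndef {guard}\n"
        f"#define {guard}\n\n"
        f"#include <stdint.h>\n\n"
        f"// Dequantization scale: {scale!r}\n"
        f"#define {length} {len(q_weights)}\n"
        f"static const int8_t {name}[{length}] = {{ {values} }};\n\n"
        f"#endif // {guard}\n"
    )

q, scale = quantize_symmetric([0.4, -1.0, 0.25, 0.0])
print(to_c_header("conv1_w", q, scale))
```

Declaring the array `static const` gives the HLS tool the option of mapping it to on-chip ROM.
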

Real-time application:
- Live gesture detection
- ROI selection (center / skin / edges)
- Confidence filtering
- Debug visualization dashboard
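
A minimal sketch of two of these steps, the center ROI and confidence filtering, is shown below. It assumes the class scores are turned into probabilities with a softmax on the CPU and that predictions below a threshold are dropped; the 0.6 threshold and the function names are illustrative, not taken from the project.

```python
import math

def center_roi(frame_w, frame_h, size):
    """Return (x0, y0, x1, y1) of a square crop centered in the frame."""
    size = min(size, frame_w, frame_h)
    x0 = (frame_w - size) // 2
    y0 = (frame_h - size) // 2
    return x0, y0, x0 + size, y0 + size

def softmax(scores):
    """Numerically stable softmax over a list of class scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def filter_prediction(scores, threshold=0.6):
    """Return (class_index, confidence), or (None, confidence) if too uncertain."""
    probs = softmax(scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    if probs[best] < threshold:
        return None, probs[best]
    return best, probs[best]

print(center_roi(640, 480, 256))                    # (192, 112, 448, 368)
print(filter_prediction([0, 0, 0, 0, 0, 0])[0])     # None
```

Returning `None` for low-confidence frames lets the caller keep showing the last stable gesture instead of flickering between classes.
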

Tech stack:
- Python, TensorFlow, OpenCV
- Vitis HLS
- PYNQ (Zynq FPGA)
- Flask

Future work:
- Improve classification accuracy
- Speed up inference with pipelining
- Implement softmax fully in hardware
- IoT integration
