Convolutional Neural Networks — Image Classification Project

📌 Project Overview

This project introduces Convolutional Neural Networks (CNNs) and their application to image classification problems.
You will learn key CNN concepts, implement classic architectures, apply modern techniques, and evaluate model performance.


🎯 Goals

  • Understand convolution and pooling operations
  • Implement LeNet and other CNN architectures
  • Explore modern architectures like ResNet and DenseNet
  • Apply data augmentations: flipping, cropping, CutOut
  • Apply MixUp and CutMix augmentations
  • Use test-time augmentation (TTA)
  • Reach the target validation ROC AUC on a real-world image classification task (≥ 0.75 with LeNet, ≥ 0.9 with a pretrained backbone)

📂 Dataset

  • Zindi gesture dataset
  • Tasks:
    • Examine and clean data (remove duplicates, check labels)
    • Split into training and validation sets (33% held out for validation)
    • Custom PyTorch Dataset & DataLoader
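
The cleaning and splitting steps above can be sketched without any third-party dependency. This is only an illustration under stated assumptions: the (image_bytes, label) sample layout, the helper names drop_duplicates and stratified_split, and the toy data are all hypothetical and not taken from the dataset. In practice, scikit-learn's train_test_split with test_size=0.33 and stratify set to the labels would do the same job as the split helper.

```python
import hashlib
import random
from collections import defaultdict


def drop_duplicates(samples):
    """Keep the first sample for each distinct image content.

    `samples` is a list of (image_bytes, label) pairs; this layout is an
    assumption for illustration, not the dataset's real format.
    """
    seen = set()
    unique = []
    for image_bytes, label in samples:
        digest = hashlib.sha256(image_bytes).hexdigest()
        if digest not in seen:
            seen.add(digest)
            unique.append((image_bytes, label))
    return unique


def stratified_split(samples, val_fraction=0.33, seed=0):
    """Split per label so both parts keep roughly the same class balance."""
    by_label = defaultdict(list)
    for sample in samples:
        by_label[sample[1]].append(sample)

    rng = random.Random(seed)
    train, val = [], []
    for label in sorted(by_label):
        group = by_label[label]
        rng.shuffle(group)
        n_val = round(len(group) * val_fraction)
        val.extend(group[:n_val])
        train.extend(group[n_val:])
    return train, val


# Toy data: 100 distinct "images" plus one exact duplicate.
samples = [(f"image-{i}".encode(), i % 2) for i in range(100)]
samples.append(samples[0])

unique = drop_duplicates(samples)
train, val = stratified_split(unique)
print(len(unique), len(train), len(val))
```

Hashing file contents only catches byte-identical duplicates. Near-duplicates (resized or re-encoded copies) would need a perceptual comparison, which this sketch does not attempt.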

🧠 CNN Concepts Covered

  • Convolution operation: feature extraction with parameter sharing
  • Pooling: max/average pooling for downsampling and a degree of translation invariance
  • LeNet: one of the earliest CNNs, originally applied to handwritten digit recognition
  • Modern CNNs: ResNet (skip connections) and DenseNet (dense connections)
  • Augmentations: flip, crop, blur, contrast, CutOut
  • MixUp & CutMix: sample and label mixing for improved generalization
  • Test-time augmentation: averaging predictions over several transformed copies of each input
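
To make "sample and label mixing" concrete, here is a minimal MixUp sketch in NumPy. It is an illustration, not the project's reference implementation: the function name mixup, the value alpha=0.4, and the fake batch are assumptions. Each sample is blended with a random partner from the same batch using one weight drawn from a Beta(alpha, alpha) distribution, and the labels are blended with the same weight.

```python
import numpy as np


def mixup(images, labels, alpha=0.4, rng=None):
    """Blend each sample with a randomly chosen partner from the same batch.

    images: float array of shape (batch, ...); labels: float array of shape
    (batch,) or (batch, classes). alpha=0.4 is an illustrative value.
    """
    rng = np.random.default_rng() if rng is None else rng
    lam = rng.beta(alpha, alpha)           # mixing weight in [0, 1]
    perm = rng.permutation(len(images))    # partner index for every sample
    mixed_images = lam * images + (1.0 - lam) * images[perm]
    mixed_labels = lam * labels + (1.0 - lam) * labels[perm]
    return mixed_images, mixed_labels, lam


rng = np.random.default_rng(0)
images = rng.random((4, 3, 8, 8)).astype(np.float32)   # fake NCHW batch
labels = np.array([0.0, 1.0, 1.0, 0.0], dtype=np.float32)

mixed_images, mixed_labels, lam = mixup(images, labels, rng=rng)
print(mixed_images.shape, mixed_labels, round(float(lam), 3))
```

CutMix follows the same pattern, but instead of blending whole images it pastes a rectangular patch from the partner image and sets the label weight from the patch's share of the image area.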

⚙️ Tasks

  1. Data Preparation
    • Inspect dataset
    • Handle duplicates and mislabeled examples
    • Design training/validation split
  2. PyTorch Dataset & DataLoader
    • Custom Dataset class built on OpenCV (cv2) and pandas
    • Implement __len__ and __getitem__ so a DataLoader can retrieve samples
  3. Model Implementation
    • Rebuild LeNet from Conv2d, pooling, Linear, and Dropout layers
    • Implement training and validation loop
    • Loss: Binary Cross Entropy
    • Metric: ROC AUC (target ≥ 0.75 for LeNet)
  4. Modern Architectures
    • Use pretrained backbones (e.g., resnet18)
    • Replace the classification head with a linear layer
    • Fine-tune for 2–4 epochs (target ROC AUC ≥ 0.9)
  5. Augmentations
    • Test augmentations from albumentations
    • Apply MixUp & CutMix
    • Evaluate impact on validation ROC AUC
  6. Bonus
    • Implement test-time augmentation (TTA)
    • Average predictions from multiple transforms
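
The bonus task can be sketched as follows. This is a hedged NumPy illustration rather than a prescribed solution: predict_with_tta, identity, horizontal_flip, and toy_predict are hypothetical names, and toy_predict stands in for a trained model's forward pass. The idea is to run the same batch through the predictor once per transform and average the outputs.

```python
import numpy as np


def predict_with_tta(predict_fn, images, transforms):
    """Average predict_fn's outputs over several views of the same batch."""
    predictions = [predict_fn(transform(images)) for transform in transforms]
    return np.mean(predictions, axis=0)


def identity(batch):
    return batch


def horizontal_flip(batch):
    # Reverse the width axis of an NCHW batch.
    return batch[..., ::-1]


def toy_predict(batch):
    """Hypothetical stand-in for a trained model: scores each image by the
    mean brightness of its left half, so flipping changes the score."""
    half = batch.shape[-1] // 2
    return batch[..., :half].mean(axis=(1, 2, 3))


rng = np.random.default_rng(0)
images = rng.random((5, 3, 16, 16))

plain = toy_predict(images)
averaged = predict_with_tta(toy_predict, images, [identity, horizontal_flip])
print(plain.shape, averaged.shape)
```

Only label-preserving transforms belong in the list. Whether a horizontal flip preserves the label depends on the gesture classes, so it is worth checking on the validation set before relying on it.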

🛠 Tech Stack

  • Python 3
  • PyTorch
  • OpenCV (cv2)
  • pandas
  • Albumentations
  • NumPy
  • scikit-learn
  • Matplotlib / Seaborn