Fine-tune google/vit-base-patch16-224 for dog breed classification using HuggingFace Transformers.
make install
make gpu-check

Option A: CSV + image folder (recommended)
Place labels.csv and images in data/dogs/:
data/dogs/
├── labels.csv    # columns: id,breed (id = filename without extension)
├── train/
│   ├── 000bec180eb18c7604dcecc8fe0dba07.jpg
│   ├── 001513dfcb2ffafc82cccf4dbbaba97.jpg
│   └── ...
└── test/         # optional, for inference
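
To make the Option A layout concrete, here is a minimal sketch of how `labels.csv` could be joined with the files in `train/`. It is not taken from dataset.py: the function name `load_labels` and the `.jpg` default are assumptions for illustration. It also reports ids that have no matching image, which is the kind of problem a dataset check would be expected to catch.

```python
import csv
from pathlib import Path


def load_labels(root, image_ext=".jpg"):
    """Return (samples, missing) for an Option A layout under `root`.

    samples: list of (image_path, breed) for rows whose image exists.
    missing: list of ids listed in labels.csv with no file in train/.
    The name `load_labels` and the default extension are hypothetical.
    """
    root = Path(root)
    samples, missing = [], []
    with open(root / "labels.csv", newline="") as f:
        for row in csv.DictReader(f):  # expects the columns id,breed
            path = root / "train" / f"{row['id']}{image_ext}"
            if path.is_file():
                samples.append((path, row["breed"]))
            else:
                missing.append(row["id"])
    return samples, missing
```

Calling `load_labels("data/dogs")` returns the usable samples together with the ids that need attention before training.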
Option B: Breed subdirectories

data/dogs/
├── labrador/
│   ├── img001.jpg
│   └── ...
├── golden_retriever/
│   └── ...
└── poodle/
    └── ...
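
With the Option B layout, the directory name is the label. The sketch below shows one way to collect `(image_path, breed)` pairs from such a tree. The function name `scan_breed_dirs` and the set of accepted extensions are assumptions, not the behaviour of dataset.py.

```python
from pathlib import Path

# Assumed set of image extensions; adjust to match the actual data.
IMAGE_EXTS = {".jpg", ".jpeg", ".png"}


def scan_breed_dirs(root):
    """Return a sorted list of (image_path, breed) pairs.

    Every immediate subdirectory of `root` is treated as one breed,
    and every image file directly inside it as one sample.
    """
    root = Path(root)
    samples = []
    for breed_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        for img in sorted(breed_dir.iterdir()):
            if img.is_file() and img.suffix.lower() in IMAGE_EXTS:
                samples.append((img, breed_dir.name))
    return samples
```

Note that this simple scan treats every subdirectory as a breed, so it would misread an Option A tree, where `train/` and `test/` are not breed names.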
Validate the dataset:
make data-check

Start training:

make train

Training config is in config.py. The best model is saved to outputs/best/.
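
The google/vit-base-patch16-224 checkpoint ships with a 1000-class ImageNet head, so fine-tuning on dog breeds needs a new classification head sized to the breed list. The sketch below shows one common way to set that up with Transformers. Whether train.py does exactly this is an assumption, and the helper names `build_label_maps` and `load_model` are hypothetical.

```python
def build_label_maps(breeds):
    """Map breed names to contiguous integer ids and back.

    Names are de-duplicated and sorted so the mapping is stable
    across runs regardless of the order rows are read in.
    """
    names = sorted(set(breeds))
    label2id = {name: i for i, name in enumerate(names)}
    id2label = {i: name for name, i in label2id.items()}
    return label2id, id2label


def load_model(label2id, id2label, checkpoint="google/vit-base-patch16-224"):
    """Load the pretrained ViT with a freshly initialised breed head."""
    # Imported lazily so build_label_maps works without transformers installed.
    from transformers import AutoModelForImageClassification

    return AutoModelForImageClassification.from_pretrained(
        checkpoint,
        num_labels=len(label2id),
        id2label=id2label,
        label2id=label2id,
        # The checkpoint's 1000-class head does not match the new label
        # count; this flag lets the mismatched head be re-initialised.
        ignore_mismatched_sizes=True,
    )
```

Storing `id2label` in the model config means a saved checkpoint can report breed names directly at inference time instead of bare class indices.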
To resume an interrupted run:
make train-resume

Evaluate the trained model:

make eval

Run inference on a single image or on a directory of images:

make infer IMG=path/to/dog.jpg
make infer-dir DIR=path/to/images/

| File | Description |
|---|---|
| config.py | Model, training, and inference configuration |
| dataset.py | Image loading, augmentation, and preprocessing |
| train.py | Fine-tuning with HuggingFace Trainer |
| inference.py | Single-image and batch inference |
| Makefile | Development lifecycle commands |