This repo was used to enter the AIcrowd Snake Species Identification Challenge; an entry based on this repo placed first in the first qualifying round of the competition. The competition aims to stimulate the development of a snake species identification application, so that bite victims and health practitioners can prioritize care for potentially harmful bites. For the qualifying round, the competition provided ~82k training images and ~18k test images covering 45 species.
The competition entry built on the core repo tools; new code was written to:
- Convert the competition data set to COCO format, for compatibility with the training code
- Call into the existing code to train both ResNeXt and Inceptionv4 networks (around 80 epochs)
- Aggregate results from the ResNeXt and Inceptionv4 models (post-hoc averaging)
- Run inference on the test data and prepare a submission for the competition
This approach yielded an F1 of 0.809 for Inceptionv4 and 0.804 for ResNeXt101. The averaged predictions achieved an F1 of 0.846, which placed first in the qualifying round of the competition.
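The post-hoc averaging step can be illustrated with a minimal sketch. It assumes each model produces one row of class probabilities per test image, with images and classes in the same order for both models. The function names and the toy numbers are hypothetical and are not taken from `snakes/merge_snakes_results.py`.

```python
def average_predictions(probs_a, probs_b):
    """Return the element-wise mean of two per-image class-probability tables."""
    if len(probs_a) != len(probs_b):
        raise ValueError("both models must score the same number of images")
    averaged = []
    for row_a, row_b in zip(probs_a, probs_b):
        if len(row_a) != len(row_b):
            raise ValueError("both models must score the same set of classes")
        averaged.append([(a + b) / 2.0 for a, b in zip(row_a, row_b)])
    return averaged


def predicted_classes(probs):
    """Return the index of the highest-scoring class for each image."""
    return [max(range(len(row)), key=row.__getitem__) for row in probs]


# Toy example: 2 images, 3 classes. The values are made up for illustration.
inception_probs = [[0.6, 0.3, 0.1], [0.2, 0.5, 0.3]]
resnext_probs = [[0.2, 0.7, 0.1], [0.1, 0.3, 0.6]]

ensemble_probs = average_predictions(inception_probs, resnext_probs)
ensemble_labels = predicted_classes(ensemble_probs)
print(ensemble_labels)
```

In this toy case the two models disagree on both images, and the averaged scores settle each disagreement in favor of the more confident model.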
- Follow the steps in README.md to create the required docker or conda environment.
- Download the training and test data from here.
- Unzip the training and test zipfiles into a folder called "data" in the PyTorchClassification directory, or symlink a directory called `data` to point to your data directory. When you unzip the training data, images should end up in `data/train` (e.g. `data/train/[class]/[filename].jpg`). Test data should end up in `data/round1/[filename].jpg`.
- Run the following commands:

      # cd into the PyTorchClassification directory
      python snakes/folder_to_coco.py # Creates the COCO annotation format for the dataset
      python run_snakes_training.py # Trains both ResNext101 and Inceptionv4 architectures
      python snakes/test_snakes.py # Generates predictions on the test dataset
      python snakes/merge_snakes_results.py # Merges the results from the two different architectures
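To give a rough idea of what the folder-to-COCO conversion involves, here is a minimal sketch that indexes a `train/[class]/[filename].jpg` layout into a COCO-style dictionary. It is an illustration only, not the contents of `snakes/folder_to_coco.py`. The helper name `build_coco_style_index` and the demo class and file names are hypothetical, and only the keys needed for classification are emitted (the full COCO format carries more fields, such as image width and height).

```python
import json
import tempfile
from pathlib import Path


def build_coco_style_index(train_dir):
    """Index train_dir/[class]/[filename].jpg into a COCO-like dict.

    Hypothetical helper for illustration; one annotation per image.
    """
    train_dir = Path(train_dir)
    class_dirs = sorted(p for p in train_dir.iterdir() if p.is_dir())
    categories = [{"id": i, "name": d.name} for i, d in enumerate(class_dirs)]
    images, annotations = [], []
    for category, class_dir in zip(categories, class_dirs):
        for image_path in sorted(class_dir.glob("*.jpg")):
            image_id = len(images)
            images.append({
                "id": image_id,
                "file_name": image_path.relative_to(train_dir).as_posix(),
            })
            annotations.append({
                "id": image_id,
                "image_id": image_id,
                "category_id": category["id"],
            })
    return {"images": images, "annotations": annotations, "categories": categories}


# Demo on a throwaway directory with made-up class and file names.
with tempfile.TemporaryDirectory() as tmp:
    for class_name, file_name in [("class_a", "0001.jpg"), ("class_b", "0002.jpg")]:
        class_dir = Path(tmp) / "train" / class_name
        class_dir.mkdir(parents=True)
        (class_dir / file_name).touch()
    coco_index = build_coco_style_index(Path(tmp) / "train")

print(json.dumps(coco_index, indent=2))
```

Sorting the class directories before assigning ids keeps the category numbering stable across runs, which matters when predictions from separately trained models are merged later.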