This repository contains all the code I used for the BirdCLEF+ 2025 competition on Kaggle.
My final submission placed 925th, with a 0.824 ROC-AUC score on the private test set.
The BirdCLEF+ 2025 competition focused on detecting species (birds, amphibians, mammals, and insects) from 1-minute soundscape recordings collected in El Silencio Natural Reserve, Colombia. The goal was to help ecologists monitor biodiversity using acoustic monitoring, which enables large-scale and frequent data collection. Participants were asked to train machine learning models that can identify which species are calling in short audio segments, using a small labeled dataset and a larger set of unlabeled recordings. Some of the challenges of this competition were limited compute (the final notebook had to run exclusively on CPU), extreme class imbalance, and very noisy training data drawn from a different distribution (a different recording location) than the test set.
The evaluation metric was a version of macro-averaged ROC-AUC, calculated per species and ignoring classes without any true positive examples.
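As I understand the metric, it can be sketched in plain NumPy as below. This is an illustrative re-implementation, not the official scorer; it uses the rank-based formulation of ROC-AUC and assumes no tied scores within a class:

```python
import numpy as np

def macro_roc_auc_ignore_empty(y_true, y_score):
    """Macro-averaged ROC-AUC, skipping classes with no positive labels.

    y_true:  (n_samples, n_classes) binary labels.
    y_score: (n_samples, n_classes) predicted scores.
    """
    aucs = []
    for c in range(y_true.shape[1]):
        labels = y_true[:, c]
        n_pos = int(labels.sum())
        if n_pos == 0:  # no true positives: this class is ignored
            continue
        scores = y_score[:, c]
        # Rank-based AUC (assumes no ties): rank every score, then compare
        # the rank-sum of positives against its minimum possible value.
        order = scores.argsort()
        ranks = np.empty(len(scores))
        ranks[order] = np.arange(1, len(scores) + 1)
        n_neg = len(labels) - n_pos
        auc = (ranks[labels == 1].sum() - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
        aucs.append(auc)
    return float(np.mean(aucs))
```

In practice `sklearn.metrics.roc_auc_score` with per-class filtering gives the same result and handles ties properly; the sketch just makes the "ignore empty classes" rule explicit.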
My final approach was based on training an EfficientNet model on mel-spectrograms. During inference, I applied several tricks, such as adjusting the power of low-rank columns to improve robustness. Many participants had similar approaches; however, I only settled on this method after thorough experiments with many alternatives.
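The mel-spectrogram front end can be sketched in plain NumPy. In the actual pipeline I would use a library such as librosa or torchaudio; the parameter values below (sample rate, FFT size, hop, mel count) are illustrative assumptions, not the ones from my training config:

```python
import numpy as np

def mel_filterbank(sr, n_fft, n_mels, fmin=20.0, fmax=None):
    """Triangular mel filterbank of shape (n_mels, n_fft // 2 + 1)."""
    fmax = fmax or sr / 2
    hz_to_mel = lambda f: 2595.0 * np.log10(1.0 + f / 700.0)
    mel_to_hz = lambda m: 700.0 * (10.0 ** (m / 2595.0) - 1.0)
    # Evenly spaced points on the mel scale, mapped back to FFT bin indices.
    hz_pts = mel_to_hz(np.linspace(hz_to_mel(fmin), hz_to_mel(fmax), n_mels + 2))
    bins = np.floor((n_fft + 1) * hz_pts / sr).astype(int)
    fb = np.zeros((n_mels, n_fft // 2 + 1))
    for i in range(n_mels):
        l, c, r = bins[i], bins[i + 1], bins[i + 2]
        for k in range(l, c):
            fb[i, k] = (k - l) / max(c - l, 1)
        for k in range(c, r):
            fb[i, k] = (r - k) / max(r - c, 1)
    return fb

def mel_spectrogram(wave, sr=32000, n_fft=1024, hop=512, n_mels=128):
    """Log-mel spectrogram: windowed STFT power -> mel filterbank -> log."""
    n_frames = 1 + (len(wave) - n_fft) // hop
    window = np.hanning(n_fft)
    frames = np.stack([wave[i * hop : i * hop + n_fft] * window
                       for i in range(n_frames)])
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    mel = power @ mel_filterbank(sr, n_fft, n_mels).T
    return np.log(mel + 1e-8).T  # (n_mels, n_frames), image-like for the CNN
```

The resulting 2-D array is treated as a one-channel image and fed to the EfficientNet backbone.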
Before settling on the final solution, I experimented with several other ideas, each of which is available in a separate branch:

- Black Box Shift Estimation (`Black-Box-Shift-Estimation` branch): I tried to estimate the distribution of classes in the test set using the unlabeled soundscapes.
- Domain Adversarial Training (`DANN` branch): I used a domain classifier to help the model generalize better to the distribution of the unlabeled/test data. Based on this paper.
- Fine-Grained Recognition (`Fine-Grained` branch): I trained a CNN to pick up on subtle differences in spectrograms to better distinguish between similar species. Based on this paper.
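The core of the Black Box Shift Estimation idea from the first branch can be sketched as follows: estimate the test-set label distribution by solving a linear system that relates the classifier's validation confusion matrix to its average predictions on unlabeled data. This is an illustrative re-implementation of the technique, not the code from the branch:

```python
import numpy as np

def bbse_estimate(conf_matrix, unlabeled_pred_dist):
    """Estimate the target label distribution q(y) via BBSE.

    conf_matrix[i, j]      = P(classifier predicts i | true class j),
                             estimated on held-out labeled validation data.
    unlabeled_pred_dist[i] = fraction of unlabeled examples predicted as i.
    Under label shift, conf_matrix @ q = unlabeled_pred_dist, so we solve
    for q in the least-squares sense and project back onto the simplex.
    """
    q, *_ = np.linalg.lstsq(conf_matrix, unlabeled_pred_dist, rcond=None)
    q = np.clip(q, 0.0, None)  # a distribution can't have negative entries
    return q / q.sum()
```

The estimated `q` can then be used to reweight the training loss or to rescale predicted class probabilities by the ratio of target to source priors.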
To run my final solution, do the following:

- Install all the dependencies using `requirements.txt`.
- Place the whole dataset from Kaggle in the folder named `data`.
- Run `python3 src/preprocess.py`.
- Then run `python3 src/train.py`.
- The script to run inference can be found in `src/submit.py`.