This IFCB Annotator is a standalone browser-based annotation and reporting tool. It runs on the Django web framework for Python. It uses a back-end relational database to store data imported from IFCB data files (.hdr, .adc, .roi).
By default the project uses an SQLite database file located in the project root folder, which may lead to data loss if it is not backed up regularly. Django is designed to work with most database engines, so using a separate database server that is regularly backed up and maintained is highly recommended.
IFCB Annotator is in active development with additional features planned for the future.
- Python (3.11.9 was used in development; later versions should also work)
- A copy of this repository
- (recommended) Visual Studio Code (or your IDE of choice) for launching the web service
- (recommended) A database management tool such as DBeaver or DB Browser for SQLite
In a command prompt navigate to the folder where you unpacked the project.
Set up a virtual environment using Python venv:
python -m venv .venv
This creates a virtual environment in a subfolder called .venv inside the project (you can create the virtual environment anywhere you wish; see the Python venv docs for more info).
Activate the virtual environment (Windows):
.venv\Scripts\activate.bat
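Before installing packages it can be worth confirming that the virtual environment is really active. The snippet below is a minimal sketch of one way to check this from Python; the function name `in_virtualenv` is made up for illustration and is not part of this project.

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the running interpreter belongs to a venv."""
    # Inside a venv, sys.prefix points at the environment folder while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    print("virtual environment active:", in_virtualenv())
    print("interpreter in use:", sys.executable)
```

If the first line prints False, the activation step probably did not take effect in the current command prompt.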
Install all the required packages
pip install -r requirements.txt
By default the project is set up to use a local SQLite database stored in the project folder:
python manage.py makemigrations
python manage.py migrate
Then run the same commands for the portal app:
python manage.py makemigrations portal
python manage.py migrate portal
(For some reason I had to run the migrations twice: once to set up the initial database and then again to create all the tables specified in the models.py file.)
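If you follow the recommendation to use a separate database server, the Django DATABASES setting is where the switch would happen. The sketch below shows one possible shape for it, assuming PostgreSQL as the server. The environment variable names (IFCB_DB_ENGINE, IFCB_DB_NAME and so on) and the db.sqlite3 file name are hypothetical, so check this project's settings.py for what it actually reads.

```python
import os


def build_databases(env=os.environ):
    """Build a Django DATABASES dict from environment-style settings.

    All IFCB_DB_* names are illustrative assumptions, not settings
    defined by this project.
    """
    if env.get("IFCB_DB_ENGINE", "sqlite") == "sqlite":
        # Local file database (db.sqlite3 is Django's usual default name).
        return {
            "default": {
                "ENGINE": "django.db.backends.sqlite3",
                "NAME": env.get("IFCB_DB_NAME", "db.sqlite3"),
            }
        }
    # Separate PostgreSQL server; needs a PostgreSQL driver installed.
    return {
        "default": {
            "ENGINE": "django.db.backends.postgresql",
            "NAME": env["IFCB_DB_NAME"],
            "USER": env["IFCB_DB_USER"],
            "PASSWORD": env["IFCB_DB_PASSWORD"],
            "HOST": env.get("IFCB_DB_HOST", "localhost"),
            "PORT": env.get("IFCB_DB_PORT", "5432"),
        }
    }
```

In a settings module this would be used as `DATABASES = build_databases()`, after which the migrate commands above need to be run again against the new server.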
Next you'll need to create a superuser to access the annotator and the admin pages:
python manage.py createsuperuser
Enter your chosen username, email and password when asked.
Next edit the .env file and change the DB_USERNAME and DB_PASS to the username and password you just entered.
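A quick way to confirm the edit took effect is to parse the file and check that both keys have values. This is a minimal sketch that assumes the .env file uses plain KEY=VALUE lines; the helper names `parse_env_text` and `missing_keys` are made up for illustration.

```python
REQUIRED_KEYS = ("DB_USERNAME", "DB_PASS")


def parse_env_text(text):
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(values, required=REQUIRED_KEYS):
    """Return the required keys that are absent or empty."""
    return [key for key in required if not values.get(key)]
```

For example, `missing_keys(parse_env_text(open(".env").read()))` should return an empty list once both values are filled in.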
Launch the local web server by running the following command:
python manage.py runserver
Check the website is running by navigating to http://localhost:8000 in your browser
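The same check can be scripted, which is handy when the server is started from an IDE or on a remote machine. The sketch below uses only the Python standard library; the function name `probe` is hypothetical, and the URL is the development server address given above.

```python
import urllib.error
import urllib.request


def probe(url="http://localhost:8000", timeout=3.0):
    """Return the HTTP status code, or None if nothing is listening."""
    # An empty ProxyHandler bypasses any system proxy for this local check.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        with opener.open(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # The server answered, but with an error status such as 404 or 500.
        return error.code
    except OSError:
        # Connection refused, timeout, DNS failure and similar problems.
        return None
```

A return value of None suggests the runserver command is not running or is listening on a different port.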
Once the website is up and running, navigate to the admin pages using the menu. Log in using the username and password you specified in Step 3.
Add a new entry in the dataset table. The following fields are required:
- Dataset Name
- Dataset Folder (where your IFCB files are stored)
- Active - set to True
- Dataset Icon - (set this to IFCB.png - this is stored in portal/static/images/icons/)
IMPORTANT: include the trailing forward slash / when specifying the file path for the dataset folder (e.g. ifcb_data/ifcb147/).
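If you generate dataset entries from a script, a small helper can enforce the trailing slash. This is a minimal sketch; the function name is hypothetical, and converting Windows backslashes to forward slashes is an assumption about what the project expects.

```python
def normalise_dataset_folder(path: str) -> str:
    """Return the folder path with forward slashes and a trailing slash."""
    cleaned = path.strip().replace("\\", "/")  # assumption: "/" is expected
    if not cleaned:
        raise ValueError("dataset folder must not be empty")
    return cleaned if cleaned.endswith("/") else cleaned + "/"


print(normalise_dataset_folder("ifcb_data/ifcb147"))
```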
Don't have your own classifier yet? Check out the following GitHub project for a Jupyter Notebook on how to create your first classifier using PyTorch: https://github.com/EuropeanIFCBGroup/IFCBClassify_DEMO
- If you have a classifier: add an entry in the classifier table, then populate classifier_classes with the class indexes from the dataset the classifier was trained on (note: with PyTorch the index order isn't necessarily alphabetical).
- If you don't have a classifier: add a dummy classifier and add an initial row for it in classifier_classes.
You should also add a new entry in the portal_classifier_classes table for 'unclassified'. This is used to catch any images that have not yet been classified or fall below a confidence threshold. If you have already populated the classifier_classes with indexes add a record at the end. If not just create a new record.
- Classifier Index - 0 if you have no classifier classes, or one after the last class index if you already have classifier classes from your own classifier
- Model_ID - the ID of the classifier you created above
- Master Class - unclassified
- Class name - unclassified
- Class Type - Functional
Finally set the UNCLASSIFIED_INDEX in the .env file to the Classifier Index of the record you just created.
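The rule for choosing the index (0 with no classes, otherwise one past the last class index) is simple enough to express in a few lines. The helper name below is made up for illustration.

```python
def next_unclassified_index(existing_indexes):
    """Return the Classifier Index to use for the 'unclassified' record."""
    indexes = list(existing_indexes)
    # No classifier classes yet: start at 0. Otherwise go one past the
    # highest index, whatever order the indexes are listed in.
    return max(indexes) + 1 if indexes else 0


# A classifier trained on three classes (indexes 0, 1, 2):
print(f"UNCLASSIFIED_INDEX={next_unclassified_index([0, 1, 2])}")
```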
An SQL file called home_master_classes.sql is provided to prepopulate the Master Class List with many genera and species (including their AphiaID).
You will need to run this script directly against the database from an SQL prompt. A separate database management program such as DBeaver or DB Browser for SQLite is recommended.
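If you are using the default SQLite database and would rather not install a separate tool, Python's built-in sqlite3 module can also run an SQL file. This is a sketch under assumptions: the helper name is hypothetical, the database file name db.sqlite3 is Django's usual default rather than something confirmed for this project, and the script is assumed to contain SQLite-compatible statements.

```python
import sqlite3
from pathlib import Path


def run_sql_script(db_path, script_path):
    """Execute every statement in an SQL file against an SQLite database."""
    sql = Path(script_path).read_text(encoding="utf-8")
    connection = sqlite3.connect(db_path)
    try:
        # executescript runs multiple semicolon-separated statements.
        connection.executescript(sql)
        connection.commit()
    finally:
        connection.close()
```

Usage would look like `run_sql_script("db.sqlite3", "home_master_classes.sql")`. Back up the database file first, since the script writes directly to it.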
Run the IFCB_Populate.py script. This will populate your database tables from your IFCB files.
At present only a flat folder structure is supported (i.e. all sample files directly within the dataset folder); support for nested folders is planned (e.g. if you use the Tree feature in IFCBAcquire).
Some sample IFCB files are provided for demonstration purposes in ifcb_data/IFCB147/. Feel free to replace these with your own IFCB files.
You can run this script every time you wish to import new IFCB files. It will ignore any files already imported to the database and only add new ones.
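To illustrate the "only add new ones" behaviour, the sketch below scans a flat dataset folder and lists the bins that have not been imported yet. It is not the code from IFCB_Populate.py: the function name is hypothetical, and requiring all three files (.hdr, .adc, .roi) before a bin counts as importable is an assumption.

```python
from pathlib import Path

REQUIRED_SUFFIXES = {".hdr", ".adc", ".roi"}


def find_new_bins(dataset_folder, already_imported):
    """Return sorted names of complete bins that are not yet imported."""
    suffixes_by_bin = {}
    # Flat structure only: look at files directly inside the folder.
    for path in Path(dataset_folder).iterdir():
        suffix = path.suffix.lower()
        if path.is_file() and suffix in REQUIRED_SUFFIXES:
            suffixes_by_bin.setdefault(path.stem, set()).add(suffix)
    complete = {
        name for name, found in suffixes_by_bin.items()
        if found == REQUIRED_SUFFIXES
    }
    return sorted(complete - set(already_imported))
```

Here `already_imported` would be the bin names already stored in the database, so running the function a second time after an import returns an empty list.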
If you have a classifier, run it over the imported images: assign class_index and class_score to each image and set image_processed to True.
The pseudocode below is intended to give a general idea of how to integrate a classifier. Check out the IFCBClassify_DEMO repo for details on training your own classifier.
load the PyTorch model (move to CUDA if available)
create a transform (match the settings used when training the model)
get a list of bins where ready_to_classify = True and classified = False
for each bin:
    select all images in bin where verified = False and image_data = True
    for each image:
        load image from folder (file format: "[dataset_name]/[bin_name]/[bin_name]_t[image_trigger]_x[image_x]_y[image_y]_[image_id].png")
        pass image through model and return max_index
        update image record - set image_processed = True, class_index = max_index, model_id = classifier_id in DB
    update bin record - set classified = True
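The inner part of the pseudocode above can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the project's implementation: `predict` is a stand-in for your PyTorch model plus transform (any callable that takes an image path and returns a list of class scores), the dictionary keys mirror the field names used in the pseudocode, and the database reads and writes are left out. The fallback to the unclassified index below a confidence threshold follows the description of the 'unclassified' class earlier in this document.

```python
def image_path(dataset_name, bin_name, trigger, x, y, image_id):
    """Build the image file path in the format given in the pseudocode."""
    return (f"{dataset_name}/{bin_name}/"
            f"{bin_name}_t{trigger}_x{x}_y{y}_{image_id}.png")


def pick_class(scores, unclassified_index, threshold=0.0):
    """Return (class_index, class_score) for a non-empty list of scores."""
    max_index = max(range(len(scores)), key=scores.__getitem__)
    max_score = scores[max_index]
    if max_score < threshold:
        # Not confident enough: fall back to the 'unclassified' class.
        return unclassified_index, max_score
    return max_index, max_score


def classify_bin(dataset_name, bin_name, images, predict, classifier_id,
                 unclassified_index, threshold=0.0):
    """Return the field updates to write back for each image in one bin."""
    updates = []
    for image in images:
        # Skip images a person has verified or that have no image data.
        if image["verified"] or not image["image_data"]:
            continue
        path = image_path(dataset_name, bin_name, image["image_trigger"],
                          image["image_x"], image["image_y"],
                          image["image_id"])
        class_index, class_score = pick_class(
            predict(path), unclassified_index, threshold)
        updates.append({
            "image_id": image["image_id"],
            "class_index": class_index,
            "class_score": class_score,
            "model_id": classifier_id,
            "image_processed": True,
        })
    return updates
```

With a real model, `predict` would load the image, apply the training-time transform, run the model and return its output scores as a list. After the updates for a bin are saved, the bin record would be marked classified = True as in the pseudocode.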