EuropeanIFCBGroup/SAMS_IFCBAnnotator

IFCB ANNOTATOR

This IFCB Annotator is a standalone browser-based annotation and reporting tool. It runs on the Django web framework for Python. It uses a back-end relational database to store data imported from IFCB data files (.hdr, .adc, .roi).

It is highly recommended that the project be used with a separate database server that is backed up and maintained regularly. Django is designed to work with most database engines. The project also ships with support for an SQLite database located in the project root folder, but this may lead to data loss if it is not backed up regularly.
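As a rough illustration of pointing Django at a separate database server, the sketch below builds a DATABASES dictionary for PostgreSQL. It is not taken from this project's settings.py: the PG_* environment variable names, the default database name and the helper function are all hypothetical, and a PostgreSQL driver such as psycopg would need to be installed separately.

```python
import os


def build_database_config(env=os.environ):
    """Return a Django DATABASES dict for an external PostgreSQL server.

    The PG_* variable names and defaults are hypothetical placeholders.
    """
    return {
        "default": {
            "ENGINE": "django.db.backends.postgresql",
            "NAME": env.get("PG_NAME", "ifcb_annotator"),
            "USER": env.get("PG_USER", ""),
            "PASSWORD": env.get("PG_PASSWORD", ""),
            "HOST": env.get("PG_HOST", "localhost"),
            "PORT": env.get("PG_PORT", "5432"),
        }
    }


# In a settings.py this could be used as:
# DATABASES = build_database_config()
```

Keeping credentials in environment variables rather than in the settings file avoids committing them to version control.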

IFCB Annotator is in active development with additional features planned for the future.

Installation

STEP 1 - PREREQUISITES

  • Python (3.11.9 was used in development; later versions should also work)
  • This repository
  • (recommended) Visual Studio Code (or your IDE of choice) for launching the web service
  • (recommended) A database management tool such as DBeaver or DB Browser for SQLite

STEP 2 - Set Up the Virtual Environment

In a command prompt, navigate to the folder where you unpacked the project.

Set up a virtual environment using Python venv:

python -m venv .venv

This will create a virtual environment in a subfolder called .venv in this project (you can create your virtual environment anywhere you wish; see the Python venv docs for more info).

Activate the virtual environment (Windows):

.venv\Scripts\activate.bat

Install all the required packages:

pip install -r requirements.txt

STEP 3 - Set Up the DB

By default the project is set up to use a local database stored in the project folder:

python manage.py makemigrations

python manage.py migrate

then

python manage.py makemigrations portal

python manage.py migrate portal

(For some reason I had to do this twice: once to set up the initial DB and then again to create all the tables specified in the models.py file.)

Next you'll need to create a super user to access the annotator and the admin pages:

python manage.py createsuperuser

Enter your chosen username, email and password when asked.

Next edit the .env file and change the DB_USERNAME and DB_PASS to the username and password you just entered.
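To double-check the edit, a small script can confirm that both values are filled in. The sketch below assumes the .env file uses plain KEY=VALUE lines (the actual format used by this project may differ); `parse_env` and `missing_keys` are hypothetical helper names, not part of the project.

```python
def parse_env(text):
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(values, required=("DB_USERNAME", "DB_PASS")):
    """Return the required keys that are absent or empty."""
    return [key for key in required if not values.get(key)]


# Example with inline text; for a real file, read it first, e.g.
# text = open(".env", encoding="utf-8").read()
example = "DB_USERNAME=admin\nDB_PASS=\n"
print(missing_keys(parse_env(example)))  # ['DB_PASS']
```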

STEP 4 - Run the Annotator

Launch the local web server by running the following command:

python manage.py runserver

Check the website is running by navigating to http://localhost:8000 in your browser

STEP 5 - Create a dataset

Once the website is up and running, navigate to the admin pages using the menu. Log in using the username and password you specified in Step 3.

Add a new entry in the dataset table. The following fields are required:

  • Dataset Name
  • Dataset Folder (where your IFCB files are stored)
  • Active - set to True
  • Dataset Icon - (set this to IFCB.png - this is stored in portal/static/images/icons/)

IMPORTANT: add the trailing forward slash / when specifying the file path for the dataset folder (e.g. ifcb_data/ifcb147/).
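A likely reason the trailing slash matters is that file paths are built by joining the folder and the file name as plain strings; that is an assumption about the import code, not something stated here. The sketch below shows the failure mode and a hypothetical helper, `with_trailing_slash`, that normalises a folder value before it is entered (sample.adc is a made-up file name).

```python
def with_trailing_slash(folder):
    """Return the folder path with exactly one trailing forward slash."""
    return folder.rstrip("/") + "/"


# Without the slash, plain concatenation glues folder and file name together.
print("ifcb_data/ifcb147" + "sample.adc")
# With the slash, the file lands inside the folder as intended.
print(with_trailing_slash("ifcb_data/ifcb147") + "sample.adc")
```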

STEP 6 - Create a Classifier

Don't have your own classifier yet? Check out the following GitHub project for a Jupyter Notebook on how to create your first classifier using PyTorch: https://github.com/EuropeanIFCBGroup/IFCBClassify_DEMO

  • If you have a classifier: add an entry in the classifier table, then populate classifier_classes with the class indexes from the dataset the classifier was trained on (note: if using PyTorch, the class order isn't necessarily alphabetical).
  • If you don't have a classifier: add a dummy classifier and add an initial row for the classifier in classifier_classes.

You should also add a new entry in the portal_classifier_classes table for 'unclassified'. This is used to catch any images that have not yet been classified or fall below a confidence threshold. If you have already populated the classifier_classes with indexes add a record at the end. If not just create a new record.

  • Classifier Index - 0 if you have no classifier classes, or one after the last class index if you already have classifier classes from your own classifier
  • Model_ID - the ID of the classifier you created above
  • Master Class - unclassified
  • Class name - unclassified
  • Class Type - Functional

Finally set the UNCLASSIFIED_INDEX in the .env file to the Classifier Index of the record you just created.
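The rule for choosing the Classifier Index of the 'unclassified' record can be written as a one-line function. This is only an illustration of the rule described above; `unclassified_index` is a hypothetical helper and the list of existing indexes would come from your own classifier_classes records.

```python
def unclassified_index(existing_indexes):
    """Return 0 when there are no classifier classes, else one past the highest index."""
    indexes = list(existing_indexes)
    return max(indexes) + 1 if indexes else 0


print(unclassified_index([]))         # 0
print(unclassified_index([0, 1, 2]))  # 3
```

The value returned is the one to enter as UNCLASSIFIED_INDEX in the .env file.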

STEP 7 (Optional but recommended) - Populate the Master Class List

An SQL file, home_master_classes.sql, is provided to prepopulate the Master Class List with many genera and species (including their AphiaID).

You will need to run this script directly against the database in an SQL prompt. A separate database management program such as DBeaver or DB Browser for SQLite is recommended.

STEP 8 - Import IFCB Data

Run the IFCB_Populate.py script. This will populate your database tables from your IFCB files.

At present only a flat folder structure is supported (i.e. all sample files directly within the dataset folder); support for nested folders (e.g. if you use the Tree feature in IFCBAcquire) is planned for the future.

Some sample IFCB files are provided for demonstration purposes in ifcb_data/IFCB147/. Feel free to replace these with your own IFCB files.

You can run this script every time you wish to import new IFCB files. It will ignore any files already imported to the database and only add new ones.
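The idea of scanning a flat folder and skipping bins that are already imported can be sketched as follows. This is not the code in IFCB_Populate.py; `find_new_bins` is a hypothetical helper, the bin names in the example are made up, and the requirement that all three files (.hdr, .adc, .roi) be present is an assumption.

```python
import os

IFCB_EXTENSIONS = {".hdr", ".adc", ".roi"}


def find_new_bins(filenames, already_imported):
    """Return sorted bin names that have all three IFCB files and are not imported yet."""
    found = {}
    for name in filenames:
        stem, ext = os.path.splitext(name)
        if ext in IFCB_EXTENSIONS:
            found.setdefault(stem, set()).add(ext)
    # Keep only bins where the .hdr, .adc and .roi files are all present.
    complete = {stem for stem, exts in found.items() if exts == IFCB_EXTENSIONS}
    return sorted(complete - set(already_imported))


files = [
    "D20230101T120000_IFCB147.hdr",
    "D20230101T120000_IFCB147.adc",
    "D20230101T120000_IFCB147.roi",
    "D20230102T120000_IFCB147.hdr",  # incomplete bin, skipped
]
print(find_new_bins(files, already_imported=[]))
```

For a real folder, the file names could come from os.listdir on the dataset folder, and the already-imported names from a database query.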

STEP 9 (optional)

If you have a classifier, run it to assign class_index and class_score to each image and set image_processed to True.

The pseudocode below is intended to give a general idea of how to integrate a classifier. Check out the IFCBClassify_DEMO repo for details on training your own classifier.

Pseudocode for classifier integration

load the PyTorch model (move to CUDA if available)
create a transform (match the settings used when training the model)

get a list of bins where ready_to_classify = True and classified = False
for each bin:
    select all images in bin where verified = False and image_data = True
    for each image:
        load image from folder (file format: "[dataset_name]/[bin_name]/[bin_name]_t[image_trigger]_x[image_x]_y[image_y]_[image_id].png" )
        pass image through model and return max_index and max_score

        update image record - set image_processed = True, class_index = max_index, class_score = max_score, model_id = classifier_id in DB
    update bin record - set classified = True
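Two pieces of the pseudocode can be illustrated without PyTorch or a database: building the image path and turning model outputs into a class index and score. The sketch below is a plain-Python approximation under several assumptions: the model returns one raw score (logit) per class, scores are converted to probabilities with softmax, and images below a hypothetical threshold of 0.5 are assigned to the 'unclassified' index. The function names are illustrative only.

```python
import math


def image_path(dataset_name, bin_name, trigger, x, y, image_id):
    """Build the image path in the file format given in the pseudocode."""
    return f"{dataset_name}/{bin_name}/{bin_name}_t{trigger}_x{x}_y{y}_{image_id}.png"


def softmax(logits):
    """Convert raw scores to probabilities (shifted by the max for stability)."""
    peak = max(logits)
    exps = [math.exp(value - peak) for value in logits]
    total = sum(exps)
    return [e / total for e in exps]


def pick_class(logits, unclassified_index, threshold=0.5):
    """Return (class_index, class_score), falling back to unclassified below the threshold."""
    probs = softmax(logits)
    max_index = max(range(len(probs)), key=probs.__getitem__)
    score = probs[max_index]
    if score < threshold:
        return unclassified_index, score
    return max_index, score


print(image_path("ifcb147", "D20230101T120000_IFCB147", 12, 100, 200, 5))
print(pick_class([0.0, 5.0, 0.0], unclassified_index=3))
```

With a PyTorch model, the list of logits would typically come from the model's output tensor for a single image; the threshold should be tuned to your own classifier.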
