Radioastronomical object detector tool based on the Mask R-CNN instance segmentation framework.
This software is a refactoring of https://github.com/SKA-INAF/caesar-mrcnn.git for TensorFlow v2.x.
This software is distributed under the GPLv3 license. If you use it in your research, please add a reference to this GitHub repository and acknowledge this work in your paper:
- S. Riggi et al., "Astronomical source detection in radio continuum maps with deep neural networks", 2023, Astronomy and Computing, 42, 100682, doi
To build and install the package:

- Download the software in a local directory, e.g. `SRC_DIR`:

  ```
  $ git clone https://github.com/SKA-INAF/caesar-mrcnn-tf2.git
  ```

- Create and activate a virtual environment, e.g. `caesar-mrcnn-tf2`, under a desired path `VENV_DIR`:

  ```
  $ python3 -m venv $VENV_DIR/caesar-mrcnn-tf2
  $ source $VENV_DIR/caesar-mrcnn-tf2/bin/activate
  ```

- Install the dependencies inside the virtual environment:

  ```
  (caesar-mrcnn-tf2)$ pip install -r $SRC_DIR/requirements.txt
  ```

- Build and install the package in the virtual environment:

  ```
  (caesar-mrcnn-tf2)$ python setup.py install
  ```
To use the package scripts:

- Add the virtual environment binary directory to your `PATH` environment variable:

  ```
  export PATH=$PATH:$VENV_DIR/caesar-mrcnn-tf2/bin
  ```
The software can be run using the provided `run.py` script in different modes:

- To train a model:

  ```
  (caesar-mrcnn-tf2)$ python $VENV_DIR/bin/run.py [OPTIONS] train
  ```

- To test a model:

  ```
  (caesar-mrcnn-tf2)$ python $VENV_DIR/bin/run.py [OPTIONS] test
  ```

- To detect objects on new data:

  ```
  (caesar-mrcnn-tf2)$ python $VENV_DIR/bin/run.py [OPTIONS] inference
  ```
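As an illustration, the sketch below assembles a minimal inference command from a few of the options documented in the next section. The image and weights paths are placeholders (not files shipped with this repository), and the command is echoed rather than executed so it can be adapted before running:

```shell
#!/bin/bash
# Hypothetical paths: replace with your own FITS image and trained weights file.
IMAGE="/opt/data/scan.fits"
WEIGHTS="/opt/models/mrcnn_weights.h5"

# Assemble the option string; scoreThr is the detection score threshold (default 0.7).
OPTS="--image=$IMAGE --weights=$WEIGHTS --scoreThr=0.7 --save_plots"

# Echo the full command instead of running it, so the sketch is safe to source.
CMD="python \$VENV_DIR/bin/run.py $OPTS inference"
echo "$CMD"
```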
Supported options are:
INPUT DATA
--datalist=[VALUE]: Path to train/test data filelist containing a list of json files. Default: ''
--datalist_val=[VALUE]: Path to validation data filelist containing a list of json files. Default: ''
--maxnimgs=[VALUE]: Max number of images to consider in the dataset (-1=all). Default: -1
--skip_classes: Skip certain classes when loading the dataset. Default: disabled
--skipped_classes=[VALUE]: List of class names to be skipped in data loading. Default: 'compact'
--require_classes: Require at least one object class from required_classes to be present in an image for that input data to be considered in data loading. Default: disabled
--required_classes=[VALUE]: List of required object classes. Default: 'extended,extended-multisland,flagged,spurious'
MODEL
--weights=[VALUE]: Path to model weights .h5 file. Default: ''
--backbone_weights=[VALUE]: Backbone network initialization weights: {random, imagenet, Path to weights .h5 file}. Default: 'random'
--classdict=[VALUE]: Class id dictionary used when loading dataset. Default: '{"sidelobe":1,"source":2,"galaxy":3}'
--classdict_model=[VALUE]: Class id dictionary used for the model (if empty, it is set equal to classdict). Default: ''
--remap_classids: Remap detected class ids to ground-truth class ids using the classid_remap_dict dictionary. Default: disabled
--classid_remap_dict=[VALUE]: Dictionary used to remap detected classid to gt classid. Default: ''
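Since the class dictionary is a JSON string, its double quotes must be escaped when set from a shell. The sketch below (using the default `--classdict` value shown above) builds the string and sanity-checks it with Python's `json` module before it would be passed to `run.py`:

```shell
#!/bin/bash
# Escape the inner double quotes so the JSON survives shell expansion.
# This is the default class dictionary shown above for --classdict.
CLASS_DICT="{\"sidelobe\":1,\"source\":2,\"galaxy\":3}"

# Sanity-check that the string is valid JSON before passing it to run.py.
echo "$CLASS_DICT" | python3 -c 'import json,sys; json.load(sys.stdin)' && echo "valid JSON"
```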
DATA PRE-PROCESSING
--imgsize=[VALUE]: Size in pixel used to resize input image. Default: 256
--normalize_minmax: Normalize each channel in range [norm_min, norm_max]. Default: no normalization
--norm_min=[VALUE]: Normalization min value. Default: 0.0
--norm_max=[VALUE]: Normalization max value. Default: 1.0
--subtract_bkg: Subtract bkg from ref channel image. Default: no subtraction
--sigma_bkg=[VALUE]: Sigma clip value used in bkg calculation. Default: 3.0
--use_box_mask_in_bkg: Compute the bkg value in the image border regions left outside the box mask. Default: not used
--bkg_box_mask_fract=[VALUE]: Size of mask box dimensions with respect to image size used in bkg calculation. Default: 0.7
--bkg_chid=[VALUE]: Channel used to subtract background (-1=all). Default: -1
--clip_shift_data: Apply sigma clip shifting. Default: not applied
--sigma_clip=[VALUE]: Sigma threshold to be used for clip & shifting pixels. Default: 1.0
--clip_data: Apply sigma clipping. Default: not applied
--sigma_clip_low=[VALUE]: Lower sigma threshold to be used for clipping pixels below (mean - sigma_low x stddev). Default: 10.0
--sigma_clip_up=[VALUE]: Upper sigma threshold to be used for clipping pixels above (mean + sigma_up x stddev). Default: 10.0
--clip_chid=[VALUE]: Channel used to clip data (-1=all). Default: -1
--zscale_stretch: Apply zscale transform to data. Default: not applied
--zscale_contrasts=[VALUES]: zscale contrasts applied to all channels, separated by commas. Default: 0.25,0.25,0.25
--chan3_preproc: Use the 3-channel pre-processor. Default: not used
--sigma_clip_baseline=[VALUE]: Lower sigma threshold to be used for clipping pixels below (mean - sigma_low x stddev) in first channel of 3-channel preprocessing. Default: 0.0
--nchannels=[VALUE]: Number of channels. If you modify channels in preprocessing you must set this option accordingly. Default: 1
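The pre-processing flags above are typically combined into a single option string. The sketch below assembles an illustrative 3-channel recipe (the values are examples, not recommendations): zscale-stretch each channel, sigma-clip channel 0, then min-max normalize to [0,1]; note that `--nchannels` must match the number of channels the pre-processing produces:

```shell
#!/bin/bash
# Illustrative 3-channel preprocessing recipe (example values only).
PREPROC_OPTS="--imgsize=256 --nchannels=3 --zscale_stretch --zscale_contrasts=0.25,0.25,0.25"
PREPROC_OPTS="$PREPROC_OPTS --clip_data --sigma_clip_low=5 --sigma_clip_up=20 --clip_chid=0"
PREPROC_OPTS="$PREPROC_OPTS --normalize_minmax --norm_min=0.0 --norm_max=1.0"
echo "$PREPROC_OPTS"
```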
DATA AUGMENTATION
--use_augmentation: Run data augmentation on input images. Default: disabled
--augmenter=[VALUE]: Augmenter version to be used {"v1","v2","v3"}. "v1" is equal to TF1 caesar-mrcnn. Default: "v1"
MODEL TRAINING
--nepochs=[VALUE]: Number of training epochs. Default: 1
--rpn_anchor_scales=[VALUES]: RPN anchor scales in pixels (5 comma-separated values). Default: '4,8,16,32,64'
--max_gt_instances=[VALUE]: Max GT instances. Default: 300
--backbone=[VALUE]: Backbone network {resnet101,resnet50,custom}. Default: resnet101
--freeze_backbone: Freeze backbone weights. Default: not frozen
--backbone_strides=[VALUES]: Backbone strides in pixels (5 comma-separated values). Default: '4,8,16,32,64'
--rpn_nms_threshold=[VALUE]: RPN Non-Maximum-Suppression threshold. Default: 0.7
--rpn_train_anchors_per_image=[VALUE]: Number of anchors per image to use for RPN training. Default: 512
--train_rois_per_image=[VALUE]: Number of ROIs per image to feed to classifier/mask heads. Default: 512
--rpn_anchor_ratios=[VALUES]: RPN anchor ratios, comma separated. Default: '0.5,1,2'
--rpn_class_loss_weight=[VALUE]: RPN classification loss weight. Default: 1.0
--rpn_bbox_loss_weight=[VALUE]: RPN bounding box loss weight. Default: 1.0
--mrcnn_class_loss_weight=[VALUE]: Classification loss weight. Default: 1.0
--mrcnn_bbox_loss_weight=[VALUE]: Bounding box loss weight. Default: 1.0
--mrcnn_mask_loss_weight=[VALUE]: Mask loss weight. Default: 1.0
--rpn_class_loss: Enable RPN classification loss.
--no_rpn_class_loss: Disable RPN classification loss.
--rpn_bbox_loss: Enable RPN box loss.
--no_rpn_bbox_loss: Disable RPN box loss.
--mrcnn_class_loss: Enable classification loss.
--no_mrcnn_class_loss: Disable classification loss.
--mrcnn_bbox_loss: Enable box loss.
--no_mrcnn_bbox_loss: Disable box loss.
--mrcnn_mask_loss: Enable mask loss.
--no_mrcnn_mask_loss: Disable mask loss.
--no_l2reg_loss: Disable L2 regularization loss.
--weight_classes: Enable weighting of object classes.
--optimizer=[VALUE]: Optimizer {sgd,adam,adamax}. Default: sgd
--learning_rate=[VALUE]: Learning rate. Default: 0.0005
--opt_momentum=[VALUE]: Momentum parameter in SGD. Default: 0.9
--opt_clipnorm=[VALUE]: clipnorm optimizer parameter. Default: 5.0
--opt_clipvalue=[VALUE]: clipvalue optimizer parameter. Default: None
--enable_checkpoints: Enable saving of model checkpoints. Default: disabled
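The enable/disable flag pairs and loss weights above combine into option strings in the same way. The sketch below (illustrative values only, not a recommended configuration) disables the mask-head loss and down-weights the box losses for a hypothetical bounding-box-focused run:

```shell
#!/bin/bash
# Illustrative only: turn off the mask head loss and reduce the box loss weights.
# Flag names are taken from the option list above.
LOSS_OPTS="--no_mrcnn_mask_loss --rpn_bbox_loss_weight=0.1 --mrcnn_bbox_loss_weight=0.1"
TRAIN_OPTS="--nepochs=10 --optimizer=adam --learning_rate=1.e-4 --enable_checkpoints"
echo "$LOSS_OPTS $TRAIN_OPTS"
```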
MODEL INFERENCE
--image=[VALUE]: Input image in FITS format used in inference. Default: ''
--xmin=[VALUE]: Image min x to be read (read all if -1). Default: -1
--xmax=[VALUE]: Image max x to be read (read all if -1). Default: -1
--ymin=[VALUE]: Image min y to be read (read all if -1). Default: -1
--ymax=[VALUE]: Image max y to be read (read all if -1). Default: -1
--scoreThr=[VALUE]: Object detection score threshold to be used during test/inference. Default: 0.7
--iouThr=[VALUE]: IOU threshold used to match detected objects with true objects. Default: 0.6
--consider_sources_near_mixed_sidelobes: Consider sources tagged as mixed with sidelobes for the inference/test.
--no_consider_sources_near_mixed_sidelobes: Do not consider sources tagged as mixed with sidelobes for the inference/test.
RUN
--ngpu=[VALUE]: Number of GPUs used. Default: 1
--nimg_per_gpu=[VALUE]: Number of images per GPU. Default: 1
PARALLEL PROCESSING
--split_img_in_tiles: Enable splitting of input image in multiple subtiles for parallel processing. Default: disabled
--tile_xsize=[VALUE]: Sub image size in pixel along x. Default: 512
--tile_ysize=[VALUE]: Sub image size in pixel along y. Default: 512
--tile_xstep=[VALUE]: Sub image step fraction along x (=1 means no overlap). Default: 1.0
--tile_ystep=[VALUE]: Sub image step fraction along y (=1 means no overlap). Default: 1.0
--max_ntasks_per_worker=[VALUE]: Max number of tasks assigned to a MPI processor worker. Default: 100
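As a back-of-envelope check of the tiling options, the sketch below estimates how many sub-tiles a given image would be split into, under the assumption that the stride along each axis is step fraction × tile size (consistent with "=1 means no overlap" above). The exact tiling logic of `run.py` may differ; this is only illustrative arithmetic:

```shell
#!/bin/bash
# Back-of-envelope tile count for --split_img_in_tiles, assuming
# stride = tile_step * tile_size (so step=1.0 means no overlap).
IMG_X=2048; IMG_Y=2048   # hypothetical image size in pixels
TILE_X=512; TILE_Y=512   # --tile_xsize / --tile_ysize
STEP=1.0                 # --tile_xstep / --tile_ystep

STRIDE_X=$(python3 -c "print(int($TILE_X * $STEP))")
STRIDE_Y=$(python3 -c "print(int($TILE_Y * $STEP))")
# Ceiling division: number of strides needed to cover each axis.
NTILES_X=$(( (IMG_X + STRIDE_X - 1) / STRIDE_X ))
NTILES_Y=$(( (IMG_Y + STRIDE_Y - 1) / STRIDE_Y ))
echo "tiles: ${NTILES_X}x${NTILES_Y} = $(( NTILES_X * NTILES_Y ))"
```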
PLOTTING
--draw: Enable plotting. Default: disabled
--draw_shaded_masks: Enable plotting of shaded masks. Default: disabled
--draw_class_label_in_caption: Enable plotting of class labels inside object caption. Default: disabled
OUTPUT DATA
--save_plots: Enable saving of inference plots. Default: disabled
--detect_outfile: Output plot PNG filename (internally generated if left empty). Default: empty
--detect_outfile_json: Output json filename with detected objects (internally generated if left empty). Default: empty
Below we report a sample run script for training a Mask R-CNN model:
```bash
#!/bin/bash

######################
## SET ENV
######################
VENV_DIR="/opt/software/venvs/caesar-mrcnn-tf2"
SCRIPT_DIR="$VENV_DIR/bin"
source $SCRIPT_DIR/activate

######################
## RUN OPTIONS
######################
# - DATA OPTIONS
TRAIN_DATA="/opt/data/train.dat"
CV_DATA="/opt/data/crossval.dat"
CLASS_DICT="{\"spurious\":1,\"compact\":2,\"extended\":3,\"extended-multisland\":4,\"flagged\":5}"

# - TRAIN OPTIONS
NEPOCHS=100
WEIGHTS="" # train from scratch

# - PREPROCESSING OPTIONS
IMGSIZE=224
PREPROC_OPTS="--imgsize=$IMGSIZE --nchannels=3 --zscale_stretch --zscale_contrasts=0.0,0.25,0.4 --clip_data --sigma_clip_low=5 --sigma_clip_up=30 --clip_chid=0 --normalize_minmax "

# - OPTIMIZER OPTIONS
OPTIMIZER="sgd"
##LEARNING_RATE="1.e-6"
LEARNING_RATE="5.e-4"
MOMENTUM="0.9"
CLIPNORM="5.0"
OPTIMIZER_OPTS="--optimizer=$OPTIMIZER --learning_rate=$LEARNING_RATE --opt_momentum=$MOMENTUM --opt_clipnorm=$CLIPNORM "

# - RUN OPTIONS
NGPU=1
NIMG_PER_GPU=1
NTHREADS=1
RUN_OPTS="--ngpu=$NGPU --nimg_per_gpu=$NIMG_PER_GPU "

# - MODEL ARCHITECTURE OPTIONS
BACKBONE="resnet101"
BACKBONE_WEIGHTS="random"
RPN_ANCHOR_SCALES="8,16,32,64,128"
MAX_GT_INSTANCES=200
BACKBONE_STRIDES="4,8,16,32,64" ## these are for resnet101
RPN_NMS_THRESHOLD=0.7
RPN_TRAIN_ANCHORS_PER_IMAGE=256
TRAIN_ROIS_PER_IMAGE=256
RPN_ANCHOR_RATIOS="0.2,0.3,0.5,1,2,3,4,5"
MODEL_ARC_OPTS="--backbone=$BACKBONE --backbone_weights=$BACKBONE_WEIGHTS --rpn_anchor_scales=$RPN_ANCHOR_SCALES --max_gt_instances=$MAX_GT_INSTANCES --backbone_strides=$BACKBONE_STRIDES --rpn_nms_threshold=$RPN_NMS_THRESHOLD --rpn_train_anchors_per_image=$RPN_TRAIN_ANCHORS_PER_IMAGE --train_rois_per_image=$TRAIN_ROIS_PER_IMAGE --rpn_anchor_ratios=$RPN_ANCHOR_RATIOS "

# - LOSS OPTIONS
MRCNN_BBOX_LOSS_WEIGHT=0.1
MRCNN_CLASS_LOSS_WEIGHT=1.0
MRCNN_MASK_LOSS_WEIGHT=0.1
RPN_BBOX_LOSS_WEIGHT=0.1
RPN_CLASS_LOSS_WEIGHT=1.0
LOSS_OPTS="--rpn_class_loss_weight=$RPN_CLASS_LOSS_WEIGHT --rpn_bbox_loss_weight=$RPN_BBOX_LOSS_WEIGHT --mrcnn_class_loss_weight=$MRCNN_CLASS_LOSS_WEIGHT --mrcnn_bbox_loss_weight=$MRCNN_BBOX_LOSS_WEIGHT --mrcnn_mask_loss_weight=$MRCNN_MASK_LOSS_WEIGHT "

# - CLASS WEIGHTS
CLASS_WEIGHTS_OPTS=""

# - AUGMENTATION
AUG_OPTS="--use_augmentation --augmenter=v1 "

##################################
## RUN
##################################
echo "INFO: Start run ..."
date

python $VENV_DIR/bin/run.py --datalist=$TRAIN_DATA --datalist_val=$CV_DATA \
  --classdict=$CLASS_DICT --classdict_model=$CLASS_DICT \
  --weights=$WEIGHTS --nepochs=$NEPOCHS \
  $RUN_OPTS \
  $PREPROC_OPTS \
  $OPTIMIZER_OPTS \
  $MODEL_ARC_OPTS \
  $LOSS_OPTS \
  $CLASS_WEIGHTS_OPTS \
  $AUG_OPTS \
  train

echo "INFO: End run"
date
```
We trained the Mask R-CNN TF2 model from scratch on the same annotated radio dataset that was previously used to train the Mask R-CNN TF1 model in Riggi+2023 (see Credits for the full reference). Models were trained to detect 5 classes of radio objects:

- 0: spurious
- 1: compact
- 2: extended
- 3: extended-multisland
- 4: flagged

See the original publication for a description of each class.
We provide below the training configuration used to produce the models, together with links to the pre-trained model weights.

Training configuration:

```bash
TRAIN_OPTS="--nepochs=250 "
CLASS_DICT="{\"spurious\":1,\"compact\":2,\"extended\":3,\"extended-multisland\":4,\"flagged\":5}"
DATA_OPTS="--classdict=$CLASS_DICT --classdict_model=$CLASS_DICT "
PREPROC_OPTS="--imgsize=256 --nchannels=3 --zscale_stretch --zscale_contrasts=0.25,0.25,0.25 --clip_data --sigma_clip_low=5 --sigma_clip_up=20 --normalize_minmax "
AUG_OPTS="--use_augmentation --augmenter=v1 "
LOSS_OPTS="--rpn_class_loss_weight=1.0 --rpn_bbox_loss_weight=0.1 --mrcnn_class_loss_weight=1.0 --mrcnn_bbox_loss_weight=0.1 --mrcnn_mask_loss_weight=0.1 "
MODEL_ARC_OPTS="--backbone=resnet101 --backbone_weights=random --rpn_anchor_scales=8,16,32,64,128 --max_gt_instances=200 --backbone_strides=4,8,16,32,64 --rpn_nms_threshold=0.7 --rpn_train_anchors_per_image=256 --train_rois_per_image=256 --rpn_anchor_ratios=0.2,0.3,0.5,1,2,3,4,5 "
OPTIMIZER_OPTS="--optimizer=sgd --learning_rate=5.e-4 --opt_momentum=0.9 --opt_clipnorm=5.0 "
```
Trained models

| Model Base | Img Size | Weights | File Size | Notes |
|---|---|---|---|---|
| resnet101 | 256 | url | 245 MB | |