Custom Dataset Low mAP and No Convergence (with images!) #313

I am using custom data. I have modified the callbacks in train.py so that they monitor the loss on the validation set rather than on the training set. The original learning rate was 1e-4.
```
def create_callbacks(saved_weights_name, tensorboard_logs, model_to_save):
    makedirs(tensorboard_logs)

    early_stop = EarlyStopping(
        monitor     = 'val_loss',
        min_delta   = 0.001,
        patience    = 50,
        mode        = 'min',
        verbose     = 1
    )
    checkpoint = CustomModelCheckpoint(
        model_to_save   = model_to_save,
        filepath        = saved_weights_name,# + '{epoch:02d}.h5',
        monitor         = 'val_loss',
        verbose         = 1,
        save_best_only  = True,
        mode            = 'min',
        period          = 1
    )
    reduce_on_plateau = ReduceLROnPlateau(
        monitor  = 'val_loss',
        factor   = 0.1,
        patience = 10,
        verbose  = 1,
        mode     = 'min',
        epsilon  = 0.01,
        cooldown = 0,
    )
    tensorboard = CustomTensorBoard(
        log_dir                = tensorboard_logs,
        write_graph            = True,
        write_images           = True,
    )
    return [early_stop, checkpoint, reduce_on_plateau, tensorboard]
```
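
To get a feel for how the two patience settings in these callbacks interact, here is a small pure-Python sketch. It is a simplified model for illustration, not the Keras implementation: it assumes the validation loss improves once in the first epoch and then never again, and it ignores `min_delta`, `epsilon`, and cooldown. `simulate_plateau` is a hypothetical helper, and the default of 203 epochs is `nb_epochs` plus `warmup_epochs` from the config further down.

```python
def simulate_plateau(initial_lr=1e-4, factor=0.1, lr_patience=10,
                     stop_patience=50, max_epochs=203):
    """Simplified model: val_loss improves at epoch 1, then never again."""
    lr = initial_lr
    lr_wait = 0      # epochs without improvement since the last LR cut
    stop_wait = 0    # epochs without improvement since the best epoch
    reductions = []  # (epoch, new_lr) pairs
    for epoch in range(2, max_epochs + 1):
        lr_wait += 1
        stop_wait += 1
        if stop_wait >= stop_patience:
            # Early stopping ends the run here.
            return epoch, reductions
        if lr_wait >= lr_patience:
            # Plateau detected: scale the learning rate down.
            lr *= factor
            lr_wait = 0
            reductions.append((epoch, lr))
    return None, reductions


stopped_at, reductions = simulate_plateau()
print("stopped at epoch", stopped_at)
for epoch, lr in reductions:
    print(f"epoch {epoch}: lr -> {lr:.0e}")
```

Under these assumptions the learning rate is cut four times (down to about 1e-8) before early stopping ends the run, so most of a plateau would be spent at very small learning rates.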


```
history = train_model.fit_generator(
          generator        = train_generator, 
          validation_data = valid_generator,
          steps_per_epoch  = len(train_generator) * config['train']['train_times'], 
          epochs           = config['train']['nb_epochs'] + config['train']['warmup_epochs'], 
          verbose          = 2 if config['train']['debug'] else 1,
          callbacks        = callbacks, 
          workers          = 4,
          max_queue_size   = 8
      )
```
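
For reference, the `steps_per_epoch` and `epochs` arguments in this call can be worked out from the dataset size and config values given later in this post. This is a rough sketch that assumes `len(train_generator)` is the number of batches per pass, rounded up; I have not confirmed that against the generator code.

```python
import math

train_images = 7140   # augmented training set size
batch_size = 16       # config["train"]["batch_size"]
train_times = 1       # config["train"]["train_times"]
nb_epochs = 200       # config["train"]["nb_epochs"]
warmup_epochs = 3     # config["train"]["warmup_epochs"]

# Assumption: the generator length is the number of batches, rounded up.
batches_per_pass = math.ceil(train_images / batch_size)
steps_per_epoch = batches_per_pass * train_times
total_epochs = nb_epochs + warmup_epochs

print("steps per epoch:", steps_per_epoch)
print("maximum epochs:", total_epochs)
print("maximum weight updates:", steps_per_epoch * total_epochs)
```

With these numbers that is 447 steps per epoch for at most 203 epochs, unless early stopping ends the run sooner.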
Here is a plot of loss, blue being validation loss and red being training loss:
![loss](https://user-images.githubusercontent.com/84977346/120856838-622a6a80-c54e-11eb-87c4-1648ca4e0380.png)

And here is a sample image of my dataset:

![sample_img](https://user-images.githubusercontent.com/84977346/120857304-03b1bc00-c54f-11eb-99f1-8ca28796809a.png)

In the image, matching numbers mark the top-left and bottom-right corners of a bounding box. This example is fairly messy, but other people who have used this dataset claim to reach 70%+ mAP with a default single shot detector and no modifications. My maximum mAP was 0.2. I am using the default YOLOv3 and have not made any changes to the model configuration.
Here is my config:
```
{
    "model" : {
        "min_input_size":       352,
        "max_input_size":       352,
        "anchors":              [13,58, 18,14, 30,31, 47,55, 50,128, 75,24, 95,69, 126,121, 161,208],
        "labels":               ["nlb"]
    },

    "train": {
        "train_image_folder":   "/content/drive/MyDrive/yolo/nlb_train_image/",
        "train_annot_folder":   "/content/drive/MyDrive/yolo/nlb_train_annot/",
        "cache_name":           "nlb_train.pkl",
        "pretrained_weights":   "",

        "train_times":          1,
        "batch_size":           16,
        "learning_rate":        1e-4,
        "nb_epochs":            200,
        "warmup_epochs":        3,
        "ignore_thresh":        0.4,
        "gpus":                 "0",

        "grid_scales":          [1,1,1],
        "obj_scale":            5,
        "noobj_scale":          1,
        "xywh_scale":           1,
        "class_scale":          1,

        "tensorboard_dir":      "logs",
        "saved_weights_name":   "nlb.h5",
        "debug":                true
    },

    "valid": {
        "valid_image_folder":   "/content/drive/MyDrive/yolo/nlb_valid_image/",
        "valid_annot_folder":   "/content/drive/MyDrive/yolo/nlb_valid_image/",
        "cache_name":           "nlb_valid.pkl",

        "valid_times":          1
    }
}
```
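
A config like this can be sanity-checked before training. The sketch below is only an illustration: `find_shared_folders` and `anchor_pairs` are hypothetical helpers, not part of the repository, and the anchors are assumed to be flat (width, height) pairs. It copies only the keys it needs from the config above.

```python
import json


def find_shared_folders(config):
    """Return the sections whose image and annotation folders are the same path."""
    shared = []
    for section in ("train", "valid"):
        image_dir = config[section][f"{section}_image_folder"]
        annot_dir = config[section][f"{section}_annot_folder"]
        if image_dir == annot_dir:
            shared.append(section)
    return shared


def anchor_pairs(flat_anchors):
    """Group the flat anchor list into (width, height) pairs (assumed order)."""
    if len(flat_anchors) % 2 != 0:
        raise ValueError("anchor list must contain an even number of values")
    return list(zip(flat_anchors[0::2], flat_anchors[1::2]))


# Only the keys needed for the checks, copied from the config above.
config = json.loads("""
{
    "model": {
        "anchors": [13,58, 18,14, 30,31, 47,55, 50,128, 75,24, 95,69, 126,121, 161,208]
    },
    "train": {
        "train_image_folder": "/content/drive/MyDrive/yolo/nlb_train_image/",
        "train_annot_folder": "/content/drive/MyDrive/yolo/nlb_train_annot/"
    },
    "valid": {
        "valid_image_folder": "/content/drive/MyDrive/yolo/nlb_valid_image/",
        "valid_annot_folder": "/content/drive/MyDrive/yolo/nlb_valid_image/"
    }
}
""")

print("sections with identical folders:", find_shared_folders(config))
print("number of anchors:", len(anchor_pairs(config["model"]["anchors"])))
```

For the config as posted, the check reports the `valid` section, because `valid_annot_folder` and `valid_image_folder` are the same path. The sketch cannot tell whether that is a typo in the post or in the file actually used for training.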

I augmented the training set from 612 to 7140 images; the validation set has 152 images and the test set has 141. I have tried increasing the size of the validation set but still get poor results. I also tried initial learning rates from 1e-3 down to 1e-5, again with poor results. If anyone is willing to test it themselves, here are links to the dataset:

train images
https://drive.google.com/drive/folders/1qmkCFNNuAtsOtFzQpmEgZyaN-DeSOuyr?usp=sharing
train annot
https://drive.google.com/drive/folders/1-5SZAcuOXz1Rt_eZ19RtVvgMXr8aId6j?usp=sharing
valid images
https://drive.google.com/drive/folders/1Xxz28iVaxXMIDA_UVQPKOAs0KfU4KtVT?usp=sharing
valid annot
https://drive.google.com/drive/folders/1Ml7DjEBwS50eUaMO378zYGdZTQsdfUNB?usp=sharing
test images
https://drive.google.com/drive/folders/1tEiW-Xx74kWua810sR5e0OyWvPABhJ1z?usp=sharing
test annot
https://drive.google.com/drive/folders/12jvZpb06tRopSpYLUSoiONOlvBWPV1ud?usp=sharing
train.pkl file
https://drive.google.com/file/d/1-5a1JGGYr6xP8ydCJOBOfhTPOym93u62/view?usp=sharing
valid.pkl file
https://drive.google.com/file/d/1-86kPCtYQKmYVzxyFZfQoNIf03_YdwDA/view?usp=sharing

I have spent an embarrassing amount of time trying to solve this problem.
