Description
Hello
I am trying to train the model on my custom dataset of just 200-300 images. Our dataset generation is still in progress, so for now I am just setting up the grounds to train this model on my custom data. I have a single GPU for training and I want to use MobileNet.
The command I run is:
python3 train.py --gpus 0 --cfg config/custom-mobilenetv2dilated-c1_deepsup.yaml
but I encounter the following error. Could you please help me out with this?
[2023-04-13 01:45:10,405 INFO train.py line 240 26317] Loaded configuration file config/custom-mobilenetv2dilated-c1_deepsup.yaml
[2023-04-13 01:45:10,405 INFO train.py line 241 26317] Running with config:
DATASET:
imgMaxSize: 1000
imgSizes: (300, 375, 450, 525, 600)
list_train: ./data/training.odgt
list_val: ./data/validation.odgt
num_class: 3
padding_constant: 8
random_flip: True
root_dataset: ./data/
segm_downsampling_rate: 8
DIR: ckpt/custom-mobilenetv2dilated-c1_deepsup
MODEL:
arch_decoder: c1_deepsup
arch_encoder: mobilenetv2dilated
fc_dim: 320
weights_decoder:
weights_encoder:
TEST:
batch_size: 1
checkpoint: epoch_20.pth
result: ./
TRAIN:
batch_size_per_gpu: 3
beta1: 0.9
deep_sup_scale: 0.4
disp_iter: 20
epoch_iters: 5000
fix_bn: False
lr_decoder: 0.02
lr_encoder: 0.02
lr_pow: 0.9
num_epoch: 20
optim: SGD
seed: 304
start_epoch: 0
weight_decay: 0.0001
workers: 16
VAL:
batch_size: 1
checkpoint: epoch_20.pth
visualize: False
[2023-04-13 01:45:10,405 INFO train.py line 246 26317] Outputing checkpoints to: ckpt/custom-mobilenetv2dilated-c1_deepsup
samples: 135
1 Epoch = 5000 iters
Traceback (most recent call last):
File "train.py", line 273, in <module>
main(cfg, gpus)
File "train.py", line 200, in main
train(segmentation_module, iterator_train, optimizers, history, epoch+1, cfg)
File "train.py", line 32, in train
batch_data = next(iterator)
File "/home/e/anaconda3/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 435, in __next__
data = self._next_data()
File "/home/e/anaconda3/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 1085, in _next_data
return self._process_data(data)
File "/home/e/anaconda3/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 1111, in _process_data
data.reraise()
File "/home/e/anaconda3/lib/python3.8/site-packages/torch/_utils.py", line 428, in reraise
raise self.exc_type(msg)
AssertionError: Caught AssertionError in DataLoader worker process 0.
Original Traceback (most recent call last):
File "/home/e/anaconda3/lib/python3.8/site-packages/torch/utils/data/_utils/worker.py", line 198, in _worker_loop
data = fetcher.fetch(index)
File "/home/e/anaconda3/lib/python3.8/site-packages/torch/utils/data/_utils/fetch.py", line 44, in fetch
data = [self.dataset[idx] for idx in possibly_batched_index]
File "/home/e/anaconda3/lib/python3.8/site-packages/torch/utils/data/_utils/fetch.py", line 44, in <listcomp>
data = [self.dataset[idx] for idx in possibly_batched_index]
File "/home/e/semantic-segmentation-pytorch-copy/mit_semseg/dataset.py", line 162, in __getitem__
assert(segm.mode == "L")
AssertionError
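In case it helps narrow this down: the assertion that fires, assert(segm.mode == "L"), means the dataset loader expects every annotation mask to open as a single-channel grayscale image (PIL mode "L"), and at least one of my masks does not. Below is a quick diagnostic sketch I put together to list the offending files; the mask directory path is an assumption here and should be adjusted to wherever the .odgt entries actually point.

```python
# Diagnostic sketch: find annotation masks whose PIL mode is not "L",
# which is what triggers the AssertionError in mit_semseg/dataset.py:162.
import os
from PIL import Image

def find_bad_masks(mask_dir):
    """Return (filename, mode) pairs for masks whose PIL mode is not 'L'."""
    bad = []
    for name in sorted(os.listdir(mask_dir)):
        if not name.lower().endswith((".png", ".bmp", ".tif", ".tiff")):
            continue
        with Image.open(os.path.join(mask_dir, name)) as im:
            if im.mode != "L":
                bad.append((name, im.mode))
    return bad

def rewrite_as_grayscale(path):
    """Rewrite a mask in place as single-channel grayscale.
    NOTE: only safe if the mask already stores class IDs in one channel;
    converting a color-coded (RGB/palette) mask this way scrambles labels."""
    with Image.open(path) as im:
        im.convert("L").save(path)

if __name__ == "__main__":
    # "./data/annotations" is an assumed location, not from the repo config.
    for name, mode in find_bad_masks("./data/annotations"):
        print(f"{name}: mode={mode}")
```

If the masks turn out to be RGB or palette images encoding classes as colors, they would need to be remapped to per-pixel class IDs rather than just converted, so the rewrite helper above is only a sketch for the already-single-channel case.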