Error during training for custom data #282

Open
@rocker12121

Hello

I am trying to train the model on my custom dataset of just 200-300 images. Our dataset generation is still in progress, so for now I am only laying the groundwork for training this model on my own data. I have a single GPU for training and I want to use MobileNet.

The command I run is:
python3 train.py --gpus 0 --cfg config/custom-mobilenetv2dilated-c1_deepsup.yaml

but I encounter the following error. Can you please help me out with this?

[2023-04-13 01:45:10,405 INFO train.py line 240 26317] Loaded configuration file config/custom-mobilenetv2dilated-c1_deepsup.yaml
[2023-04-13 01:45:10,405 INFO train.py line 241 26317] Running with config:
DATASET:
  imgMaxSize: 1000
  imgSizes: (300, 375, 450, 525, 600)
  list_train: ./data/training.odgt
  list_val: ./data/validation.odgt
  num_class: 3
  padding_constant: 8
  random_flip: True
  root_dataset: ./data/
  segm_downsampling_rate: 8
DIR: ckpt/custom-mobilenetv2dilated-c1_deepsup
MODEL:
  arch_decoder: c1_deepsup
  arch_encoder: mobilenetv2dilated
  fc_dim: 320
  weights_decoder:
  weights_encoder:
TEST:
  batch_size: 1
  checkpoint: epoch_20.pth
  result: ./
TRAIN:
  batch_size_per_gpu: 3
  beta1: 0.9
  deep_sup_scale: 0.4
  disp_iter: 20
  epoch_iters: 5000
  fix_bn: False
  lr_decoder: 0.02
  lr_encoder: 0.02
  lr_pow: 0.9
  num_epoch: 20
  optim: SGD
  seed: 304
  start_epoch: 0
  weight_decay: 0.0001
  workers: 16
VAL:
  batch_size: 1
  checkpoint: epoch_20.pth
  visualize: False
[2023-04-13 01:45:10,405 INFO train.py line 246 26317] Outputing checkpoints to: ckpt/custom-mobilenetv2dilated-c1_deepsup

samples: 135

1 Epoch = 5000 iters
Traceback (most recent call last):
  File "train.py", line 273, in <module>
    main(cfg, gpus)
  File "train.py", line 200, in main
    train(segmentation_module, iterator_train, optimizers, history, epoch+1, cfg)
  File "train.py", line 32, in train
    batch_data = next(iterator)
  File "/home/e/anaconda3/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 435, in __next__
    data = self._next_data()
  File "/home/e/anaconda3/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 1085, in _next_data
    return self._process_data(data)
  File "/home/e/anaconda3/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 1111, in _process_data
    data.reraise()
  File "/home/e/anaconda3/lib/python3.8/site-packages/torch/_utils.py", line 428, in reraise
    raise self.exc_type(msg)
AssertionError: Caught AssertionError in DataLoader worker process 0.
Original Traceback (most recent call last):
  File "/home/e/anaconda3/lib/python3.8/site-packages/torch/utils/data/_utils/worker.py", line 198, in _worker_loop
    data = fetcher.fetch(index)
  File "/home/e/anaconda3/lib/python3.8/site-packages/torch/utils/data/_utils/fetch.py", line 44, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/home/e/anaconda3/lib/python3.8/site-packages/torch/utils/data/_utils/fetch.py", line 44, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/home/e/semantic-segmentation-pytorch-copy/mit_semseg/dataset.py", line 162, in __getitem__
    assert(segm.mode == "L")
AssertionError
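
The assertion at the bottom of the traceback appears to check the image mode of the annotation mask, and "L" is the mode Pillow reports for 8-bit single-channel grayscale. One way to narrow this down is to scan the masks listed in the training list and report any that are stored in another layout (RGB, RGBA, palette, 16-bit). The following is only a minimal sketch under assumptions not confirmed in this post: the masks are PNG files, and each line of the .odgt list is a JSON object whose `fpath_segm` key is a path relative to `root_dataset`. The helper names `png_format` and `find_non_grayscale_masks` are hypothetical. The check is deliberately conservative and flags everything that is not exactly 8-bit grayscale.

```python
import json
import struct
from pathlib import Path

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
# Colour type codes defined by the PNG specification (IHDR chunk).
COLOR_TYPES = {0: "grayscale", 2: "RGB", 3: "palette",
               4: "grayscale+alpha", 6: "RGBA"}


def png_format(path):
    """Return (bit_depth, colour_type_name) read from a PNG file's IHDR chunk."""
    with open(path, "rb") as f:
        # 8-byte signature, 4-byte chunk length, 4-byte chunk type,
        # then width (4), height (4), bit depth (1), colour type (1).
        header = f.read(26)
    if (len(header) < 26 or header[:8] != PNG_SIGNATURE
            or header[12:16] != b"IHDR"):
        raise ValueError(f"{path} is not a PNG file")
    bit_depth, color_type = struct.unpack(">BB", header[24:26])
    return bit_depth, COLOR_TYPES.get(color_type, f"unknown({color_type})")


def find_non_grayscale_masks(odgt_path, root_dataset):
    """List (mask_path, bit_depth, colour_type) for masks that are not 8-bit grayscale.

    Assumes one JSON object per line with an "fpath_segm" key (an assumption
    about the list format, not something stated in this post).
    """
    offenders = []
    for line in Path(odgt_path).read_text().splitlines():
        if not line.strip():
            continue  # skip blank lines
        record = json.loads(line)
        mask_path = Path(root_dataset) / record["fpath_segm"]
        bit_depth, color_type = png_format(mask_path)
        if (bit_depth, color_type) != (8, "grayscale"):
            offenders.append((str(mask_path), bit_depth, color_type))
    return offenders


# Example call with the paths from the config above:
# for path, depth, kind in find_non_grayscale_masks("./data/training.odgt", "./data/"):
#     print(f"{path}: {depth}-bit {kind}")
```

If this reports any files, re-exporting those masks as single-channel 8-bit images whose pixel values are the class indices would be the natural next thing to try, though whether that matches the label encoding the training code expects is not confirmed here.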
