GPU is available
auto_bg_thresh: 0
c_ratio: 0.3
class_weights: [1, 10, 10]
crop_box: None
crop_size: [16, 128, 128]
dataset_name: elephant-demo
debug: False
device: cuda
false_weight: 3
is_livemode: False
is_pad: False
keep_axials: (True, True, True, False)
log_dir: /workspace/logs/seg_log
lr: 5e-05
model_path: /workspace/models/seg.pth
n_crops: 3
n_epochs: 3
output_prediction: False
p_thresh: None
patch_size: None
r_max: None
r_min: None
rotation_angle: 0
scale_factor_base: 0
scales: [2.48, 0.3119629, 0.3119629]
timepoint: 0
use_median: None
zpath_input: /workspace/datasets/elephant-demo/imgs.zarr
zpath_seg_label: /workspace/datasets/elephant-demo/seg_labels.zarr
zpath_seg_label_vis: /workspace/datasets/elephant-demo/seg_labels_vis.zarr
zpath_seg_output: /workspace/datasets/elephant-demo/seg_outputs.zarr
[2021-05-11 18:00:15,663] ERROR in app: Exception on /train/seg [POST]
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/site-packages/flask/app.py", line 2447, in wsgi_app
    response = self.full_dispatch_request()
  File "/usr/local/lib/python3.7/site-packages/flask/app.py", line 1952, in full_dispatch_request
    rv = self.handle_user_exception(e)
  File "/usr/local/lib/python3.7/site-packages/flask/app.py", line 1821, in handle_user_exception
    reraise(exc_type, exc_value, tb)
  File "/usr/local/lib/python3.7/site-packages/flask/_compat.py", line 39, in reraise
    raise value
  File "/usr/local/lib/python3.7/site-packages/flask/app.py", line 1950, in full_dispatch_request
    rv = self.dispatch_request()
  File "/usr/local/lib/python3.7/site-packages/flask/app.py", line 1936, in dispatch_request
    return self.view_functions[rule.endpoint](**req.view_args)
  File "./main.py", line 544, in train_seg
    config.device)
  File "/usr/local/lib/python3.7/site-packages/elephant/models.py", line 313, in load_seg_models
    checkpoint = torch.load(model_path)
  File "/usr/local/lib/python3.7/site-packages/torch/serialization.py", line 525, in load
    with _open_file_like(f, 'rb') as opened_file:
  File "/usr/local/lib/python3.7/site-packages/torch/serialization.py", line 212, in _open_file_like
    return _open_file(name_or_buffer, mode)
  File "/usr/local/lib/python3.7/site-packages/torch/serialization.py", line 193, in __init__
    super(_open_file, self).__init__(open(name, mode))
FileNotFoundError: [Errno 2] No such file or directory: '/workspace/models/seg.pth'
[pid: 10865|app: 0|req: 1/1] 127.0.0.1 () {40 vars in 519 bytes} [Tue May 11 18:00:15 2021] POST /train/seg => generated 290 bytes in 46 msecs (HTTP/1.1 500) 2 headers in 99 bytes (1 switches on core 0)
127.0.0.1 - - [11/May/2021:18:00:15 +0000] "POST /train/seg HTTP/1.1" 500 290 "-" "unirest-java/3.1.00" "-"
[pid: 10864|app: 0|req: 1/2] 127.0.0.1 () {40 vars in 517 bytes} [Tue May 11 18:00:38 2021] POST /reset/seg => generated 40 bytes in 4 msecs (HTTP/1.1 500) 2 headers in 90 bytes (1 switches on core 0)
127.0.0.1 - - [11/May/2021:18:00:38 +0000] "POST /reset/seg HTTP/1.1" 500 40 "-" "unirest-java/3.1.00" "-"
[pid: 10864|app: 0|req: 2/3] 127.0.0.1 () {40 vars in 517 bytes} [Tue May 11 18:00:47 2021] POST /reset/seg => generated 40 bytes in 0 msecs (HTTP/1.1 500) 2 headers in 90 bytes (1 switches on core 0)
127.0.0.1 - - [11/May/2021:18:00:47 +0000] "POST /reset/seg HTTP/1.1" 500 40 "-" "unirest-java/3.1.00" "-"
It would be nice if a missing model file were handled in a way that does not crash the server. As it stands, the Elephant client reports that training is in progress and no further action is possible; the subsequent POST /reset/seg requests also return HTTP 500.
(see title)
The error is in the server log above.