Lack of multi-worker dataloading severely hinders the training speed

If you run the included train_3D.py and monitor your hardware, you will see something like this:
<img width="1581" height="848" alt="Image" src="https://github.com/user-attachments/assets/fc4c9922-fedb-4d8f-a17e-2d3428dc0efb" />
The average power draw and GPU utilisation are very low. This appears to be because all the image data are loaded in a single thread, so the GPU spends most of its time waiting for the next batch.
I am using a very old GPU (Radeon VII). The problem will only get worse with a faster GPU.
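
To put a number on the bottleneck rather than relying on the utilisation graph, the loop can be timed in two parts: time spent waiting for the next batch and time spent in the training step. The following is a minimal sketch; `profile_loop`, `slow_batches` and `fast_step` are hypothetical names for illustration and are not part of `train_3D.py` or cellmap_data. With a real GPU step, the step function would need to synchronise the device (for example with `torch.cuda.synchronize()`) before returning, otherwise asynchronous kernel launches make the compute time look shorter than it is.

```python
import time


def profile_loop(batches, step_fn):
    """Split wall time into time waiting for data and time spent in step_fn."""
    wait = 0.0
    compute = 0.0
    iterator = iter(batches)
    while True:
        t0 = time.perf_counter()
        try:
            batch = next(iterator)  # blocks while the loader prepares a batch
        except StopIteration:
            break
        t1 = time.perf_counter()
        step_fn(batch)  # stand-in for forward/backward/optimizer step
        t2 = time.perf_counter()
        wait += t1 - t0
        compute += t2 - t1
    return wait, compute


def slow_batches(n, delay=0.02):
    """Hypothetical loader that takes `delay` seconds to produce each batch."""
    for i in range(n):
        time.sleep(delay)
        yield i


def fast_step(batch):
    """Hypothetical training step that is cheap compared to loading."""
    return sum(range(100)) + batch


if __name__ == "__main__":
    wait, compute = profile_loop(slow_batches(5), fast_step)
    print(f"waiting for data: {wait:.3f}s, compute: {compute:.3f}s")
```

If the waiting time dominates, as I expect it does here, the loop is limited by data loading and not by the GPU.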

Unfortunately, it doesn't seem possible to increase the number of workers for data loading. If you set the `num_workers` argument of PyTorch's `DataLoader` to a value greater than 0, you will get this:
```
Training:   0%|          | 0/200 [00:00<?, ?it/s]WARNING:cellmap_data.dataloader:Worker failed to get item: cannot pickle '_thread.lock' object, falling back to main thread
WARNING:cellmap_data.dataloader:Worker failed to get item: cannot pickle '_thread.lock' object, falling back to main thread
Training:   0%|          | 1/200 [00:06<22:23,  6.75s/it]WARNING:cellmap_data.dataloader:Worker failed to get item: cannot pickle '_queue.SimpleQueue' object, falling back to main thread
WARNING:cellmap_data.dataloader:Worker failed to get item: cannot pickle '_queue.SimpleQueue' object, falling back to main thread
Training:   1%|          | 2/200 [00:07<11:16,  3.42s/it]WARNING:cellmap_data.dataloader:Worker failed to get item: cannot pickle '_queue.SimpleQueue' object, falling back to main thread
WARNING:cellmap_data.dataloader:Worker failed to get item: cannot pickle '_queue.SimpleQueue' object, falling back to main thread
Training:   2%|▏         | 3/200 [00:08<07:40,  2.34s/it]WARNING:cellmap_data.dataloader:Worker failed to get item: cannot pickle '_queue.SimpleQueue' object, falling back to main thread
WARNING:cellmap_data.dataloader:Worker failed to get item: cannot pickle '_queue.SimpleQueue' object, falling back to main thread

```
Not only is a warning emitted for every training step, but every item also falls back to the main thread, so the bottleneck remains and there is no speed increase.
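
For context, this kind of error can be reproduced without cellmap_data: any object that holds a `threading.Lock` fails to pickle, and objects handed to worker processes generally have to be pickled. The sketch below shows the failure and one common workaround, which is to drop the lock in `__getstate__` and recreate it in `__setstate__`. `LockedSource` and `PicklableSource` are hypothetical classes for illustration; I have not confirmed which object inside cellmap_data actually holds the lock or the `SimpleQueue`, so this is not a patch for the library.

```python
import pickle
import threading


class LockedSource:
    """Hypothetical stand-in for a dataset object that owns a thread lock."""

    def __init__(self, path):
        self.path = path
        self._lock = threading.Lock()

    def read(self, index):
        with self._lock:
            return (self.path, index)


class PicklableSource(LockedSource):
    """Same object, but the lock is dropped on pickling and rebuilt afterwards."""

    def __getstate__(self):
        state = self.__dict__.copy()
        state.pop("_lock", None)  # locks cannot be pickled, so leave it out
        return state

    def __setstate__(self, state):
        self.__dict__.update(state)
        self._lock = threading.Lock()  # each process gets its own fresh lock


def try_pickle(obj):
    """Return None if obj pickles cleanly, otherwise the error message."""
    try:
        pickle.dumps(obj)
        return None
    except TypeError as exc:
        return str(exc)


if __name__ == "__main__":
    print("LockedSource:", try_pickle(LockedSource("data.zarr")))
    print("PicklableSource:", try_pickle(PicklableSource("data.zarr")))
```

The same idea applies to executors, queues and open file handles: keep them out of the pickled state and create them lazily inside each worker.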

The environment settings are the same as in #175.

You can replicate the issue with the simple script below:
[minimal_script_pure_pytorch.py](https://github.com/user-attachments/files/23785449/minimal_script_pure_pytorch.py)
