Open
Description
Runs die partway through the DeepProfiler section after 2-3 new jobs have been picked up. I suspect the issue is that DeepProfiler is somehow not releasing the GPU. If we're batching at a larger level (i.e. plate), this is probably fine because we can have one machine per batch, but it's far from ideal.
[ ] Investigate whether it always fails at the exact same place, to see if that gives clues
[ ] See if it's something we can fix on DeepProfiler's side, which would be ideal
[ ] Otherwise, see if we can add a subprocess command to somehow release the GPU
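
A rough sketch of the subprocess idea, in case it helps whoever picks this up. It relies on the fact that the driver frees a process's GPU memory when that process exits, so running each job in its own child process should hand the GPU back between jobs. The helper names `run_isolated` and `gpu_memory_used_mib` are made up for illustration, and the command in the usage example is a placeholder, not the real DeepProfiler invocation. The `nvidia-smi` query is only there to log usage before and after, and is skipped if `nvidia-smi` isn't on the machine.

```python
import shutil
import subprocess
import sys


def gpu_memory_used_mib():
    """Return used memory per GPU in MiB, or None if nvidia-smi is unavailable."""
    if shutil.which("nvidia-smi") is None:
        return None
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=memory.used", "--format=csv,noheader,nounits"],
        capture_output=True,
        text=True,
        check=False,
    )
    if result.returncode != 0:
        return None
    # Keep only purely numeric lines; some devices report "[N/A]".
    values = [line.strip() for line in result.stdout.splitlines()]
    return [int(v) for v in values if v.isdigit()]


def run_isolated(cmd):
    """Run one job in its own child process.

    Hypothetical helper: any GPU memory the child held is released by the
    driver once the child exits, so the next job starts from a clean state.
    Returns (return code, GPU usage before, GPU usage after).
    """
    before = gpu_memory_used_mib()
    completed = subprocess.run(cmd, check=False)
    after = gpu_memory_used_mib()
    return completed.returncode, before, after


if __name__ == "__main__":
    # Placeholder command; the actual DeepProfiler call for the job goes here.
    placeholder_cmd = [sys.executable, "-c", "print('job placeholder')"]
    returncode, before, after = run_isolated(placeholder_cmd)
    print(f"exit={returncode} gpu_before={before} gpu_after={after}")
```

If the "after" numbers stay high once the child has exited, then something other than the DeepProfiler process is holding the GPU, which would also be a useful clue for the first item.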