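One option still to be explored: copy the pre-built conda environment from the shared location to each node's local storage at the start of each job. In principle, running the environment from node-local disk should avoid the shared-filesystem bottleneck, which mostly comes from many nodes opening the environment's many small files at the same time. The cost is a copy at job start, so whether this is a net win depends on the environment's size and on how long the job runs.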
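
As a starting point for that experiment, the staging step could look roughly like the sketch below. It is only an illustration under several assumptions: the function name `stage_env`, the `.staged` marker file, and the temporary-directory-then-rename scheme are all hypothetical choices, not an established procedure. It also assumes the environment still works after being copied to a different path. A plain copy of a conda environment can break paths that are hard-coded inside it, so a relocatable environment (for example one prepared with conda-pack) is probably needed.

```python
import os
import shutil
import tempfile
from pathlib import Path


def stage_env(shared_env: Path, local_root: Path) -> Path:
    """Copy shared_env into local_root once and return the node-local path.

    Hypothetical helper: the name and the ".staged" marker are illustrative.
    """
    target = local_root / shared_env.name
    marker = target / ".staged"
    if marker.exists():
        # Already staged by an earlier task on this node.
        return target

    local_root.mkdir(parents=True, exist_ok=True)
    # Copy into a temporary sibling directory first, so a half-finished
    # copy is never mistaken for a complete environment.
    tmp = Path(tempfile.mkdtemp(prefix=shared_env.name + ".", dir=local_root))
    try:
        shutil.copytree(shared_env, tmp, symlinks=True, dirs_exist_ok=True)
        (tmp / ".staged").touch()
        # Publish the finished copy; this fails if another task on the
        # same node has already published its own copy.
        os.rename(tmp, target)
    except OSError:
        shutil.rmtree(tmp, ignore_errors=True)
        if not marker.exists():
            raise  # a real failure, not a lost race
    return target
```

A job script would call this once before activating the environment, for example `stage_env(Path("/shared/envs/myenv"), Path(os.environ.get("TMPDIR", "/tmp")))`, where both paths are placeholders for the actual shared location and the node's local scratch directory. Copying a single archive and unpacking it locally may well be faster than copying the directory tree file by file, since it replaces many small reads on the shared filesystem with one large read; that variant is worth measuring as well.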