
Commit 9faf2f8

[utils] Rewrite slurm.pl from scratch
The new file calls sbatch, passing the batch file on stdin, and waits for the script to complete without doing its own polling: it passes the --wait switch to sbatch instead.

Slurm has a hardcoded polling schedule for waiting. Specifically, it first checks whether the job has completed after 2s, then increases the wait time by 2s on each check until it maxes out at 10s. Because of this, even very short Kaldi batches often incur an extra wait delay of about 5s on average. The following patch reduces the poll time to 1s without growth:

https://github.com/burrmill/burrmill/blob/v0.5-beta.2/lib/build/slurm/sbatch.19.patch

It has been tested to apply cleanly up to Slurm v20.1, and is unlikely to break in the future. Please open a ticket in the https://github.com/burrmill/burrmill repository if your Slurm source does not patch.

You do not need administrative access to the cluster; you can just build your own version of sbatch and place it on the PATH. Internally it uses only Slurm RPC calls, and does not require many dependencies.
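The arithmetic behind the delay can be sketched with a short simulation. This is an illustrative model only, assuming the schedule is exactly as described above (sleep 2s, then 4s, 6s, 8s, then 10s repeatedly, with the job state checked after each sleep). The function and parameter names are hypothetical and are not taken from the Slurm source or from slurm.pl.

```python
def poll_times(first, step, cap):
    """Yield the cumulative times (in seconds) at which the job state is checked.

    Assumed model: sleep `first` seconds, check, then grow the sleep by `step`
    seconds after every check until it reaches `cap`.
    """
    sleep, now = first, 0
    while True:
        now += sleep
        yield now
        sleep = min(sleep + step, cap)


def extra_wait(job_seconds, first=2, step=2, cap=10):
    """Return how long after the job finishes the next check notices it."""
    for t in poll_times(first, step, cap):
        if t >= job_seconds:
            return t - job_seconds


if __name__ == "__main__":
    # Stock schedule as described: checks at 2s, 6s, 12s, so a 7s job waits until 12s.
    print(extra_wait(7))                          # 5
    # Patched schedule as described: a fixed 1s poll with no growth.
    print(extra_wait(7, first=1, step=0, cap=1))  # 0
```

Under this model a 7s job is only noticed at the 12s check, whereas with a fixed 1s poll the extra wait is always below one second.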
1 parent e5cb693 commit 9faf2f8

1 file changed: +330 -567 lines changed

