-
Notifications
You must be signed in to change notification settings - Fork 13
Open
Labels
questionFurther information is requestedFurther information is requested
Description
According to https://github.com/basnijholt/adaptive-scheduler/blob/master/adaptive_scheduler/scheduler.py#L594
the automatic requeing by slurm is disabled in adaptive-scheduled jobs. I ran into an issue, where the node that was hosting the job faltered and the job hung in preparation state for a while (50 min). I was able to fix it by requeing the job (one can override --no-requeue with scontrol later), and adaptive-scheduler happily picked up the job and showed it as running.
So, I was wondering what's the reason behind forced --no-requeue?
Metadata
Metadata
Assignees
Labels
questionFurther information is requestedFurther information is requested