Skip to content

Very slow scheduling #94

@rimorob

Description

@rimorob

I am sorry to not have a reproducible example yet. My code base is very large and was running just fine until the job size became small. So I'll make a reproducible example after hearing some suggestions on what to test. In my case, the jobs are rather small - 10s each. The problem I'm seeing is they don't get scheduled very quickly. In fact, at any given time only one slurm job, or at most two, are running (the machine they are running on can run ~15 jobs by ram and cpu requirements). I'm trying to run 60 chunks, and to resolve this problem I set scheduling to 5, which did bump up the number of running jobs to 2-3. However, the main problem is that the chunks seem to take 10-15 seconds to launch, and I don't know what I changed - a few days ago, with larger jobs - this was not the case. So to my specific questions, before I try to generate a small reproducible example:

  1. Is there a verbose/trace flag? I can't see it in the documentation
  2. Is there anything to pay attention to beyond the "scheduling" option? Any hunch for what may be causing this issue for me to start looking at in generating an example?

Metadata

Metadata

Assignees

No one assigned

    Labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions