Description
There is a limit on how many requests can actually pull jobs from the job queue at any given time. The remaining requests are kept pending and will not see any jobs, even when jobs are ready to be worked on.
With the current model, each worker issues two job requests in parallel: one for depsolve jobs and one for osbuild jobs. As a result, the queue can contain depsolve jobs and no osbuild jobs while only requests for osbuild jobs are getting through, so nothing happens. We do retry job requests after a brief timeout, so we are likely to make progress eventually, but we are currently seeing significant delays.
One way to mitigate this might be to change how workers request jobs: keep only one request outstanding at a time, but make the requested job types dynamic, depending on which jobs the worker is currently running. We should still run depsolve jobs and osbuild jobs in parallel.
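To make the proposal concrete, here is a rough sketch of the selection logic, written in Python purely for illustration. The names `job_types_to_request` and `FakeQueue` are hypothetical and do not correspond to the real worker or queue API. The sketch assumes a worker runs at most one job of each type at a time.

```python
from typing import Iterable, List

# Job type names taken from the description above.
ALL_JOB_TYPES = ("depsolve", "osbuild")


def job_types_to_request(running: Iterable[str]) -> List[str]:
    """Return the job types a worker should ask for in its single request.

    A type is requested only if the worker is not already running a job of
    that type, so one depsolve job and one osbuild job can still run in
    parallel. (Hypothetical helper, not part of the existing code.)
    """
    busy = set(running)
    return [t for t in ALL_JOB_TYPES if t not in busy]


class FakeQueue:
    """In-memory stand-in for the job queue (an assumption, not the real API)."""

    def __init__(self, jobs):
        self.jobs = list(jobs)  # pending job types, oldest first

    def dequeue(self, accepted_types):
        """Return the oldest pending job whose type is accepted, else None."""
        for i, job_type in enumerate(self.jobs):
            if job_type in accepted_types:
                return self.jobs.pop(i)
        return None


queue = FakeQueue(["depsolve", "depsolve", "osbuild"])
running = set()

# Idle worker: one request accepting both types picks up the oldest job.
first = queue.dequeue(job_types_to_request(running))
running.add(first)

# While that job runs, the next single request only asks for the other type.
second = queue.dequeue(job_types_to_request(running))
running.add(second)

# Both slots are busy, so there is nothing left to request.
idle_request = job_types_to_request(running)
```

Because an idle worker's single request accepts every type it has capacity for, a request that gets through can always pick up whatever is in the queue, which should avoid the case where only osbuild requests are served while only depsolve jobs are pending.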
cc @Gundersanne @ondrejbudai