Description
Describe the bug
On Windows platforms, hosts cannot pick up jobs because the available processors are detected incorrectly.
The following error is constantly raised:
CoreReservationFailureException: Not launching, insufficient hyperthreading cores to reserve based on frameCores (0 < 36)
Details:
The reserveHT method of the RQD Machine class relies on a __procs_by_physid_and_coreid attribute that is not filled on Windows. It is only filled on Linux.
See:
- reserveHT relying on __procs_by_physid_and_coreid: https://github.com/AcademySoftwareFoundation/OpenCue/blob/master/rqd/rqd/rqmachine.py#L842
- the line where the error is raised: https://github.com/AcademySoftwareFoundation/OpenCue/blob/master/rqd/rqd/rqmachine.py#L857
- __procs_by_physid_and_coreid being filled only if on a Linux platform: https://github.com/AcademySoftwareFoundation/OpenCue/blob/master/rqd/rqd/rqmachine.py#L613
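To illustrate the failure mode, here is a minimal, simplified sketch. It is not the actual RQD code: build_proc_map, reserve_cores, CoreReservationFailure and the cpu_layout tuples are hypothetical names made up for this example, and the real reserveHT does more than count cores. It only shows why a mapping that is filled on Linux alone makes every reservation fail with "0 < N" on other platforms.

```python
class CoreReservationFailure(Exception):
    """Hypothetical stand-in for RQD's CoreReservationFailureException."""


def build_proc_map(platform_name, cpu_layout):
    """Return {physid: {coreid: {procid, ...}}}.

    Assumption for this sketch: the mapping is only populated when the
    platform is Linux, mirroring the behaviour described in this issue.
    """
    proc_map = {}
    if platform_name == "Linux":
        for physid, coreid, procid in cpu_layout:
            proc_map.setdefault(physid, {}).setdefault(coreid, set()).add(procid)
    return proc_map


def reserve_cores(proc_map, requested):
    """Fail when the mapping exposes fewer physical cores than requested."""
    available = sum(len(cores) for cores in proc_map.values())
    if available < requested:
        raise CoreReservationFailure(
            "insufficient cores to reserve (%d < %d)" % (available, requested)
        )
    return requested


# One socket, two physical cores, two hardware threads each (made-up layout).
layout = [(0, 0, 0), (0, 0, 1), (0, 1, 2), (0, 1, 3)]

linux_map = build_proc_map("Linux", layout)
windows_map = build_proc_map("Windows", layout)

reserve_cores(linux_map, 2)  # succeeds: two physical cores are known
try:
    reserve_cores(windows_map, 2)
except CoreReservationFailure as exc:
    windows_error = str(exc)  # the mapping is empty, so 0 cores are available
```

In this sketch the Windows host has the same hardware but an empty mapping, so the reservation is rejected regardless of how many cores are requested.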
To Reproduce
- Submit a job so it is dispatched to a machine running on Windows.
- The job must request a non-fractional number of cores (the regular usage, nothing fancy here); otherwise reserveHT is skipped.
Expected behavior
The job should be picked up without raising a CoreReservationFailureException error.
Version Number
Spotted on 0.22.0, but judging by the current state of master, nothing seems to have changed since.
Additional context
Relates to #1171
A fix is already in progress: #1468