You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
The --no-detect-resources flag of the hq worker start command has been removed.
You can now configure automatically detected resources in a granular way using the new --detect-resources flag (see below). --no-detect-resources corresponds to --detect-resources=none.
New features
New policy tight (and tight!) that is the original implementation of compact.
It selects minimal number of resources groups and then tries to get maximum resources from
a biggest group and then maximum resources from the second biggest group, etc.
The policy compact now behaves as is described in the section "Changes".
Resource policy compact! is now allowed to take fractional resource request.
New command hq alloc cat <alloc-id> <stdout/stderr>, which can be used
to debug the output of allocations submitted by the automatic allocator.
New command hq server wait that repeatedly tries to connect to a server with a configurable timeout.
This is useful for deployment scripts that need to wait for server availability.
New hq alloc add parameter called --wrap-worker-cmd. It can be used to start
workers on allocated nodes using some wrapping mechanism (e.g. Podman).
New flag --detect-resources for hq worker start. It can be used to configure which worker resources
will be automatically detected. You can e.g. say --detect-resources=cpus,gpus/nvidia.
See documentation for more information.
The scheduler has better compacting behavior when there are small number of tasks
and workers appearing/disappering
Autoallocator keeps log file when probing allocation fails.
Unstable: Resource "coupling".
You may specify that some resources are coupled, e.g. cpus and gpus.
That means that cpus are gpus are organized in numa nodes, and allocation strategy
will respect that, i.e., it tries to find cpus and gpus from the same numa nodes.
Note: The current implementation does not detect coupling automatically,
you have to specify it manually.
Changes
Allocation policy compact was updated.
It still tries to find the minimal number of the resource groups,
but when they are found, resources are evenly taken from the selected groups.
In rare cases when you need original behavior, use new policy tight.
It is not a breaking change, because the compact previously did not specified how
exactly will be resources taken from groups.
Worker process terminated because of idle timeout now returns zero exit code.
Fixes
Fixed the issue of possible ignoring idle timeout when time request is used.
Fixes broken streaming when job file is used.
Fixed missing fields in export of journal into JSON
Artifact summary:
hq-v0.24.0-*: Main HyperQueue build containing the hq binary. Download this archive to
use HyperQueue from the command line.
hyperqueue-0.24.0-*: Wheel containing the hyperqueue package with HyperQueue Python
bindings.