Releases: NVIDIA/KAI-Scheduler
v0.12.5
What's Changed
Added
- Added label and annotation propagation when using the skiptopownerPodgrouper plugin
Full Changelog: v0.12.4...v0.12.5
v0.12.4
What's Changed
Fixed
- Fixed GPU memory pods Fair Share and Queue Order calculations
Full Changelog: v0.12.3...v0.12.4
v0.9.11
What's Changed
Fixed
- Fixed GPU memory pods Fair Share and Queue Order calculations
Full Changelog: v0.9.10...v0.9.11
v0.6.16
What's Changed
Fixed
- Fixed a bug where the scheduler would not re-try updating podgroup status after failure
- Fixed a bug where GPU memory pods were not reclaimed or consolidated correctly
- Fixed GPU memory pods Fair Share and Queue Order calculations
Full Changelog: v0.6.15...v0.6.16
v0.4.17
What's Changed
Fixed
- Fixed a bug where the scheduler would not re-try updating podgroup status after failure
- Fixed a bug where GPU memory pods were not reclaimed or consolidated correctly
- Fixed GPU memory pods Fair Share and Queue Order calculations
Full Changelog: v0.4.16...v0.4.17
v0.13.0-rc0
What's Changed
Added
- Added the option to disable Prometheus service monitor creation #810 itsomri
- Fixed Prometheus instance deprecation - ensure single instance #779 itsomri
- Added clear error messages for jobs referencing missing or orphan queues, reporting via events and conditions #820 gshaibi
- Added rule selector for resource accounting Prometheus #818 itsomri
- Made accounting labels configurable #818 itsomri
- Added support for Grove hierarchical topology constraints in PodGroup subgroups
Fixed
- Fixed pod controller logging to use request namespace/name instead of empty pod object fields when pod is not found
- Fixed a bug where topology constraints with equal required and preferred levels would cause the preferred level not to be found
- Fixed GPU memory pods Fair Share and Queue Order calculations
- Interpret negative or zero half-life value as disabled #818 itsomri
Full Changelog: v0.12.0...v0.13.0-rc0