Releases: NVIDIA/KAI-Scheduler

v0.12.5

13 Jan 13:58
c83ecf3

What's Changed

Added

  • Added label and annotation propagation when using the skiptopowner Podgrouper plugin

Full Changelog: v0.12.4...v0.12.5

v0.12.4

07 Jan 09:16
a1d1b42

What's Changed

Fixed

  • Fixed GPU memory pods Fair Share and Queue Order calculations

Full Changelog: v0.12.3...v0.12.4

v0.9.11

07 Jan 09:24
24034ff

What's Changed

Fixed

  • Fixed GPU memory pods Fair Share and Queue Order calculations

Full Changelog: v0.9.10...v0.9.11

v0.6.16

07 Jan 09:26
3318f00

What's Changed

Fixed

  • Fixed a bug where the scheduler would not re-try updating podgroup status after failure
  • Fixed a bug where GPU memory pods were not reclaimed or consolidated correctly
  • Fixed GPU memory pods Fair Share and Queue Order calculations

Full Changelog: v0.6.15...v0.6.16

v0.4.17

07 Jan 09:27
05c1bc9

What's Changed

Fixed

  • Fixed a bug where the scheduler would not re-try updating podgroup status after failure
  • Fixed a bug where GPU memory pods were not reclaimed or consolidated correctly
  • Fixed GPU memory pods Fair Share and Queue Order calculations

Full Changelog: v0.4.16...v0.4.17

v0.13.0-rc0

05 Jan 11:19
a3aa12a

Pre-release

What's Changed

Added

  • Added the option to disable prometheus service monitor creation #810 itsomri
  • Fixed prometheus instance deprecation - ensure single instance #779 itsomri
  • Added clear error messages for jobs referencing missing or orphan queues, reporting via events and conditions #820 gshaibi
  • Added rule selector for resource accounting prometheus #818 itsomri
  • Made accounting labels configurable #818 itsomri
  • Added support for Grove hierarchical topology constraints in PodGroup subgroups

Fixed

  • Fixed pod controller logging to use request namespace/name instead of empty pod object fields when pod is not found
  • Fixed a bug where topology constraints with equal required and preferred levels would cause the preferred level not to be found
  • Fixed GPU memory pods Fair Share and Queue Order calculations
  • Interpret negative or zero half-life value as disabled #818 itsomri
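The half-life change above can be illustrated with a minimal sketch. This is not the scheduler's actual code: the function name `decay_weight` and its parameters are hypothetical, and it assumes the half-life is used for exponential decay of historical usage samples, with "disabled" meaning that samples are not decayed at all.

```python
def decay_weight(age_seconds: float, half_life_seconds: float) -> float:
    """Return the weight given to a usage sample that is age_seconds old.

    Hypothetical helper for illustration only.
    """
    if half_life_seconds <= 0:
        # Assumption: a zero or negative half-life disables decay,
        # so every sample keeps its full weight.
        return 1.0
    # Standard exponential decay: the weight halves every half_life_seconds.
    return 0.5 ** (age_seconds / half_life_seconds)
```

Handling non-positive values up front also avoids a division by zero when the half-life is configured as 0.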

Full Changelog: v0.12.0...v0.13.0-rc0

v0.12.3

05 Jan 12:19
dbd1409

What's Changed

Fixed

  • Interpret negative or zero half-life value as disabled #832 itsomri

Full Changelog: v0.12.2...v0.12.3

v0.12.2

01 Jan 08:48
8ba630c

What's Changed

Added

  • Fixed prometheus instance deprecation - ensure single instance #818 itsomri
  • Added rule selector for resource accounting prometheus #825 itsomri
  • Made accounting labels configurable #825 itsomri

Full Changelog: v0.12.1...v0.12.2

v0.9.10

01 Jan 07:06
2432ff2

What's Changed

Fixed

  • Fixed a bug where the scheduler would not consider topology constraints when calculating the scheduling constraints signature #761 gshaibi
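To show why the signature needs to cover topology constraints, here is a rough sketch rather than the scheduler's implementation. The function `constraints_signature` and its field names are hypothetical, and it assumes the signature is a hash over a job's scheduling constraints, used to treat jobs with equal signatures as equivalent.

```python
import hashlib
import json


def constraints_signature(node_selector, tolerations, topology=None):
    """Hash a job's scheduling constraints into a stable hex digest.

    Hypothetical helper for illustration only.
    """
    payload = {
        "nodeSelector": node_selector,
        "tolerations": sorted(tolerations),
        # Leaving this field out would give the same signature to jobs
        # that differ only in their topology constraints.
        "topology": topology or {},
    }
    # sort_keys makes the encoding independent of dict insertion order.
    encoded = json.dumps(payload, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()
```

With the topology field included, two jobs that share a node selector and tolerations but differ in topology constraints get different signatures.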

Full Changelog: v0.9.9...v0.9.10

v0.12.1

25 Dec 15:37
788b219

What's Changed

Added

  • Added the option to disable prometheus service monitor creation #810 itsomri

Full Changelog: v0.12.0...v0.12.1