DRAFT: munging gpu_sharing & gpu_flexibility#28071

Draft
tehut wants to merge 7 commits into main from munge


tehut commented Jun 1, 2026

Contributor

do not merge

Contributor Checklist

  • Changelog Entry If this PR changes user-facing behavior, please generate and add a
    changelog entry using the make cl command.
  • Testing Please add tests to cover any new functionality or to demonstrate bug fixes and
    ensure regressions will be caught.
  • Documentation If the change impacts user-facing functionality such as the CLI, API, UI,
    and job configuration, please update the Nomad product documentation, which is stored in the
    web-unified-docs repo. Refer to the web-unified-docs contributor guide for docs guidelines.
    Please also consider whether the change requires notes within the upgrade guide.
    If you would like help with the docs, tag the nomad-docs team in this PR.

Reviewer Checklist

  • Backport Labels Please add the correct backport labels as described by the internal
    backporting document.
  • Commit Type Ensure the correct merge method is selected which should be "squash and merge"
    in the majority of situations. The main exceptions are long-lived feature branches or merges where
    history should be preserved.
  • Enterprise PRs If this is an enterprise only PR, please add any required changelog entry
    within the public repository.
  • If a change needs to be reverted, we will roll out an update to the code within 7 days.

Changes to Security Controls

Are there any changes to security controls (access controls, encryption, logging) in this pull request? If so, explain.

  }
  // Only set Count if not using FirstAvailable
- if d.Count != nil {
+ if d.Count != nil && len(d.FirstAvailable) == 0 {

Contributor Author


port to #27391
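
For readers following the one-line change above, here is a minimal, self-contained sketch of what the extra condition does: `Count` is copied only when the request has no `FirstAvailable` alternatives. Only the field names `Count` and `FirstAvailable` and the guard itself come from the diff; the `apiDevice` and `structsDevice` types, the `convertDevice` function, the `Name` field, and the string element type of `FirstAvailable` are hypothetical stand-ins, not the PR's actual definitions.

```go
package main

import "fmt"

// apiDevice is a hypothetical stand-in for the API-side device request.
// Only the Count and FirstAvailable field names are taken from the diff.
type apiDevice struct {
	Name           string
	Count          *uint64
	FirstAvailable []string // element type is an assumption
}

// structsDevice is a hypothetical stand-in for the converted internal form.
type structsDevice struct {
	Name           string
	Count          uint64
	FirstAvailable []string
}

// convertDevice copies the request and applies the guard from the diff.
func convertDevice(d apiDevice) structsDevice {
	out := structsDevice{Name: d.Name, FirstAvailable: d.FirstAvailable}
	// Only set Count if not using FirstAvailable
	if d.Count != nil && len(d.FirstAvailable) == 0 {
		out.Count = *d.Count
	}
	return out
}

func main() {
	two := uint64(2)

	// No alternatives listed: Count is carried over.
	plain := convertDevice(apiDevice{Name: "gpu", Count: &two})

	// Alternatives listed: Count is left at its zero value.
	alt := convertDevice(apiDevice{
		Name:           "gpu",
		Count:          &two,
		FirstAvailable: []string{"gpu-a", "gpu-b"},
	})

	fmt.Println(plain.Count, alt.Count) // prints "2 0"
}
```

With the old condition (`d.Count != nil` alone), the second call would also have produced a count of 2, so a top-level count would sit alongside the alternatives.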

tehut and others added 7 commits June 12, 2026 16:44
add will_share to task device block

fixes to get share working

make sure WillShare != nil

add helpers and test to exercise changes to createOffers

add test helpers to generate shared devices and allocations

add TestDeviceAccounter_AllocateAndReserveSharedDevices

add willShare functionality to deviceAccounter.AddReserved

fix bug that prevented reservation & add cases to TestDeviceChecker

add cases to rank tests

consolidate WillShare maps to AllocatedDeviceResource.WillShare

refactor Task.WillShare to Task.ShareDevices

fix comments

tidying

make plugin & structs DeviceSharing struct match proto generated struct

simplify AddAllocs_Collision test and add nil check before using String() function in AddAllocs

fix unkeyed literal

Add dai.GetSharedByID helper and some refactor cleanup

TEMP: plugin helper

tidy DeviceSharing in structs.go to more closely mirror device/device.go

bring api.go DeviceSharing struct into line with plugins/device.go

replace GpuId with SharedDeviceId

update GPU_ID references

replace plugin device.DeviceSharing with Shared proto enum and cascade through api & structs fields

Pass Shared by value now that it's a typed string

regression fix: not sure why i made the happy path unreachable

2 participants