Description
What you would like to be added?
Enhance the PodGrouper plugin matching logic to fall back along the owner chain. Today, after resolving the full owner reference chain for a pod (for example, Pod → Job → JobSet → Kubeflow trainer), PodGrouper selects a plugin based solely on the topmost owner type and falls back to the default plugin immediately when that type has no match. Instead, PodGrouper should iterate over the owner chain, starting from the topmost owner and moving down toward the pod, and try to match a plugin for each owner type in order, using the existing plugin hub implementation (for example, the Job plugin registration in hub.go). The default plugin should be used only if no plugin is found for any owner in the chain. This way, existing workload-specific plugins continue to drive PodGroup creation even when new intermediate owner types (such as JobSet) are introduced without a dedicated plugin.
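To make the proposed lookup order concrete, here is a minimal, self-contained sketch. It is not the actual PodGrouper code: `GroupVersionKind`, `Grouper`, `Hub`, `SelectPlugin`, and `newExampleHub` are hypothetical stand-ins for whatever the real hub in hub.go exposes, and the group/version strings for the topmost owner are illustrative assumptions.

```go
package main

import "fmt"

// GroupVersionKind is a simplified stand-in for the Kubernetes GVK type.
type GroupVersionKind struct {
	Group, Version, Kind string
}

// Grouper is a hypothetical plugin interface; the real one may differ.
type Grouper interface {
	Name() string
}

// namedGrouper is a trivial Grouper used only for this illustration.
type namedGrouper string

func (g namedGrouper) Name() string { return string(g) }

// Hub is a hypothetical plugin registry keyed by owner type.
type Hub struct {
	plugins       map[GroupVersionKind]Grouper
	defaultPlugin Grouper
}

// SelectPlugin walks the owner chain from the topmost owner down toward
// the pod and returns the first registered plugin. The default plugin is
// returned only when no owner in the chain has a match.
func (h *Hub) SelectPlugin(chainTopDown []GroupVersionKind) Grouper {
	for _, gvk := range chainTopDown {
		if plugin, ok := h.plugins[gvk]; ok {
			return plugin
		}
	}
	return h.defaultPlugin
}

// newExampleHub registers only a batch/v1 Job plugin plus a default.
func newExampleHub() *Hub {
	return &Hub{
		plugins: map[GroupVersionKind]Grouper{
			{Group: "batch", Version: "v1", Kind: "Job"}: namedGrouper("job"),
		},
		defaultPlugin: namedGrouper("default"),
	}
}

func main() {
	hub := newExampleHub()

	// Topmost owner first; the first two entries have no dedicated plugin.
	// The TrainJob group/version here is an assumption for illustration.
	chain := []GroupVersionKind{
		{Group: "trainer.kubeflow.org", Version: "v1alpha1", Kind: "TrainJob"},
		{Group: "jobset.x-k8s.io", Version: "v1alpha2", Kind: "JobSet"},
		{Group: "batch", Version: "v1", Kind: "Job"},
	}

	fmt.Println(hub.SelectPlugin(chain).Name()) // prints "job"
}
```

With the current topmost-only behavior, the same chain would resolve to the default plugin, because neither of the two upper owners has a registered plugin. Iterating top-down rather than bottom-up keeps the most specific registered owner in control when several owners in the chain have plugins.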
Useful references:
PodGrouper main flow and owner resolution: pkg/podgrouper/podgrouper/podgrouper.go
Plugin hub: pkg/podgrouper/podgrouper/hub/hub.go
PodGrouper developer docs: docs/developer/pod-grouper.md
Why is this needed?
With the current behavior, extending RBAC and ownership chains to include new types (e.g., JobSet) can unintentionally degrade grouping semantics: PodGrouper climbs to the new topmost owner, fails to match a plugin, and immediately falls back to the default per-pod grouping, even though a more suitable plugin is already available for an intermediate owner (such as batch/v1 Job). An owner-chain-aware fallback makes PodGrouper more robust to incremental integrations and prevents regressions in gang scheduling behavior when new workload types are added. It also keeps existing plugins in use wherever they accurately model the desired scheduling semantics, until dedicated plugins for the new owners are implemented.