xpk should fail fast if a JobSet or PathwaysJob will result in an invalid value for the jobset.sigs.k8s.io/coordinator label #750

@GiuseppeTT

Description

JobSet's admission webhook now rejects requests to create JobSet objects that would result in an invalid value for the jobset.sigs.k8s.io/coordinator label. In most cases, this is equivalent to limiting the length of the JobSet name when the coordinator feature is enabled. See kubernetes-sigs/jobset#1056 and kubernetes-sigs/jobset#1079.

Following this fail-fast principle, xpk should also preemptively fail commands that would lead to this invalid state.

For xpk commands that create a JobSet object directly, the error from the JobSet admission webhook can likely just be bubbled up to the user.

The core problem arises when xpk creates a PathwaysJob object. Since the PathwaysJob controller does not have an admission webhook, the creation request succeeds. However, the PathwaysJob controller will then try to create a child JobSet at runtime and fail repeatedly, because the JobSet webhook blocks the invalid request. This results in a confusing, difficult-to-debug failure loop for the user, because the initial xpk command appeared to succeed.
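A minimal sketch of what such a preemptive check could look like on the xpk side. It is an illustration, not xpk's or JobSet's actual code: the function names are hypothetical, and the assumed label format (`<jobset-name>-<replicated-job>-<job-index>-<pod-index>.<subdomain>`, with the subdomain defaulting to the JobSet name) should be confirmed against the JobSet PRs linked above. The 63-character limit and the allowed character set are the standard Kubernetes rules for label values. For a PathwaysJob, the caller would need to pass the name of the child JobSet that the controller is expected to create, which is also an assumption here.

```python
import re
from typing import List, Optional

# Standard Kubernetes constraints on label values.
MAX_LABEL_VALUE_LENGTH = 63
_LABEL_VALUE_RE = re.compile(r"^(([A-Za-z0-9][-A-Za-z0-9_.]*)?[A-Za-z0-9])?$")


def coordinator_label_value(
    jobset_name: str,
    replicated_job: str,
    job_index: int = 0,
    pod_index: int = 0,
    subdomain: Optional[str] = None,
) -> str:
  """Builds the ASSUMED coordinator label value (hypothetical helper)."""
  # Assumption: the subdomain defaults to the JobSet name when unset.
  subdomain = subdomain or jobset_name
  return f"{jobset_name}-{replicated_job}-{job_index}-{pod_index}.{subdomain}"


def validate_coordinator_label(
    jobset_name: str,
    replicated_job: str,
    job_index: int = 0,
    pod_index: int = 0,
    subdomain: Optional[str] = None,
) -> List[str]:
  """Returns a list of problems; an empty list means the value looks valid."""
  value = coordinator_label_value(
      jobset_name, replicated_job, job_index, pod_index, subdomain
  )
  errors = []
  if len(value) > MAX_LABEL_VALUE_LENGTH:
    errors.append(
        f"coordinator label value {value!r} is {len(value)} characters long;"
        f" the limit is {MAX_LABEL_VALUE_LENGTH}. Use a shorter workload name."
    )
  if not _LABEL_VALUE_RE.match(value):
    errors.append(
        f"coordinator label value {value!r} contains characters that are"
        " not allowed in a Kubernetes label value."
    )
  return errors
```

xpk could run a check like this before submitting the object and exit with a non-zero status if the returned list is non-empty, so the user sees the problem immediately rather than through a failing controller loop.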
