Skip to content

Set RuntimePatch.Time field automatically during admission#3319

Open
astefanutti wants to merge 1 commit intokubeflow:masterfrom
astefanutti:pr-time-webhook
Open

Set RuntimePatch.Time field automatically during admission#3319
astefanutti wants to merge 1 commit intokubeflow:masterfrom
astefanutti:pr-time-webhook

Conversation

@astefanutti
Copy link
Contributor

What this PR does / why we need it:

This PR adds a mutating webhook so the time fields in runtime patches are automatically set.

Checklist:

  • Docs included if any changes are user facing

Copilot AI review requested due to automatic review settings March 12, 2026 18:34
@google-oss-prow
Copy link

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please ask for approval from astefanutti. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@github-actions
Copy link

Pull Request Test Coverage Report for Build 23017880989

Warning: This coverage report may be inaccurate.

This pull request's base commit is no longer the HEAD commit of its target branch. This means it includes changes from outside the original pull request, including, potentially, unrelated coverage changes.

Details

  • 39 of 55 (70.91%) changed or added relevant lines in 4 files are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage increased (+0.2%) to 59.569%

Changes Missing Coverage Covered Lines Changed/Added Lines %
pkg/runtime/core/trainingruntime.go 0 3 0.0%
pkg/runtime/framework/core/framework.go 3 9 33.33%
pkg/webhooks/trainjob_webhook.go 14 21 66.67%
Totals Coverage Status
Change from base Build 23014250484: 0.2%
Covered Lines: 1771
Relevant Lines: 2973

💛 - Coveralls

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds a mutating admission webhook path for TrainJob and wires runtime-level defaulting so spec.runtimePatches[*].time is automatically set on create and selectively updated on patch changes.

Changes:

  • Introduces a TrainJobDefaulter webhook that calls into the selected runtime to apply defaults on create/update.
  • Extends the runtime/framework plugin system with CustomDefaulterPlugin and implements timestamp defaulting in the PlainML plugin.
  • Adds integration/e2e/unit tests to validate RuntimePatch.Time defaulting and update behavior.

Reviewed changes

Copilot reviewed 12 out of 12 changed files in this pull request and generated 4 comments.

Show a summary per file
File Description
test/integration/webhooks/trainjob_test.go Adds an integration test asserting RuntimePatch.Time is set on create.
test/e2e/e2e_test.go Extends e2e coverage to check Time is set and selectively updated on patch change.
pkg/webhooks/trainjob_webhook_test.go Adds unit tests for defaulter behavior across create/update scenarios.
pkg/webhooks/trainjob_webhook.go Adds a mutating defaulter webhook and hooks it into webhook setup.
pkg/runtime/interface.go Extends runtime.Runtime with DefaultObjects for admission-time defaulting.
pkg/runtime/framework/plugins/plainml/plainml_test.go Adds unit tests for patch timestamp defaulting logic in PlainML plugin.
pkg/runtime/framework/plugins/plainml/plainml.go Implements runtime patch timestamp defaulting and change-detection behavior.
pkg/runtime/framework/interface.go Introduces CustomDefaulterPlugin interface.
pkg/runtime/framework/core/framework_test.go Updates framework tests to account for custom defaulter plugin collection.
pkg/runtime/framework/core/framework.go Collects/runs CustomDefaulterPlugins via RunCustomDefaulterPlugins.
pkg/runtime/core/trainingruntime.go Implements DefaultObjects by invoking framework custom defaulters.
manifests/base/webhook/manifests.yaml Adds a MutatingWebhookConfiguration for TrainJob defaulting.

You can also share your feedback on Copilot code review. Take the survey.

Comment on lines +49 to +54
now := metav1.Now()

if oldObj == nil {
for i := range newObj.Spec.RuntimePatches {
newObj.Spec.RuntimePatches[i].Time = &now
}
Copy link

Copilot AI Mar 12, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Default() assigns &now to every RuntimePatch, so all entries share the same *metav1.Time pointer; prefer creating a per-patch copy (or calling metav1.Now() per assignment) to avoid surprising aliasing if any code mutates the Time value in place later.

Copilot uses AI. Check for mistakes.
oldCmp, newCmp := old, *patch
oldCmp.Time, newCmp.Time = nil, nil
if equality.Semantic.DeepEqual(oldCmp, newCmp) {
patch.Time = old.Time
Copy link

Copilot AI Mar 12, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On UPDATE, if a patch is unchanged and old.Time is nil (e.g., TrainJobs created before this webhook existed), the code preserves nil and leaves Time unset; if the intent is that RuntimePatch.Time is always populated after admission, consider defaulting to "now" when old.Time is nil even for unchanged patches.

Suggested change
patch.Time = old.Time
if old.Time != nil {
patch.Time = old.Time
} else {
patch.Time = &now
}

Copilot uses AI. Check for mistakes.
Comment on lines +452 to +455
g.Expect(gotTrainJob.Spec.RuntimePatches[0].Time.Equal(unchangedTime)).To(gomega.BeTrue())
// Changed patch: Time must be updated (different from original).
g.Expect(gotTrainJob.Spec.RuntimePatches[1].Time).ShouldNot(gomega.BeNil())
g.Expect(gotTrainJob.Spec.RuntimePatches[1].Time.Equal(changedTime)).To(gomega.BeFalse())
Copy link

Copilot AI Mar 12, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The e2e assertion that the changed patch's Time is strictly different from the original can be flaky on platforms with coarse time resolution (create+update may occur in the same tick); consider ensuring the update happens after at least one clock tick (or asserting non-decreasing time plus a separate signal that defaulting ran).

Copilot uses AI. Check for mistakes.
Comment on lines +3 to +28
kind: MutatingWebhookConfiguration
metadata:
name: mutating-webhook-configuration
webhooks:
- admissionReviewVersions:
- v1
clientConfig:
service:
name: webhook-service
namespace: system
path: /mutate-trainer-kubeflow-org-v1alpha1-trainjob
failurePolicy: Fail
name: defaulter.trainjob.trainer.kubeflow.org
rules:
- apiGroups:
- trainer.kubeflow.org
apiVersions:
- v1alpha1
operations:
- CREATE
- UPDATE
resources:
- trainjobs
sideEffects: None
---
apiVersion: admissionregistration.k8s.io/v1
Copy link

Copilot AI Mar 12, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The MutatingWebhookConfiguration uses the placeholder service name/namespace (webhook-service/system), but the kustomize patching in this repo currently targets only ValidatingWebhookConfiguration, so this mutating webhook will likely point to a non-existent service in rendered manifests and never run.

Suggested change
kind: MutatingWebhookConfiguration
metadata:
name: mutating-webhook-configuration
webhooks:
- admissionReviewVersions:
- v1
clientConfig:
service:
name: webhook-service
namespace: system
path: /mutate-trainer-kubeflow-org-v1alpha1-trainjob
failurePolicy: Fail
name: defaulter.trainjob.trainer.kubeflow.org
rules:
- apiGroups:
- trainer.kubeflow.org
apiVersions:
- v1alpha1
operations:
- CREATE
- UPDATE
resources:
- trainjobs
sideEffects: None
---
apiVersion: admissionregistration.k8s.io/v1

Copilot uses AI. Check for mistakes.
Copy link
Member

@andreyvelich andreyvelich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @astefanutti!

@@ -1,5 +1,31 @@
---
apiVersion: admissionregistration.k8s.io/v1
kind: MutatingWebhookConfiguration
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@@ -1,5 +1,31 @@
---
apiVersion: admissionregistration.k8s.io/v1
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you also update Helm chart please?

if !ok {
return nil
}
return rt.DefaultObjects(ctx, oldObj, newObj)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmmm, do we really need to default manager as part of PlainML policy?
I was thinking that we always set it to the Current Time if that is not set.

I agree that in the future we can extend the Runtime Framework to support customDefaulterPlugins when plugins should inject default values, but that is not needed here.

It is like similar to how we validate Deprecated Runtimes directly in the the webhook: https://github.com/astefanutti/trainer/blob/a44a7c98b20a7b0c5fd89fc937340e291731b9fa/pkg/webhooks/clustertrainingruntime_webhook.go#L49

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree. I initially started by adding the logic in the webhook. The plugin interfaces are not a perfect fit for TrainJob admission. I'll revert back to it.


var _ admission.Defaulter[*trainer.TrainJob] = (*TrainJobDefaulter)(nil)

func (d *TrainJobDefaulter) Default(ctx context.Context, newObj *trainer.TrainJob) error {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe just this, as we discussed here: kubernetes-sigs/jobset#1159 (comment)

Suggested change
func (d *TrainJobDefaulter) Default(ctx context.Context, newObj *trainer.TrainJob) error {
func (d *TrainJobDefaulter) Default(ctx context.Context, trainJob *trainer.TrainJob) error {

oldTime := metav1.NewTime(time.Date(2025, 1, 1, 0, 0, 0, 0, time.UTC))

cases := map[string]struct {
operation admissionv1.Operation
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you really need this?
You can just check if oldObj == nil, set operation == Update

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants