Commit 0972e98
feat(validator): add Kubeflow Trainer support to robust-controller check
The robust-controller conformance check previously only validated the
Dynamo operator, causing it to skip on all training clusters. This adds
Kubeflow Trainer as an alternative target, selected based on recipe
component presence:
- dynamo-platform in recipe → validate Dynamo operator
- kubeflow-trainer in recipe → validate Kubeflow Trainer
- neither → skip
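The selection rule above can be sketched in Go as follows. This is a minimal illustration, not the commit's actual code: the `target` type, its constants, and `selectTarget` are hypothetical names, and giving Dynamo precedence when both components are present is an assumption the commit message does not state.

```go
package main

import "fmt"

// target names which operator the robust-controller check validates.
// The type and constants are hypothetical, introduced for this sketch.
type target string

const (
	targetDynamo   target = "dynamo"
	targetKubeflow target = "kubeflow-trainer"
	targetSkip     target = "skip"
)

// selectTarget picks the validation target from the recipe's component names.
// The component names come from the commit message. Checking dynamo-platform
// first when both are present is an assumption.
func selectTarget(components []string) target {
	present := make(map[string]bool, len(components))
	for _, c := range components {
		present[c] = true
	}
	switch {
	case present["dynamo-platform"]:
		return targetDynamo
	case present["kubeflow-trainer"]:
		return targetKubeflow
	default:
		return targetSkip
	}
}

func main() {
	// "some-other-component" is a placeholder, not a real recipe component.
	fmt.Println(selectTarget([]string{"some-other-component", "kubeflow-trainer"}))
	fmt.Println(selectTarget([]string{"some-other-component"}))
}
```

Keeping the decision in a pure function like this makes the skip path easy to unit test without a cluster.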
Kubeflow Trainer validation checks:
1. Controller deployment running (kubeflow-trainer-controller-manager)
2. Validating webhook operational with reachable endpoint
3. TrainJob CRD exists (trainjobs.trainer.kubeflow.org)
4. Webhook rejects invalid TrainJob (behavioral test)
Refactored the original Dynamo validation into checkRobustDynamo() and
renamed validateWebhookRejects to validateDynamoWebhookRejects for
clarity.
1 parent f1c915b commit 0972e98
3 files changed (+447, -40):
- recipes/validators
- validators/conformance