Commit d1848ce
feat(api): Add terminationGracePeriodSeconds to PodSpecPatch in TrainJob
Adds terminationGracePeriodSeconds field to PodSpecPatch so users can
configure pod termination grace period per TrainJob via RuntimePatches.
This is needed for distributed training with PyTorch Elastic (torchrun)
where large models (70B+ parameters) require more than the default 30s
to complete JIT checkpointing before SIGKILL on node drain or TrainJob
pause.
No changes to merge logic in trainingruntime.go are required since the
existing StrategicMergePatch applied at batchv1.JobTemplateSpec level
already handles this field automatically.
Closes #3285
Signed-off-by: krishdef7 <gargkrish06@gmail.com>
1 parent 941f4a2 commit d1848ce
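The override behaviour described in the commit message can be illustrated with a minimal Go sketch. It is not the code from this commit: `PodSpecPatch` below is a simplified, hypothetical stand-in for the real type, `effectiveGracePeriod` is a made-up helper, and the actual merge is performed by the existing StrategicMergePatch rather than by hand-written logic like this. The sketch only shows the intended outcome, assuming the field is an optional `*int64` as it is in the Kubernetes `PodSpec`.

```go
package main

import "fmt"

// PodSpecPatch is a simplified, hypothetical stand-in for the API type;
// only the field discussed in this commit is shown.
type PodSpecPatch struct {
	// nil means "not set": the value from the runtime template is kept.
	TerminationGracePeriodSeconds *int64 `json:"terminationGracePeriodSeconds,omitempty"`
}

// defaultGracePeriodSeconds is the Kubernetes default mentioned in the commit message.
const defaultGracePeriodSeconds int64 = 30

// effectiveGracePeriod is a hypothetical helper that mimics the outcome of the
// merge: the patch wins if set, then the runtime template, then the default.
func effectiveGracePeriod(runtimeValue *int64, patch PodSpecPatch) int64 {
	if patch.TerminationGracePeriodSeconds != nil {
		return *patch.TerminationGracePeriodSeconds
	}
	if runtimeValue != nil {
		return *runtimeValue
	}
	return defaultGracePeriodSeconds
}

func main() {
	override := int64(600) // illustrative value for a long JIT checkpoint

	// No override anywhere: the 30s default applies.
	fmt.Println(effectiveGracePeriod(nil, PodSpecPatch{})) // prints 30

	// Per-TrainJob override supplied through the patch.
	fmt.Println(effectiveGracePeriod(nil, PodSpecPatch{TerminationGracePeriodSeconds: &override})) // prints 600
}
```

Using a pointer rather than a plain integer lets "unset" be distinguished from an explicit `0`, which Kubernetes treats as immediate termination.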
File tree
11 files changed (+198, -2)
- api
  - openapi-spec
  - python_api/kubeflow_trainer_api/models
- charts/kubeflow-trainer/crds
- manifests/base/crds
- pkg
  - apis/trainer/v1alpha1
  - client/applyconfiguration/trainer/v1alpha1
  - util/testing
- test/integration
  - controller
  - webhooks
Some generated files are not rendered by default.
Lines changed: 4 additions & 2 deletions
Lines changed: 8 additions & 0 deletions
[Diff content not captured: 8 lines added at new lines 3125-3132]
[Diff content not captured: 8 lines added at new lines 3125-3132]
[Diff content not captured: 7 lines added at new lines 435-441]
Lines changed: 12 additions & 0 deletions
[Diff content not captured: 9 lines added at new lines 341-349]
[Diff content not captured: 73 lines added at new lines 304-376]