Pull requests: kubeflow/trainer
Pull requests list
fix(ci): Generate valid release version for Python package
size/XS
#3333
opened Mar 13, 2026 by
andreyvelich
fix(examples): Verify TrainJob Completion
lgtm
size/XXL
#3331
opened Mar 13, 2026 by
andreyvelich
feat: customize R generation to include GPUs
size/M
#3325
opened Mar 13, 2026 by
vsoch
1 task
feat(api): Add terminationGracePeriodSeconds to PodSpecPatch in TrainJob
size/L
#3324
opened Mar 13, 2026 by
krishdef7
1 task
fix(ci): re-enable XGBoost E2E test
kind/cleanup
size/XS
#3323
opened Mar 13, 2026 by
Krishna-kg732
revert: back to a10-1 for gpu e2e
size/XS
#3322
opened Mar 13, 2026 by
jaiakash
1 task done
Set RuntimePatch.Time field automatically during admission
size/XL
#3319
opened Mar 12, 2026 by
astefanutti
1 task done
feat(operator): support Pod recreation after scheduler preemption
size/L
#3316
opened Mar 12, 2026 by
dafu-wu
1 task
feat(runtime): support Image and Command in PodSet Container
size/S
#3312
opened Mar 11, 2026 by
Raakshass
5 tasks done
KEP-3310: Shared initializer support for train jobs
size/XL
#3311
opened Mar 11, 2026 by
akshaychitneni
1 task
feat(docs): add dedicated documentation website
do-not-merge/work-in-progress
size/XXL
#3308
opened Mar 11, 2026 by
Sridhar1030
Draft
4 of 9 tasks
fix(initializer): add missing glob wildcard to .pt and .pth ignore p…
size/S
#3307
opened Mar 11, 2026 by
ghazariann
1 task
fix(runtimes): add validation for LoRA multi-node and immutable trainer args
size/M
#3302
opened Mar 10, 2026 by
krishdef7
1 task done
fix: remove unnecessary setcap CAP_NET_BIND_SERVICE from MPI runtime docker file
size/S
#3286
opened Mar 9, 2026 by
kapil27
feat: support multiple replicas for non-trainer replicatedJobs
size/L
#3284
opened Mar 6, 2026 by
krishdef7
fix(operator): surface reconciliation errors in TrainJob status
size/L
#3282
opened Mar 5, 2026 by
krishdef7
feat: merge gpu-cluster setup into single e2e script via flag
ok-to-test
size/L
#3281
opened Mar 4, 2026 by
aniket2405
1 task
docs: add KEP-2839 Dynamic LLM Trainer Framework proposal
do-not-merge/work-in-progress
size/XL
#3263
opened Feb 27, 2026 by
NarayanaSabari
Draft
chore: automate release process with GitHub Actions
size/XL
#3261
opened Feb 27, 2026 by
Krishna-kg732
feat: add e2e tests to Helm CI workflow
do-not-merge/hold
size/L
#3253
opened Feb 24, 2026 by
Goku2099
feat(docs): Kubeflow Trainer ROADMAP 2026
do-not-merge/hold
size/M
#3242
opened Feb 24, 2026 by
andreyvelich
feat: add framework-level validation for conflicting env vars across …
size/L
#3237
opened Feb 23, 2026 by
krishdef7