This issue tracks the work to add support for an OptimizationJob CRD focused on hyperparameter optimization (HPO) for TrainJobs, as discussed in https://github.com/kubeflow/sdk/tree/main/docs/proposals/46-hyperparameter-optimization#potential-api-for-optimizationjob-crd
It should include:
- CRD specifically for hyperparameter optimization of TrainJobs
- Integration with TrainJob API
- Support for model/dataset initialization shared across trials
- Support for push-based metrics collection via the SDK
- Integration with the SDK's OptimizerClient API
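
To make the scope above more concrete, here is a rough, self-contained Python sketch of how trials, push-based metric reporting, and best-trial selection could fit together. Every name in it (`Trial`, `OptimizationJobSketch`, `report_metric`, `best_trial`) is hypothetical and chosen only for illustration; it does not reflect the proposed CRD schema or the actual OptimizerClient API, which are defined in the proposal and the design document.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Trial:
    """Hypothetical stand-in for one TrainJob launched with a set of hyperparameters."""
    name: str
    parameters: dict[str, float]
    metrics: dict[str, float] = field(default_factory=dict)


@dataclass
class OptimizationJobSketch:
    """Hypothetical in-memory model of an HPO job; not the real CRD."""
    objective_metric: str
    direction: str  # assumed to be "minimize" or "maximize"
    trials: list[Trial] = field(default_factory=list)

    def report_metric(self, trial_name: str, metric: str, value: float) -> None:
        # Push-based collection: the trial reports its own metric
        # instead of a collector scraping logs.
        for trial in self.trials:
            if trial.name == trial_name:
                trial.metrics[metric] = value
                return
        raise KeyError(f"unknown trial: {trial_name}")

    def best_trial(self) -> Optional[Trial]:
        # Only trials that have reported the objective metric are considered.
        scored = [t for t in self.trials if self.objective_metric in t.metrics]
        if not scored:
            return None
        pick = min if self.direction == "minimize" else max
        return pick(scored, key=lambda t: t.metrics[self.objective_metric])


job = OptimizationJobSketch(
    objective_metric="loss",
    direction="minimize",
    trials=[
        Trial("trial-a", {"learning_rate": 0.1}),
        Trial("trial-b", {"learning_rate": 0.01}),
    ],
)
job.report_metric("trial-a", "loss", 0.42)
job.report_metric("trial-b", "loss", 0.31)
best = job.best_trial()
```

In this sketch the trial with the lowest reported `loss` is selected. In the real design, the controller and the SDK would own these responsibilities, and the model/dataset initialization would be shared across trials rather than repeated per trial.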
Design Document
https://docs.google.com/document/d/1Y8IJ-UdZ7VCEAlax_xEFbbqEi7EB6SfIX4D7ua-xn4M/edit
cc @andreyvelich @kramaranya