@Electronic-Waste
Member

I converted the issue kubeflow/katib#2340 into this blog post.

PTAL when you have time @kubeflow/wg-automl-leads @terrytangyuan

Member

@andreyvelich andreyvelich left a comment

Thank you for working on this @Electronic-Waste, and sorry for the late reply!

/assign @varodrig @hbelmiro @franciscojavierarceo @kubeflow/wg-training-leads @akgraner
Could you please help us with the review so we can merge this great blog post?

Contributor

@franciscojavierarceo franciscojavierarceo left a comment

This is awesome! Thank you @Electronic-Waste!! I left some small nits but otherwise lgtm.


@varodrig varodrig left a comment

Great post! Thank you so much for working on this. It seems you had a fantastic experience, and it's great that you are sharing your thoughts with the community.
I left a couple of suggestions.

Electronic-Waste and others added 10 commits February 27, 2025 09:58
Signed-off-by: Electronic-Waste <[email protected]>
@Electronic-Waste
Member Author

@franciscojavierarceo Thanks so much for your detailed reviews! I appreciate your patience and kindness.

I've addressed all of your comments. PTAL if you have time :)

@varodrig

/lgtm

Contributor

@franciscojavierarceo franciscojavierarceo left a comment

/lgtm

@google-oss-prow
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: franciscojavierarceo

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:
  • OWNERS [franciscojavierarceo]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@google-oss-prow google-oss-prow bot merged commit 858e917 into kubeflow:master Feb 27, 2025
7 checks passed
@Electronic-Waste Electronic-Waste deleted the doc/gsoc branch February 27, 2025 03:05

The current implementation of the Metrics Collector is pull-based. This raises design problems, such as deciding how often to scrape the metrics; performance issues, such as the overhead caused by too many sidecar containers; and restrictions on development environments, which must support sidecar containers and admission webhooks. In addition, data scientists must pay close attention to the format of the metrics printed in their training scripts, which is error-prone, and a wrongly formatted metric may be hard to recognize.
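To make the formatting concern concrete, here is a minimal sketch that contrasts the two styles. It is not Katib's implementation: the `name=value` log format, `scrape_metrics`, and `InMemoryReporter` are all hypothetical names invented for illustration, and the real collector's parsing rules may differ.

```python
import re
from typing import Dict, List

# Hypothetical "name=value" log format, assumed for illustration only.
METRIC_LINE = re.compile(
    r"^\s*([A-Za-z_][\w-]*)\s*=\s*([-+]?\d+(?:\.\d+)?(?:[eE][-+]?\d+)?)\s*$"
)


def scrape_metrics(log_lines: List[str]) -> Dict[str, float]:
    """Pull style: recover metrics by parsing text the training script printed."""
    found: Dict[str, float] = {}
    for line in log_lines:
        match = METRIC_LINE.match(line)
        if match:
            found[match.group(1)] = float(match.group(2))
    return found


class InMemoryReporter:
    """Push style (hypothetical stand-in): the script hands over values directly."""

    def __init__(self) -> None:
        self.received: Dict[str, float] = {}

    def report(self, metrics: Dict[str, float]) -> None:
        self.received.update(metrics)


# "loss: 0.27" uses a colon instead of "=", so the scraper silently skips it.
logs = ["epoch 1 done", "accuracy=0.91", "loss: 0.27"]
pulled = scrape_metrics(logs)

reporter = InMemoryReporter()
reporter.report({"accuracy": 0.91, "loss": 0.27})
pushed = reporter.received
```

In this sketch the pull-style scraper drops the `loss` value without raising any error, while the push-style call receives both metrics because nothing depends on how they were printed.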

## Solution
Member

@Electronic-Waste It would also be nice if you could update this section with the following:

  1. Briefly explain how the new push-based API works
  2. Link to the Kubeflow docs for more information
  3. Add your Notebook example: https://github.com/kubeflow/katib/blob/master/examples/v1beta1/sdk/mnist-with-push-metrics-collection.ipynb

Member Author

Sure. I'll add these.
