Small Cellpose improvements #44
base: main
Conversation
bioengine_apps/cellpose_finetuning/training_example_runs/training_metrics_d679cca1.tsv
Outdated
bioengine_apps/cellpose_finetuning/scripts/redeploy_cellpose.py
Outdated
tn += int(torch.sum(~pr & ~gt).item())
_ = batch_metrics  # keep for readability; not used further
except Exception as e:
    # Metrics are best-effort; never fail training because of them.
Under what conditions would it fail? I would actually remove the except and let the error propagate, so we have a chance to see the error and fix it.
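For illustration, a minimal sketch of that suggestion: compute the counts directly with no try/except, so any shape or dtype problem surfaces immediately instead of being swallowed. Only `pr`, `gt`, and the `tn` accumulation come from the diff; the helper name and the rest of the loop are hypothetical.

```python
import torch

def accumulate_confusion_counts(pred_masks, gt_masks):
    # Hypothetical helper: accumulate pixel-wise confusion counts over a batch.
    # No broad `except Exception` here, so a failing metric raises and can be
    # seen and fixed rather than silently skipped.
    tp = fp = tn = fn = 0
    for pr, gt in zip(pred_masks, gt_masks):
        pr, gt = pr.bool(), gt.bool()
        tp += int(torch.sum(pr & gt).item())
        fp += int(torch.sum(pr & ~gt).item())
        tn += int(torch.sum(~pr & ~gt).item())
        fn += int(torch.sum(~pr & gt).item())
    return tp, fp, tn, fn
```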
I would maybe swap the names and keep manifest.yaml for prod and manifest-test.yaml for dev.
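As a hypothetical sketch of that naming, the redeploy script could default to manifest.yaml and only pick up manifest-test.yaml behind an explicit flag; the `--dev` flag and the path handling below are assumptions, not part of redeploy_cellpose.py.

```python
import argparse
from pathlib import Path

parser = argparse.ArgumentParser()
parser.add_argument("--dev", action="store_true", help="Deploy with the test manifest")
args = parser.parse_args()

app_dir = Path("bioengine_apps/cellpose_finetuning")
# manifest.yaml stays the production default; manifest-test.yaml is opt-in for dev runs.
manifest = app_dir / ("manifest-test.yaml" if args.dev else "manifest.yaml")
print(f"Deploying with {manifest}")
```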
}

@schema_method(arbitrary_types_allowed=True)
async def export_model(
I am thinking it would be nice to include the training loss/metric history as a JSON file uploaded to the model artifact when calling export_model, so we can keep track of the training. Similarly, we should perhaps also serialize the training parameters as JSON and upload them to the same artifact.
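A rough sketch of what that could look like, assuming export_model has access to the per-epoch loss history and the training parameters, and that the artifact exposes some `upload_file(local_path, name)` callable; every name below is illustrative, not the actual BioEngine artifact API.

```python
import json
import tempfile
from pathlib import Path

def attach_training_records(upload_file, loss_history, train_params):
    # Hypothetical helper: serialize training history and parameters as JSON
    # and upload them to the same model artifact as the exported weights.
    with tempfile.TemporaryDirectory() as tmp:
        tmp = Path(tmp)

        history_path = tmp / "training_metrics.json"
        history_path.write_text(json.dumps(loss_history, indent=2))
        upload_file(history_path, "training_metrics.json")

        params_path = tmp / "training_parameters.json"
        params_path.write_text(json.dumps(train_params, indent=2, default=str))
        upload_file(params_path, "training_parameters.json")
```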
oeway left a comment
I did a quick review, and the changes look good to me! I assume you have already tested them?
I made a few comments; it would be nice if you could take a stab at them. Thanks!
total_batches = (nimg_per_epoch + batch_size - 1) // batch_size
batch_loss_per_sample = loss.item()  # Loss per sample for this batch
batch_callback(iepoch + 1, batch_idx, total_batches, batch_loss_per_sample, elapsed)
batch_callback(
I am not sure, but would it make sense to let batch_callback also carry another argument for the test metrics? It could be None for training batches but carry a value for validation ones. Does that make sense? I am not sure though.
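A minimal sketch of that signature, assuming an optional keyword argument that stays None for training batches; the argument name and the metrics dict shape are assumptions, only the first five parameters appear in the diff.

```python
from typing import Optional

def batch_callback(
    epoch: int,
    batch_idx: int,
    total_batches: int,
    batch_loss_per_sample: float,
    elapsed: float,
    val_metrics: Optional[dict] = None,
):
    # val_metrics is None for training batches and e.g. {"iou": 0.82} for validation.
    msg = (
        f"epoch {epoch} batch {batch_idx}/{total_batches} "
        f"loss {batch_loss_per_sample:.4f} ({elapsed:.1f}s)"
    )
    if val_metrics is not None:
        msg += f" | val: {val_metrics}"
    print(msg)
```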