
Conversation


@hugokallander commented on Dec 16, 2025

  • Added ensure_3_channels_batch
  • Added validation performance logging during training
  • Made the test deployment (instead of prod) the default. TODO: replace the ri-scale workspace with bioimage-io once access is granted

@hugokallander changed the title from "Add ensure_3_channels_batch to cellpose" to "Small Cellpose improvements" on Dec 16, 2025
tn += int(torch.sum(~pr & ~gt).item())
_ = batch_metrics  # keep for readability; not used further
except Exception as e:
    # Metrics are best-effort; never fail training because of them.
Contributor


In what condition would it fail? I would actually remove the except and let the error propagate, so we have a chance to see errors and fix them.
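One way to follow this suggestion, sketched under assumptions: drop the blanket swallow and re-raise with context so bugs surface instead of disappearing. `compute_batch_metrics` is a hypothetical stand-in for the metrics code inside the try block, not the actual function.

```python
def validate_batch(pr, gt, compute_batch_metrics):
    """Compute validation metrics; let failures propagate with context."""
    try:
        return compute_batch_metrics(pr, gt)
    except Exception as e:
        # Re-raise with context instead of swallowing, so errors are visible.
        raise RuntimeError("validation metrics failed") from e
```

This keeps a single place to attach context while still failing loudly, which is what the reviewer asks for.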

Contributor


I would maybe swap the names and keep manifest.yaml for prod, and manifest-test.yaml for dev.
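A minimal sketch of the naming the reviewer suggests, assuming manifest.yaml maps to production and manifest-test.yaml to dev/test; the helper name and `deploy_env` parameter are illustrative, not part of the actual codebase.

```python
from pathlib import Path

def manifest_path(base_dir: str, deploy_env: str = "test") -> Path:
    """Pick the manifest file for the given deployment environment.

    Under the suggested convention, prod gets the plain manifest.yaml
    and every other environment gets manifest-test.yaml.
    """
    name = "manifest.yaml" if deploy_env == "prod" else "manifest-test.yaml"
    return Path(base_dir) / name
```

Defaulting to the test manifest keeps the safer behavior the PR already introduces, while prod must be requested explicitly.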

}

@schema_method(arbitrary_types_allowed=True)
async def export_model(
Contributor


I am thinking it would be nice to include the training loss/metric history as a JSON file uploaded to the model artifact when calling export_model, so we can keep track of the training. Similarly, we should perhaps also serialize the training parameters as JSON and upload them to the same artifact.
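A hedged sketch of this suggestion: collect the training parameters and per-epoch history in one record and serialize them to JSON strings that export_model could upload next to the model. `TrainingRecord` and `to_json_files` are illustrative names, not existing APIs.

```python
import json
from dataclasses import dataclass, field

@dataclass
class TrainingRecord:
    params: dict                                        # training hyperparameters
    loss_history: list = field(default_factory=list)    # per-epoch training loss
    metric_history: list = field(default_factory=list)  # per-epoch validation metrics

def to_json_files(record: TrainingRecord) -> dict:
    """Return {filename: json_string} pairs ready to attach to a model artifact."""
    return {
        "training_params.json": json.dumps(record.params, indent=2),
        "training_history.json": json.dumps(
            {"loss": record.loss_history, "metrics": record.metric_history},
            indent=2,
        ),
    }
```

Keeping params and history as separate files makes it easy to diff hyperparameters between exported models without parsing the full history.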

Contributor

@oeway left a comment


I did a quick review; the changes look good to me! I assume you have already tested them?

I made a few comments; it would be nice if you could take a stab at them. Thanks!

total_batches = (nimg_per_epoch + batch_size - 1) // batch_size
batch_loss_per_sample = loss.item()  # Loss per sample for this batch
batch_callback(iepoch + 1, batch_idx, total_batches, batch_loss_per_sample, elapsed)
batch_callback(
Contributor


I am not sure, but would it make sense to let batch_callback also carry another argument for the test metrics? It could be None for training but have a value for validation. Does that make sense? I am not sure though.
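A sketch of what the suggested signature might look like, assuming an extra `metrics` argument that is None for training batches and a dict for validation batches. The signature and the formatted output are illustrative only, not the project's actual callback API.

```python
from typing import Optional

def batch_callback(
    epoch: int,
    batch_idx: int,
    total_batches: int,
    loss_per_sample: float,
    elapsed: float,
    metrics: Optional[dict] = None,  # None during training; dict during validation
) -> str:
    """Format one progress line; the phase is inferred from `metrics`."""
    phase = "val" if metrics is not None else "train"
    line = (
        f"[{phase}] epoch {epoch} batch {batch_idx}/{total_batches} "
        f"loss={loss_per_sample:.4f} ({elapsed:.1f}s)"
    )
    if metrics:
        line += " " + " ".join(f"{k}={v:.3f}" for k, v in metrics.items())
    return line
```

Making the argument default to None keeps every existing training-side call site working unchanged, which is the main appeal of this shape.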
