-
Notifications
You must be signed in to change notification settings - Fork 561
Refactored the distributed_eval from the contrib package #1382
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
…ainer and as it was
| normalize_args={}, | ||
| cellpose_eval_args={}, | ||
| worker_logs_dir=None, | ||
| ): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This method only runs the distributed eval - no label merging. The reason for this is that there is a dask cluster resize between the eval and label merge, but in some packages where I want to use this I don't have the cluster resize option. Also I see the existing cluster annotation as a leaky abstraction - it says that you don't have to know anything about the cluster but you do and if you want it you need to pass me the proper parameters. So my option here is to make the dask Client argument as mandatory. If you don't want to distribute anything you can simply call the existing read_and_segment method
@GFleishman, @carsen-stringer , @mrariden
The purpose of this PR is to refactor the distributed_eval method so that I can use the code from this package directly in the command line tools package used by Janelia's Nextflow cellpose pipeline, which currently pretty much duplicate this code.
The main change was to no longer create the temporary zarr inside the distributed_eval but instead create it in the caller and pass it to the method.