
Commit 70a878c

authored
Merge pull request #11 from Genentech/wdl-defaults
Removed internal defaults. Updated docs.
2 parents 42e8879 + c13d61d commit 70a878c

5 files changed

Lines changed: 51 additions & 30 deletions


docs/faq.rst

Lines changed: 12 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -16,3 +16,15 @@ FAQ
1616
Please ensure TMPDIR has execution permissions or temporarily set TMPDIR to a directory that does::
1717

1818
TMPDIR=. pip install scallops
19+
20+
#. Unable to download model for Stardist or other deep learning models
21+
Some workflow environments (e.g., AWS HealthOmics) prohibit accessing public websites.
22+
You need to download model files to an accessible location, and set the workflow input, ``model_dir``,
23+
to this location.
24+
25+
Example::
26+
27+
wget https://github.com/stardist/stardist-models/releases/download/v0.1/python_2D_versatile_fluo.zip
28+
aws s3 cp python_2D_versatile_fluo.zip s3://my-bucket/model/
29+
30+
In your OPS workflow input JSON, set ``model_dir`` to ``s3://my-bucket/model/``.
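
The last step of the FAQ entry above can be scripted. The following is a minimal sketch, assuming the inputs file uses bare (un-namespaced) keys as in the docs examples; the helper name ``set_model_dir`` is hypothetical and not part of scallops.

```python
import json
from pathlib import Path


def set_model_dir(inputs_path, model_dir):
    """Add or overwrite ``model_dir`` in a workflow inputs JSON file.

    Hypothetical helper for illustration; all other keys are left untouched.
    """
    path = Path(inputs_path)
    inputs = json.loads(path.read_text())
    inputs["model_dir"] = model_dir
    # Rewrite the file in place with stable, readable formatting.
    path.write_text(json.dumps(inputs, indent=2) + "\n")
    return inputs
```

For example, ``set_model_dir("ops_input.json", "s3://my-bucket/model/")`` would point the workflow at the uploaded model archive from the FAQ example.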

docs/workflows.rst

Lines changed: 31 additions & 25 deletions
Original file line numberDiff line numberDiff line change
@@ -24,9 +24,9 @@ Workflow Steps
2424
3. **Stitching**:
2525

2626
* Applies the calculated flatfield to the raw tiles.
27-
* Corrects for radial distortion (can be disabled or `k` can be provided).
27+
* Corrects for radial distortion.
2828
* Aligns tiles using stage positions and cross-correlation.
29-
* Stitches tiles into a multi-scale OME-Zarr image.
29+
* Stitches tiles into an OME-Zarr image.
3030

3131
Inputs
3232
------
@@ -45,21 +45,25 @@ These are the absolute minimum parameters required to run the stitching workflow
4545
- Description
4646
* - **urls**
4747
- Array[String]
48-
- List of directories containing raw images (S3 or local paths).
48+
- List of directories containing raw images (e.g., S3 URLs).
4949
* - **image_pattern**
5050
- String
5151
- Regex-like pattern to parse filenames (e.g., ``"Well{well}_Point{point}.nd2"``).
5252
* - **output_directory**
5353
- String
5454
- Base path for outputs.
55+
* - **docker**
56+
- String
57+
- Workflow docker image.
5558

5659
.. code-block:: json
5760
:caption: Minimal Stitching JSON
5861
5962
{
6063
"urls": ["s3://your-bucket/experiment_data/"],
6164
"image_pattern": "20231010_10x_6W_SBS_c{t}/plate{plate}/Well{well}_Point{skip}_{skip}_Channel{skip}_Seq{skip}.nd2",
62-
"output_directory": "s3://your-bucket/experiment_data/stitch/iss/"
65+
"output_directory": "s3://your-bucket/experiment_data/stitch/iss/",
66+
"docker":"772311241819.dkr.ecr.us-west-2.amazonaws.com/scallops:1.0.0"
6367
}
6468
6569
Full Parameter Reference (Advanced)
@@ -118,7 +122,7 @@ Below is the complete list of exposed options, including optional settings for g
118122
- Force re-run of stitching even if output exists.
119123
* - **Resources**
120124
- Various
121-
- ``stitch_cpu``, ``stitch_memory``, ``stitch_disks``, etc. can be set to override defaults.
125+
- ``stitch_cpu``, ``stitch_memory``, etc. can be set to override defaults.
122126

123127
Outputs
124128
-------
@@ -186,6 +190,9 @@ These are the absolute minimum parameters required to run the OPS workflow.
186190
* - **reads_labels**
187191
- String
188192
- Which segmentation label to assign reads to (e.g., ``"cell"`` or ``"nuclei"``).
193+
* - **docker**
194+
- String
195+
- Workflow docker image.
189196

190197
.. code-block:: json
191198
:caption: Minimal OPS JSON
@@ -196,7 +203,8 @@ These are the absolute minimum parameters required to run the OPS workflow.
196203
"phenotype_url": "s3://your-bucket/experiment/stitch/pheno/stitch/stitch.zarr/",
197204
"phenotype_dapi_channel": 4,
198205
"phenotype_cyto_channel": [6],
199-
"reads_labels": "cell"
206+
"reads_labels": "cell",
207+
"docker":"772311241819.dkr.ecr.us-west-2.amazonaws.com/scallops:1.0.0"
200208
}
201209
202210
Full Parameter Reference (Advanced)
@@ -225,9 +233,6 @@ Below is the complete list of exposed options covering registration, feature ext
225233
* - **subset**
226234
- Array[String]
227235
- Filter specific wells/plates.
228-
* - **batch_size**
229-
- Int
230-
- Number of groups to process in one batch.
231236

232237
**Segmentation & Registration**
233238

@@ -313,7 +318,7 @@ Below is the complete list of exposed options covering registration, feature ext
313318
- Array[Float]
314319
- Sigma for Laplacian of Gaussian spot detection.
315320

316-
**Control Flags (Force / Skip)**
321+
**Additional Parameters**
317322

318323
.. list-table::
319324
:widths: 30 15 55
@@ -322,19 +327,21 @@ Below is the complete list of exposed options covering registration, feature ext
322327
* - Parameter
323328
- Type
324329
- Description
325-
* - **run_spot_detect**
326-
- Boolean
327-
- Default ``true``.
328-
* - **run_nuclei_segmentation**
329-
- Boolean
330-
- Default ``true``.
331-
* - **run_cell_segmentation**
330+
* - **model_dir**
331+
- String
332+
- Path containing deep learning model resources (see :doc:`FAQ <faq>` for more details).
333+
* - **run_<task>**
332334
- Boolean
333-
- Default ``true``.
334-
* - **force_merge**
335+
- Set to ``false`` to skip a task (e.g., ``run_nuclei_segmentation``).
336+
* - **force_<task>**
335337
- Boolean
336-
- Force re-merge even if output exists.
337-
338+
- Set to ``true`` to re-run a task even if output exists (e.g., ``force_segment_cell``).
339+
* - **Resources**
340+
- Various
341+
- ``segment_nuclei_cpu``, ``segment_nuclei_memory``, etc. can be set to override defaults.
342+
* - **batch_size**
343+
- Int
344+
- Number of groups to process in one batch.
338345
Outputs
339346
-------
340347

@@ -396,8 +403,7 @@ Create a JSON file (e.g., ``ops_input.json``) defining your inputs. Below is a m
396403
"segment_cell_threshold_correction_factor": 1.0,
397404
"cell_segmentation_extra_arguments": "--closing-radius 5",
398405
399-
"docker_registry": "123456789012.dkr.ecr.us-region-1.amazonaws.com",
400-
"docker_version": "latest"
406+
"docker": "123456789012.dkr.ecr.us-region-1.amazonaws.com/scallops:latest"
401407
}
402408
403409
Step 2: Run with miniwdl-omics-run
@@ -451,7 +457,7 @@ Suppose you only need to perform image registration without the full segmentatio
451457
String moving_image
452458
String fixed_image
453459
String output_dir
454-
String docker_registry
460+
String docker
455461
}
456462
457463
# Call the existing Scallops registration task
@@ -462,7 +468,7 @@ Suppose you only need to perform image registration without the full segmentatio
462468
transform_output_directory = output_dir + "/transforms",
463469
moving_output_directory = output_dir + "/registered_images",
464470
# Pass through required runtime parameters
465-
docker = docker_registry + "/scallops:latest",
471+
docker = docker,
466472
cpu = 4,
467473
memory = "16 GiB",
468474
# ... (other required inputs like zones, disks, etc.)
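
Existing inputs files that still carry the removed ``docker_registry`` and ``docker_version`` keys need to be folded into the single ``docker`` input. The following is a minimal sketch of such a migration, not something shipped in this commit: the function name ``migrate_docker_inputs`` is hypothetical, and the ``scallops`` image name and ``latest`` fallback tag are assumptions taken from the old ``docker_registry + "/scallops:latest"`` expression shown in the hunk above.

```python
def migrate_docker_inputs(inputs, image_name="scallops"):
    """Fold legacy ``docker_registry``/``docker_version`` keys into ``docker``.

    Hypothetical helper for illustration. Returns a new dict; the input is
    not modified. An existing ``docker`` value is never overwritten.
    """
    migrated = dict(inputs)
    registry = migrated.pop("docker_registry", None)
    # Assumption: fall back to the "latest" tag when no version was given.
    version = migrated.pop("docker_version", "latest")
    if registry is not None and "docker" not in migrated:
        migrated["docker"] = f"{registry.rstrip('/')}/{image_name}:{version}"
    return migrated
```

Note that a ``docker_version`` key without a matching ``docker_registry`` is simply dropped, since there is nothing to combine it with.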

scallops/tests/test_wdl.py

Lines changed: 3 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -55,6 +55,7 @@ def test_stitch_wdl_z_stack(tmp_path):
5555
"z_index": "focus",
5656
"stitch_radial_correction_k": "none",
5757
"output_directory": str(tmp_path / "out"),
58+
"docker": "",
5859
}
5960

6061
with open(tmp_path / "inputs.json", "wt") as out:
@@ -99,6 +100,7 @@ def test_stitch_wdl(tmp_path):
99100
"stitch_workflow.urls": [str(input_path)],
100101
"stitch_workflow.image_pattern": "{well}-{skip}.zarr",
101102
"stitch_workflow.output_directory": str(tmp_path / "out"),
103+
"stitch_workflow.docker": "",
102104
}
103105

104106
with open(tmp_path / "inputs.json", "wt") as out:
@@ -176,6 +178,7 @@ def test_ops_wdl(tmp_path):
176178
"ops_workflow.mark_stitch_boundary_cells": False,
177179
"ops_workflow.reads_labels": "cell",
178180
"ops_workflow.merge_extra_arguments": "--format parquet",
181+
"ops_workflow.docker": "",
179182
}
180183

181184
with open(tmp_path / "inputs.json", "wt") as out:

wdl/ops_workflow.wdl

Lines changed: 3 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -64,7 +64,7 @@ workflow ops_workflow {
6464
String? reads_threshold_peaks_crosstalk
6565
String? reads_extra_arguments
6666

67-
String model_dir = "s3://bigdipir-ctg-s3/models/"
67+
String model_dir = ""
6868

6969
# nuclei segment
7070
String? nuclei_segmentation
@@ -151,11 +151,11 @@ workflow ops_workflow {
151151
String cell_intersects_boundary_disks = "local-disk 200 HDD"
152152

153153

154-
String docker = "563221710766.dkr.ecr.us-west-2.amazonaws.com/external/ctg/scallops:latest"
154+
String docker
155155

156156
Int preemptible = 0
157157
String zones = "us-west1-a us-west1-b us-west1-c"
158-
String aws_queue_arn = "arn:aws:batch:us-west-2:752311211819:job-queue/gred"
158+
String aws_queue_arn = ""
159159
Int max_retries = 0
160160

161161
String segment_suffix = "segment.zarr"

wdl/stitch_workflow.wdl

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -42,11 +42,11 @@ workflow stitch_workflow {
4242
Boolean? force_illumination_correction
4343
Boolean? force_stitch
4444

45-
String docker = "563221710766.dkr.ecr.us-west-2.amazonaws.com/external/ctg/scallops:latest"
45+
String docker
4646

4747
Int preemptible = 0
4848
String zones = "us-west1-a us-west1-b us-west1-c"
49-
String aws_queue_arn = "arn:aws:batch:us-west-2:752311211819:job-queue/gred"
49+
String aws_queue_arn = ""
5050
Int max_retries = 0
5151

5252
}
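
Because ``String docker`` no longer has a default in either workflow, an inputs file that omits it will be rejected. The following is a minimal sketch of a pre-flight check, assuming inputs may use either bare keys (``docker``) or workflow-namespaced keys (``ops_workflow.docker``) as both forms appear in this commit; the function name ``missing_required`` is hypothetical. It checks presence only, since the tests above deliberately pass an empty string.

```python
def missing_required(inputs, required=("docker",), workflow=None):
    """Return the required input names that are absent from ``inputs``.

    Hypothetical helper for illustration. A name counts as present if it
    appears bare or prefixed with ``<workflow>.``; empty values are allowed.
    """
    missing = []
    for name in required:
        keys = [name]
        if workflow:
            keys.append(f"{workflow}.{name}")
        if not any(key in inputs for key in keys):
            missing.append(name)
    return missing


# Example: an OPS inputs dict written before this commit has no docker key.
legacy_inputs = {"ops_workflow.reads_labels": "cell"}
print(missing_required(legacy_inputs, workflow="ops_workflow"))  # ['docker']
```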
