Commit 42ae59d

merge(07-02): bring in workflow wrapper, test JSON, and Dockstore entry
2 parents 68217a6 + b2f54a9 commit 42ae59d

7 files changed: +176 -14 lines changed

.dockstore.yml

Lines changed: 4 additions & 1 deletion
@@ -333,4 +333,7 @@ workflows:
     primaryDescriptorPath: /pipes/WDL/workflows/classify_virnucpro_multi.wdl
   - name: align_and_generate_reads_report
     subclass: WDL
-    primaryDescriptorPath: /pipes/WDL/workflows/align_and_generate_PAF.wdl
+    primaryDescriptorPath: /pipes/WDL/workflows/align_and_generate_PAF.wdl
+  - name: join_read_classifications
+    subclass: WDL
+    primaryDescriptorPath: /pipes/WDL/workflows/join_read_classifications.wdl
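
A registration entry like the one added above is only useful if its `primaryDescriptorPath` resolves to a real file. The following is a minimal sketch of such a check, not part of this commit. It assumes the entries have already been parsed into dictionaries (for example with a YAML library) and that paths are written from the repository root with a leading slash, as in the entries shown here. `check_entries` and the `missing_example` entry are hypothetical names used for illustration.

```python
import os
import tempfile


def check_entries(entries, repo_root):
    """Return a list of problems found in Dockstore-style workflow entries.

    Hypothetical helper: each entry is a dict with "name" and
    "primaryDescriptorPath" keys, mirroring the YAML shown in the diff.
    """
    problems = []
    seen = set()
    for entry in entries:
        name = entry.get("name", "<unnamed>")
        if name in seen:
            problems.append(f"duplicate name: {name}")
        seen.add(name)
        path = entry.get("primaryDescriptorPath", "")
        if not path.startswith("/"):
            problems.append(f"{name}: descriptor path is not repo-absolute: {path!r}")
            continue
        # Assumption: the leading slash denotes the repository root.
        if not os.path.isfile(os.path.join(repo_root, path.lstrip("/"))):
            problems.append(f"{name}: descriptor not found: {path}")
    return problems


# Demo against a throwaway directory that stands in for the repository.
with tempfile.TemporaryDirectory() as root:
    wf_dir = os.path.join(root, "pipes", "WDL", "workflows")
    os.makedirs(wf_dir)
    with open(os.path.join(wf_dir, "join_read_classifications.wdl"), "w"):
        pass
    entries = [
        {"name": "join_read_classifications", "subclass": "WDL",
         "primaryDescriptorPath": "/pipes/WDL/workflows/join_read_classifications.wdl"},
        {"name": "missing_example", "subclass": "WDL",
         "primaryDescriptorPath": "/pipes/WDL/workflows/missing_example.wdl"},
    ]
    demo_problems = check_entries(entries, root)
    print(demo_problems)
```

Only the hypothetical `missing_example` entry is reported, because its descriptor file was never created in the demo directory.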

.planning/REQUIREMENTS.md

Lines changed: 3 additions & 3 deletions
@@ -72,18 +72,18 @@
 
 ### Standalone Workflow
 
-- [ ] **JRC-02**: Standalone workflow `pipes/WDL/workflows/join_read_classifications.wdl`
+- [x] **JRC-02**: Standalone workflow `pipes/WDL/workflows/join_read_classifications.wdl`
   - Imports `tasks_metagenomics.wdl`, calls task with alias `as join_reads`
   - `meta { allowNestedInputs: true }` for Terra-compatible test JSON
   - Passes `miniwdl check` validation
 
 ### Infrastructure
 
-- [ ] **JRC-03**: Test input JSON `test/input/WDL/miniwdl-local/test_inputs-join_read_classifications-local.json`
+- [x] **JRC-03**: Test input JSON `test/input/WDL/miniwdl-local/test_inputs-join_read_classifications-local.json`
   - Placeholder paths for all 4 optional File inputs plus sample_id
   - Workflow-level input keys; follows existing test file conventions
 
-- [ ] **JRC-04**: Dockstore registration entry in `.dockstore.yml` for `join_read_classifications.wdl`
+- [x] **JRC-04**: Dockstore registration entry in `.dockstore.yml` for `join_read_classifications.wdl`
   - Subclass: WDL
   - No `testParameterFiles` (placeholder paths not CI-runnable)
 

.planning/ROADMAP.md

Lines changed: 1 addition & 1 deletion
@@ -115,7 +115,7 @@ Plans:
 | 4. Core centrifuger Task | v3.0 | 1/1 | Complete | 2026-04-02 |
 | 5. centrifuger_single and centrifuger_multi Workflow Wrappers | v3.0 | 1/1 | Complete | 2026-04-02 |
 | 6. Test Input JSONs and Dockstore Registration | v3.0 | 1/1 | Complete | 2026-04-02 |
-| 7. join_read_classifications Task, Workflow, and Registration | v3.1 | 0/2 | Planning | - |
+| 7. join_read_classifications Task, Workflow, and Registration | v3.1 | 1/2 | In Progress| |
 
 ---
 *Last updated: 2026-04-02 — Phase 7 planned*

.planning/STATE.md

Lines changed: 11 additions & 9 deletions
@@ -3,15 +3,15 @@ gsd_state_version: 1.0
 milestone: v1.0
 milestone_name: milestone
 current_phase: 07
-status: executing
-stopped_at: Completed 07-01-PLAN.md
-last_updated: "2026-04-02T21:32:52.208Z"
+status: verifying
+stopped_at: Completed 07-02-PLAN.md
+last_updated: "2026-04-02T21:39:26.485Z"
 last_activity: 2026-04-02
 progress:
   total_phases: 4
-  completed_phases: 3
-  total_plans: 5
-  completed_plans: 4
+  completed_phases: 2
+  total_plans: 4
+  completed_plans: 3
   percent: 0
 ---
 
@@ -30,7 +30,7 @@ Phase: 07 (turn-join-read-classifications-py-script-into-a-wdl-pipeline-task-in-
 Plan: 2 of 2
 Milestone: v3.0 Centrifuger Taxonomic Classification WDL
 Current phase: 07
-Status: Ready to execute
+Status: Phase complete — ready for verification
 Last activity: 2026-04-02
 
 Progress: [░░░░░░░░░░] 0% (0/3 phases complete)
@@ -65,6 +65,8 @@ Recent decisions affecting current work:
 - [Phase 05]: Sample names derived from basename(bam, .bam) inside bash loop — String sample_name input removed
 - [Phase 06]: No testParameterFiles on centrifuger Dockstore entries — placeholder paths not CI-runnable
 - [Phase 07]: join_read_classifications task in tasks_metagenomics.wdl — File? optional inputs with __NONE__ sentinel, 16 GB/1 CPU runtime, DuckDB 4-way FULL OUTER JOIN logic embedded verbatim
+- [Phase 07]: Call alias as join_reads — WDL disallows call name = containing workflow name (per D-10)
+- [Phase 07]: No testParameterFiles in Dockstore entry for join_read_classifications — placeholder paths not CI-runnable
 
 ### Pending Todos
 
@@ -83,7 +85,7 @@ None.
 
 ## Session Continuity
 
-Last session: 2026-04-02T21:32:52.205Z
-Stopped at: Completed 07-01-PLAN.md
+Last session: 2026-04-02T21:39:22.390Z
+Stopped at: Completed 07-02-PLAN.md
 Resume file: None
 Next action: `/gsd:plan-phase 4`
Lines changed: 117 additions & 0 deletions
@@ -0,0 +1,117 @@
+---
+phase: 07-turn-join-read-classifications-py-script-into-a-wdl-pipeline-task-in-task-utils
+plan: 02
+subsystem: metagenomics
+tags: [wdl, dockstore, workflow-wrapper, parquet, kallisto, kraken2, virnucpro, genomad]
+
+# Dependency graph
+requires:
+  - phase: 07-01
+    provides: join_read_classifications WDL task in tasks_metagenomics.wdl
+provides:
+  - join_read_classifications standalone workflow wrapper
+  - Placeholder test input JSON for join_read_classifications
+  - Dockstore registration entry for join_read_classifications
+affects:
+  - Dockstore discoverability of join_read_classifications on Terra/DNAnexus
+
+# Tech tracking
+tech-stack:
+  added: []
+  patterns:
+    - Standalone workflow wrapper importing tasks_metagenomics.wdl with task alias (as join_reads)
+    - allowNestedInputs: true for Terra-compatible placeholder test JSON
+
+key-files:
+  created:
+    - pipes/WDL/workflows/join_read_classifications.wdl
+    - test/input/WDL/miniwdl-local/test_inputs-join_read_classifications-local.json
+  modified:
+    - .dockstore.yml
+
+key-decisions:
+  - "Call alias as join_reads — WDL disallows call name = containing workflow name (D-10)"
+  - "No testParameterFiles in Dockstore entry — placeholder paths not CI-runnable (D-09)"
+  - "allowNestedInputs: true — required for Terra-compatible test JSON with workflow-level input keys"
+
+patterns-established:
+  - "Workflow wrapper with task alias avoids call-name = workflow-name collision"
+
+requirements-completed:
+  - JRC-02
+  - JRC-03
+  - JRC-04
+
+# Metrics
+duration: 2min
+completed: 2026-04-02
+---
+
+# Phase 7 Plan 02: join_read_classifications Workflow Wrapper Summary
+
+**Standalone workflow wrapper + placeholder test JSON + Dockstore entry for join_read_classifications — completing full task+workflow+test+registration treatment consistent with all prior phases**
+
+## Performance
+
+- **Duration:** 2 min
+- **Started:** 2026-04-02T21:37:02Z
+- **Completed:** 2026-04-02T21:38:35Z
+- **Tasks:** 2
+- **Files modified:** 3
+
+## Accomplishments
+
+- Created `join_read_classifications.wdl` workflow wrapper importing `tasks_metagenomics.wdl` and calling the task with alias `join_reads` (per D-10)
+- Added `allowNestedInputs: true` in meta block for Terra-compatible test JSON
+- All 4 optional `File?` inputs and required `String sample_id` passed through to task; `classifications_parquet` exposed as workflow output
+- `miniwdl check` exits 0 with no errors
+- Created placeholder test input JSON with workflow-level keys for all 5 inputs
+- Appended `join_read_classifications` entry to `.dockstore.yml` with `subclass: WDL`, no `testParameterFiles`
+
+## Task Commits
+
+Each task was committed atomically:
+
+1. **Task 1: Create standalone workflow wrapper** - `d05801d0` (feat)
+2. **Task 2: Create test input JSON and Dockstore entry** - `a812d49b` (feat)
+
+## Files Created/Modified
+
+- `pipes/WDL/workflows/join_read_classifications.wdl` - Standalone workflow wrapper (33 lines)
+- `test/input/WDL/miniwdl-local/test_inputs-join_read_classifications-local.json` - Placeholder test JSON (7 lines)
+- `.dockstore.yml` - Appended join_read_classifications entry (3 lines added)
+
+## Decisions Made
+
+- Used call alias `as join_reads` — WDL disallows a call with the same name as the enclosing workflow. This follows the established pattern (`parse_reads`, `classify_contigs`, `classify_reads`).
+- No `testParameterFiles` in Dockstore entry — placeholder paths are not CI-runnable (consistent with all prior phase entries per STATE.md decision log).
+- `allowNestedInputs: true` added to meta — required for Terra to accept the `workflow.input` key format in test JSON.
+
+## Deviations from Plan
+
+None - plan executed exactly as written.
+
+## Issues Encountered
+
+None.
+
+## User Setup Required
+
+None - no external service configuration required.
+
+## Next Phase Readiness
+
+- Phase 07 is complete — `join_read_classifications` task + workflow + test + registration fully implemented
+- The workflow is discoverable on Dockstore and callable as a standalone workflow on Terra/DNAnexus
+
+## Self-Check: PASSED
+
+- FOUND: pipes/WDL/workflows/join_read_classifications.wdl
+- FOUND: test/input/WDL/miniwdl-local/test_inputs-join_read_classifications-local.json
+- FOUND: join_read_classifications entry in .dockstore.yml
+- FOUND: commit d05801d0
+- FOUND: commit a812d49b
+
+---
+*Phase: 07-turn-join-read-classifications-py-script-into-a-wdl-pipeline-task-in-task-utils*
+*Completed: 2026-04-02*

pipes/WDL/workflows/join_read_classifications.wdl

Lines changed: 33 additions & 0 deletions
@@ -0,0 +1,33 @@
+version 1.0
+
+import "../tasks/tasks_metagenomics.wdl" as metagenomics
+
+workflow join_read_classifications {
+    meta {
+        description: "Join read-level classifications from Kallisto, Kraken2, VirNucPro, and geNomad into a single ZSTD-compressed Parquet file keyed on SAMPLE_ID + READ_ID."
+        author: "Broad Viral Genomics"
+        email: "viral-ngs@broadinstitute.org"
+        allowNestedInputs: true
+    }
+
+    input {
+        File? kallisto_summary
+        File? kraken2_reads
+        File? vnp_reads
+        File? genomad_virus_summary
+        String sample_id
+    }
+
+    call metagenomics.join_read_classifications as join_reads {
+        input:
+            kallisto_summary = kallisto_summary,
+            kraken2_reads = kraken2_reads,
+            vnp_reads = vnp_reads,
+            genomad_virus_summary = genomad_virus_summary,
+            sample_id = sample_id
+    }
+
+    output {
+        File classifications_parquet = join_reads.classifications_parquet
+    }
+}
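
The decision log in this commit states that WDL disallows a call whose name equals the enclosing workflow name, which is why the call above is aliased `as join_reads`. The sketch below illustrates that rule with a rough regex scan. It is an assumption-laden illustration, not a WDL parser and not a substitute for `miniwdl check`. `colliding_calls` is a hypothetical helper, and the scan would be fooled by the words `workflow` or `call` at the start of a line inside a string or comment.

```python
import re

# Rough patterns: first workflow declaration, and each call with an optional alias.
WORKFLOW_RE = re.compile(r"^\s*workflow\s+(\w+)", re.MULTILINE)
CALL_RE = re.compile(r"^\s*call\s+([\w.]+)(?:\s+as\s+(\w+))?", re.MULTILINE)


def colliding_calls(wdl_text):
    """Return call targets whose effective name equals the workflow name.

    The effective name is the alias if one is given, otherwise the task name
    with any import namespace stripped.
    """
    match = WORKFLOW_RE.search(wdl_text)
    if match is None:
        return []
    workflow_name = match.group(1)
    collisions = []
    for target, alias in CALL_RE.findall(wdl_text):
        effective = alias or target.split(".")[-1]
        if effective == workflow_name:
            collisions.append(target)
    return collisions


aliased = """
version 1.0
import "../tasks/tasks_metagenomics.wdl" as metagenomics
workflow join_read_classifications {
  call metagenomics.join_read_classifications as join_reads { input: sample_id = "s" }
}
"""
# Same text with the call alias removed, to show what the alias prevents.
unaliased = aliased.replace(" as join_reads", "")

print(colliding_calls(aliased))
print(colliding_calls(unaliased))
```

With the alias in place the scan finds no collision. Without it, the imported task's name matches the workflow name and is reported.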

test/input/WDL/miniwdl-local/test_inputs-join_read_classifications-local.json

Lines changed: 7 additions & 0 deletions
@@ -0,0 +1,7 @@
+{
+  "join_read_classifications.kallisto_summary": "test/input/placeholder_kallisto_summary.parquet",
+  "join_read_classifications.kraken2_reads": "test/input/placeholder_kraken2_reads.parquet",
+  "join_read_classifications.vnp_reads": "test/input/placeholder_vnp_reads.parquet",
+  "join_read_classifications.genomad_virus_summary": "test/input/placeholder_genomad_virus_summary.tsv",
+  "join_read_classifications.sample_id": "placeholder_sample"
+}
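
The test JSON uses workflow-level keys of the form `<workflow>.<input>`. As a minimal sketch (not part of this commit), the keys can be cross-checked against the inputs declared in the workflow. `check_input_keys` is a hypothetical helper, and it treats every declared input as expected, which matches this placeholder file but is stricter than WDL requires for optional `File?` inputs.

```python
import json


def check_input_keys(inputs_json, workflow_name, declared_inputs):
    """Compare workflow-qualified JSON keys against declared input names."""
    data = json.loads(inputs_json)
    prefix = workflow_name + "."
    # Keys that do not carry the "<workflow>." prefix at all.
    unqualified = sorted(k for k in data if not k.startswith(prefix))
    provided = {k[len(prefix):] for k in data if k.startswith(prefix)}
    declared = set(declared_inputs)
    return {
        "unqualified": unqualified,
        "unknown": sorted(provided - declared),
        "missing": sorted(declared - provided),
    }


# Input names copied from the workflow's input block in this commit.
declared = ["kallisto_summary", "kraken2_reads", "vnp_reads",
            "genomad_virus_summary", "sample_id"]

example_json = """
{
  "join_read_classifications.kallisto_summary": "test/input/placeholder_kallisto_summary.parquet",
  "join_read_classifications.kraken2_reads": "test/input/placeholder_kraken2_reads.parquet",
  "join_read_classifications.vnp_reads": "test/input/placeholder_vnp_reads.parquet",
  "join_read_classifications.genomad_virus_summary": "test/input/placeholder_genomad_virus_summary.tsv",
  "join_read_classifications.sample_id": "placeholder_sample"
}
"""

result = check_input_keys(example_json, "join_read_classifications", declared)
print(result)
```

For the file in this commit all three lists come back empty: every key is workflow-qualified and each of the five declared inputs has an entry.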
