Commit c4d5dd2

rxu17, danlu1, and Chelsea-Na authored
RE-MERGE [GEN-2152] Create user defined maf center list (#41)
* draft pr to allow process center list by editing nf-genie code only
* allow user-defined maf centers in maf_processing
* when maf_centers is ALL, enforce processing the first center first
* enable state dependency in process_maf
* update comments
* fix a typo
* add all_centers for TESTING project
* rename methods and fix typo
* emphasize append only if using user-defined maf_center list
* [GEN-2152] Add max forks to limit concurrency (#40)
* add missing 19 series, correct 20 series in map
* [GEN-2152] Allow user-defined maf center list in maf processing step (#38)
* allow user-defined maf centers in maf_processing
* when maf_centers is ALL, enforce processing the first center first
* enable state dependency in process_maf
* add all_centers for TESTING project
* Revert "[GEN-2152] Allow user-defined maf center list in maf processing step …" (#39)

This reverts commit 96d1d02.

* add max forks to limit concurrency

---------

Co-authored-by: Chelsea-Na <109613735+Chelsea-Na@users.noreply.github.com>
Co-authored-by: Dan Lu <90745557+danlu1@users.noreply.github.com>

---------

Co-authored-by: danlu1 <dan.lu@sagebase.org>
Co-authored-by: Chelsea-Na <109613735+Chelsea-Na@users.noreply.github.com>
Co-authored-by: Dan Lu <90745557+danlu1@users.noreply.github.com>
1 parent 07107a0 commit c4d5dd2

4 files changed: +76 -4 lines changed

README.md

Lines changed: 14 additions & 1 deletion
@@ -82,10 +82,23 @@ Note that all the docker parameters have set default docker containers based on
 ```
 
 * Processes **mutation** files on test pipeline
-
+1. To execute the MAF process for all centers, you can either specify the `maf_centers` as "ALL" or leave it blank.
 ```
 nextflow run main.nf -profile aws_test --process_type maf_process --create_new_maf_db -with-docker ghcr.io/sage-bionetworks/genie:main
 ```
+Or
+```
+nextflow run main.nf -profile aws_test --process_type maf_process --maf_centers ALL --create_new_maf_db -with-docker ghcr.io/sage-bionetworks/genie:main
+```
+2. To execute the MAF process for a single center, you can specify the `maf_centers` parameter using the name of that center.
+```
+nextflow run main.nf -profile aws_test --process_type maf_process --maf_centers TEST --create_new_maf_db -with-docker ghcr.io/sage-bionetworks/genie:main
+```
+
+3. To execute the MAF process for multiple centers, you can specify the `maf_centers` as a comma-separated list of center names and **append** results to the MAF table.
+```
+nextflow run main.nf -profile aws_test --process_type maf_process --maf_centers TEST,SAGE --create_new_maf_db false -with-docker ghcr.io/sage-bionetworks/genie:main
+```
 
 * Runs **processing** and **consortium** release (including data guide creation) on test pipeline
 ```
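
The three README invocations differ only in `--maf_centers` and in how `--create_new_maf_db` is passed. As a rough illustration, the sketch below assembles the same command lines in Python. `build_maf_command` and `IMAGE` are hypothetical names introduced here and are not part of nf-genie; the sketch relies on the Nextflow convention that a bare `--flag` is read as `true`.

```python
import shlex

# Hypothetical helper, not part of nf-genie: it only reproduces the
# `nextflow run` command lines shown in the README hunk.
IMAGE = "ghcr.io/sage-bionetworks/genie:main"


def build_maf_command(maf_centers=None, create_new_maf_db=True):
    """Return the command line for the maf_process step as a single string."""
    args = ["nextflow", "run", "main.nf", "-profile", "aws_test",
            "--process_type", "maf_process"]
    if maf_centers:
        # Omitting the option leaves the pipeline default ("ALL") in effect.
        args += ["--maf_centers", ",".join(maf_centers)]
    if create_new_maf_db:
        # A bare flag is interpreted by Nextflow as true.
        args += ["--create_new_maf_db"]
    else:
        # An explicit "false" asks the run to append to the existing table.
        args += ["--create_new_maf_db", "false"]
    args += ["-with-docker", IMAGE]
    return shlex.join(args)


print(build_maf_command())                                  # all centers
print(build_maf_command(["TEST"]))                          # single center
print(build_maf_command(["TEST", "SAGE"], create_new_maf_db=False))  # append
```

The center names must be joined with a comma and no spaces, since the pipeline splits the value on `,` without trimming.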

main.nf

Lines changed: 53 additions & 3 deletions
@@ -14,6 +14,7 @@ include { reset_processing } from './modules/reset_processing'
 include { validate_data } from './modules/validate_data'
 include { process_main } from './modules/process_main'
 include { process_maf } from './modules/process_maf'
+include { process_maf as process_maf_remaining_centers} from './modules/process_maf'
 
 // SET PARAMETERS
 
@@ -29,6 +30,10 @@ params.center = "ALL"
 params.create_new_maf_db = false
 // release name (pass in TEST.public to test the public release scripts)
 params.release = "TEST.consortium"
+// List all centers to be processed for maf_process
+params.maf_centers = "ALL"
+// List of centers to be processed converted from params.maf_centers
+maf_center_list = params.maf_centers?.split(",").toList()
 
 // Validate input parameters
 WorkflowMain.initialise(workflow, params, log)
@@ -90,6 +95,7 @@ if (major_release == "TEST") {
     center_map_synid = "syn11601248"
     is_prod = false
     is_staging = false
+
 } else if (major_release == "STAGING"){
     project_id = "syn22033066"
     center_map_synid = "syn22089188"
@@ -104,6 +110,49 @@ else {
     is_staging = false
 }
 
+// Extract center list for MAF processing
+if (major_release == "TEST"){
+    all_centers = ["SAGE", "TEST", "GOLD"]
+} else {
+    all_centers = ["JHU","DFCI","GRCC","NKI","MSK","UHN","VICC","MDA","WAKE","YALE","UCSF","CRUK","CHOP","VHIO","SCI","PROV","COLU","UCHI","DUKE","UMIAMI"]
+}
+if (params.maf_centers == "ALL") {
+    maf_center_list = all_centers
+}
+
+def process_maf_helper(maf_centers, ch_project_id, maf_center_list, create_new_maf_db) {
+    /**
+     * Processes MAF files for a given center list.
+     *
+     * @param maf_centers Parameter containing the centers to be processed, can be "ALL" or a comma-separated list
+     * @param ch_project_id Channel with project ID
+     * @param maf_center_list List of centers to be processed converted from params.maf_centers
+     * @param create_new_maf_db Boolean flag to create new DB
+     * @return A collect output of the MAF process
+     */
+
+    // Create a channel from the list of centers
+    ch_maf_centers = Channel.fromList(maf_center_list)
+    // placeholder for previous output
+    previous = "default"
+    // If maf_centers is "ALL", we will process all centers in the maf_center_list
+    if (maf_centers == "ALL") {
+        // Create a channel to indicate whether it's the first center or not
+        ch_maf_centers = ch_maf_centers.branch { v ->
+            first: v == maf_center_list[0]
+            remaining: v != maf_center_list[0]
+        }
+        // Process the first center with createNewMafDb as true
+        process_maf_first_center = process_maf(previous, ch_project_id, ch_maf_centers.first, true).collect()
+        // Process the rest with createNewMafDb as false
+        return process_maf_remaining_centers(process_maf_first_center, ch_project_id, ch_maf_centers.remaining, false).collect()
+
+    } else {
+        // Process centers as the specified maf center list
+        return process_maf(previous, ch_project_id, ch_maf_centers, create_new_maf_db).collect()
+    }
+}
+
 workflow {
     ch_release = Channel.value(params.release)
     ch_project_id = Channel.value(project_id)
@@ -120,13 +169,14 @@ workflow {
         validate_data(ch_project_id, ch_center)
         // validate_data.out.view()
     } else if (params.process_type == "maf_process") {
-        process_maf(ch_project_id, ch_center, params.create_new_maf_db)
+        // Call the function
+        process_maf_helper(params.maf_centers, ch_project_id, maf_center_list, params.create_new_maf_db)
         // process_maf.out.view()
     } else if (params.process_type == "main_process") {
         process_main("default", ch_project_id, ch_center)
     } else if (params.process_type == "consortium_release") {
-        process_maf(ch_project_id, ch_center, params.create_new_maf_db)
-        process_main(process_maf.out, ch_project_id, ch_center)
+        process_maf_col = process_maf_helper(params.maf_centers, ch_project_id, maf_center_list, params.create_new_maf_db)
+        process_main(process_maf_col, ch_project_id, ch_center)
         create_consortium_release(process_main.out, ch_release, ch_is_prod, ch_seq_date, ch_is_staging)
         create_data_guide(create_consortium_release.out, ch_release, ch_project_id)
         if (!is_staging) {
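
To make the branching in `process_maf_helper` easier to follow, here is a minimal Python model of which `create_new_maf_db` value each center ends up with. It is a sketch only: `resolve_centers` and `plan_runs` are hypothetical names, the real scheduling is done by Nextflow channels, and the order of the centers after the first one is not guaranteed by the pipeline even though the list below has one.

```python
# Hypothetical model of the center-selection logic in main.nf.
# It mirrors the diff's control flow but runs no Nextflow code.
TEST_CENTERS = ["SAGE", "TEST", "GOLD"]  # all_centers for the TEST project


def resolve_centers(maf_centers, all_centers):
    """Expand "ALL" to every center, otherwise split on commas (no trimming)."""
    if maf_centers == "ALL":
        return list(all_centers)
    return maf_centers.split(",")


def plan_runs(maf_centers, all_centers, create_new_maf_db):
    """Return (center, create_new_maf_db) pairs as the helper would submit them."""
    centers = resolve_centers(maf_centers, all_centers)
    if maf_centers == "ALL":
        # The first center always recreates the table; the rest wait for it
        # and append, regardless of the create_new_maf_db parameter.
        return [(centers[0], True)] + [(c, False) for c in centers[1:]]
    # A user-defined list passes the parameter through to every center.
    return [(c, create_new_maf_db) for c in centers]


print(plan_runs("ALL", TEST_CENTERS, False))
print(plan_runs("TEST,SAGE", TEST_CENTERS, False))
```

In this model a user-defined list combined with `create_new_maf_db = true` would hand `true` to every listed center, which appears to be why the README pairs a multi-center list with `--create_new_maf_db false`.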

modules/process_maf.nf

Lines changed: 4 additions & 0 deletions
@@ -1,13 +1,17 @@
 process process_maf {
+    maxForks 1
     debug true
     container "$params.main_pipeline_docker"
     secret 'SYNAPSE_AUTH_TOKEN'
 
     input:
+        val previous
         val proj_id
         val center
         val create_new_maf_db
 
+    tag "processing_${center}"
+
     output:
         stdout
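
The commit message describes `maxForks 1` as a way to limit concurrency, so that only one `process_maf` task runs at a time. The Python analogy below uses a semaphore of size 1 to show the same effect: several tasks are started together, but at most one is active at any moment. Nothing here is Nextflow API, and `process_center` is a hypothetical stand-in for the real task.

```python
import threading
import time

# Analogy only: a semaphore with one slot plays the role of `maxForks 1`.
slots = threading.Semaphore(1)
counter_lock = threading.Lock()
running = 0  # tasks currently inside the guarded section
peak = 0     # highest value `running` ever reached


def process_center(center):
    """Hypothetical stand-in for one process_maf task."""
    global running, peak
    with slots:  # blocks until the single slot is free
        with counter_lock:
            running += 1
            peak = max(peak, running)
        time.sleep(0.01)  # pretend to write this center's rows
        with counter_lock:
            running -= 1


threads = [threading.Thread(target=process_center, args=(c,))
           for c in ["SAGE", "TEST", "GOLD"]]
for t in threads:
    t.start()
for t in threads:
    t.join()

print("peak concurrent tasks:", peak)
```

Like `maxForks`, the semaphore limits how many tasks run at once but says nothing about their order; in the diff, ordering comes from the `previous` input, which makes the remaining centers wait for the first one.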
1317

nextflow_schema.json

Lines changed: 5 additions & 0 deletions
@@ -31,6 +31,11 @@
     "description": "Pick a center to process or validate. Defaults to ALL which means all centers. This value should be ALL if you pick consortium release or public release.",
     "default": "ALL"
 },
+"maf_centers": {
+    "type": "string",
+    "description": "The list of centers to be processed in the MAF processing. Defaults to ALL which means all centers.",
+    "default": "ALL"
+},
 "create_new_maf_db": {
     "type": "boolean",
     "description": "Create a new maf Synapse Table. Toggle this for every consortium release."
