Commit db3bb48

Standardized stat names, placing aggregator function on the right (e.g. read_count, depth_mean, etc).
1 parent eeb33f5 commit db3bb48

9 files changed

Lines changed: 84 additions & 71 deletions

docs/bam_statistics.md

Lines changed: 3 additions & 3 deletions
@@ -24,14 +24,14 @@ A histogram of read qualities, using only records marked `prim` or `unmapped`. T

 A histogram of mapping qualities and gap-compressed identities, respectively.

-## `stat_num_reads`, `stat_read_length_mean`, `stat_read_length_median`, `stat_read_length_n50`, `stat_read_quality_mean`, `stat_read_quality_median`
+## `stat_read_count`, `stat_read_length_mean`, `stat_read_length_median`, `stat_read_length_n50`, `stat_read_quality_mean`, `stat_read_quality_median`

 Statistics computed using only records marked `prim` or `unmapped`.

 ## `stat_mapped_read_count`, `stat_mapped_percent`

 Count of primary alignments, and primary alignments as a percentage of total reads.

-## `stat_mean_gap_compressed_identity`
+## `stat_gap_compressed_identity_mean`, `stat_gap_compressed_identity_median`

-Mean gap-compressed identity of primary and supplementary alignments.
+Summary of gap-compressed identity for primary and supplementary alignments.

docs/family.md

Lines changed: 5 additions & 4 deletions
@@ -180,17 +180,18 @@ The `Sample` struct contains sample specific data and metadata. The struct has t
 | Array\[File\] | mosdepth_depth_distribution_plot | | |
 | Array\[File\] | mapq_distribution_plot | Distribution of mapping quality per alignment | |
 | Array\[File\] | mg_distribution_plot | Distribution of gap-compressed identity score per alignment | |
-| Array\[String\] | stat_num_reads | Number of reads | |
+| Array\[String\] | stat_read_count | Number of reads | |
 | Array\[String\] | stat_read_length_mean | Mean read length | |
 | Array\[String\] | stat_read_length_median | Median read length | |
 | Array\[String\] | stat_read_length_n50 | Read length N50 | |
 | Array\[String\] | stat_read_quality_mean | Mean read quality | |
 | Array\[String\] | stat_read_quality_median | Median read quality | |
 | Array\[String\] | stat_mapped_read_count | Count of reads mapped to reference | |
-| Array\[String\] | stat_mapped_percent | Percent of reads mapped to reference | |
-| Array\[String\] | stat_mean_gap_compressed_identity | Mean gap-compressed identity | |
+| Array\[String\] | stat_mapped_read_percent | Percent of reads mapped to reference | |
+| Array\[String\] | stat_gap_compressed_identity_mean | Mean gap-compressed identity | |
+| Array\[String\] | stat_gap_compressed_identity_median | Median gap-compressed identity | |
 | Array\[String\] | inferred_sex | Inferred sex | Sex is inferred based on relative depth of chrY alignments. |
-| Array\[String\] | stat_mean_depth | Mean depth | |
+| Array\[String\] | stat_depth_mean | Mean depth | |

 ### Small Variants (<50 bp)

docs/singleton.md

Lines changed: 5 additions & 4 deletions
@@ -136,17 +136,18 @@ flowchart TD
 | File | mosdepth_depth_distribution_plot | | |
 | File | mapq_distribution_plot | Distribution of mapping quality per alignment | |
 | File | mg_distribution_plot | Distribution of gap-compressed identity score per alignment | |
-| String | stat_num_reads | Number of reads | |
+| String | stat_read_count | Number of reads | |
 | String | stat_read_length_mean | Mean read length | |
 | String | stat_read_length_median | Median read length | |
 | String | stat_read_length_n50 | Read length N50 | |
 | String | stat_read_quality_mean | Mean read quality | |
 | String | stat_read_quality_median | Median read quality | |
 | String | stat_mapped_read_count | Count of reads mapped to reference | |
-| String | stat_mapped_percent | Percent of reads mapped to reference | |
-| String | stat_mean_gap_compressed_identity | Mean gap-compressed identity | |
+| String | stat_mapped_read_percent | Percent of reads mapped to reference | |
+| String | stat_gap_compressed_identity_mean | Mean gap-compressed identity | |
+| String | stat_gap_compressed_identity_median | Median gap-compressed identity | |
 | String | inferred_sex | Inferred sex | Sex is inferred based on relative depth of chrY alignments. |
-| String | stat_mean_depth | Mean depth | |
+| String | stat_depth_mean | Mean depth | |

 ### Small Variants (<50 bp)

wdl-ci.config.json

Lines changed: 11 additions & 5 deletions
@@ -377,7 +377,7 @@
 "png_validator"
 ]
 },
-"stat_num_reads": {
+"stat_read_count": {
 "value": "27398",
 "test_tasks": [
 "compare_string"
@@ -419,13 +419,19 @@
 "compare_string"
 ]
 },
-"stat_mapped_percent": {
+"stat_mapped_read_percent": {
 "value": "100.0",
 "test_tasks": [
 "compare_string"
 ]
 },
-"stat_mean_gap_compressed_identity": {
+"stat_gap_compressed_identity_mean": {
+"value": "99.77",
+"test_tasks": [
+"compare_string"
+]
+},
+"stat_gap_compressed_identity_median": {
 "value": "99.77",
 "test_tasks": [
 "compare_string"
@@ -1225,7 +1231,7 @@
 "png_validator"
 ]
 },
-"stat_mean_depth": {
+"stat_depth_mean": {
 "value": "0.07",
 "test_tasks": [
 "compare_string"
@@ -1274,7 +1280,7 @@
 "png_validator"
 ]
 },
-"stat_mean_depth": {
+"stat_depth_mean": {
 "value": "0.07",
 "test_tasks": [
 "compare_string"

workflows/downstream/downstream.wdl

Lines changed: 16 additions & 15 deletions
@@ -216,21 +216,22 @@ workflow downstream {
 String stat_phase_block_ng50 = hiphase.stat_phase_block_ng50

 # bam stats
-File bam_statistics = bam_stats.bam_statistics
-File read_length_plot = bam_stats.read_length_plot
-File? read_quality_plot = bam_stats.read_quality_plot
-File mapq_distribution_plot = bam_stats.mapq_distribution_plot
-File mg_distribution_plot = bam_stats.mg_distribution_plot
-String stat_num_reads = bam_stats.stat_num_reads
-String stat_read_length_mean = bam_stats.stat_read_length_mean
-String stat_read_length_median = bam_stats.stat_read_length_median
-String stat_read_length_n50 = bam_stats.stat_read_length_n50
-String stat_read_quality_mean = bam_stats.stat_read_quality_mean
-String stat_read_quality_median = bam_stats.stat_read_quality_median
-String stat_mapped_read_count = bam_stats.stat_mapped_read_count
-String stat_mapped_percent = bam_stats.stat_mapped_percent
-String stat_mean_gap_compressed_identity = bam_stats.stat_mean_gap_compressed_identity
-File trgt_coverage_dropouts = coverage_dropouts.dropouts
+File bam_statistics = bam_stats.bam_statistics
+File read_length_plot = bam_stats.read_length_plot
+File? read_quality_plot = bam_stats.read_quality_plot
+File mapq_distribution_plot = bam_stats.mapq_distribution_plot
+File mg_distribution_plot = bam_stats.mg_distribution_plot
+String stat_read_count = bam_stats.stat_read_count
+String stat_read_length_mean = bam_stats.stat_read_length_mean
+String stat_read_length_median = bam_stats.stat_read_length_median
+String stat_read_length_n50 = bam_stats.stat_read_length_n50
+String stat_read_quality_mean = bam_stats.stat_read_quality_mean
+String stat_read_quality_median = bam_stats.stat_read_quality_median
+String stat_mapped_read_count = bam_stats.stat_mapped_read_count
+String stat_mapped_read_percent = bam_stats.stat_mapped_read_percent
+String stat_gap_compressed_identity_mean = bam_stats.stat_gap_compressed_identity_mean
+String stat_gap_compressed_identity_median = bam_stats.stat_gap_compressed_identity_median
+File trgt_coverage_dropouts = coverage_dropouts.dropouts

 # small variant stats
 File small_variant_stats = bcftools_stats_roh_small_variants.stats

workflows/family.wdl

Lines changed: 21 additions & 19 deletions
@@ -236,16 +236,17 @@ workflow humanwgs_family {

 Map[String, Array[String]] stats = {
 'sample_id': sample_id,
-'num_reads': downstream.stat_num_reads,
+'read_count': downstream.stat_read_count,
 'read_length_mean': downstream.stat_read_length_mean,
 'read_length_median': downstream.stat_read_length_median,
 'read_length_n50': downstream.stat_read_length_n50,
 'read_quality_mean': downstream.stat_read_quality_mean,
 'read_quality_median': downstream.stat_read_quality_median,
 'mapped_read_count': downstream.stat_mapped_read_count,
-'mapped_percent': downstream.stat_mapped_percent,
-'mean_gap_compressed_identity': downstream.stat_mean_gap_compressed_identity,
-'mean_depth': upstream.stat_mean_depth,
+'mapped_read_percent': downstream.stat_mapped_read_percent,
+'gap_compressed_identity_mean': downstream.stat_gap_compressed_identity_mean,
+'gap_compressed_identity_median': downstream.stat_gap_compressed_identity_median,
+'depth_mean': upstream.stat_depth_mean,
 'inferred_sex': upstream.inferred_sex,
 'stat_phased_basepairs': downstream.stat_phased_basepairs,
 'phase_block_ng50': downstream.stat_phase_block_ng50,
@@ -284,20 +285,21 @@ workflow humanwgs_family {
 File msg_file = consolidate_stats.messages

 # bam stats
-Array[File] bam_statistics = downstream.bam_statistics
-Array[File] read_length_plot = downstream.read_length_plot
-Array[File?] read_quality_plot = downstream.read_quality_plot
-Array[File] mapq_distribution_plot = downstream.mapq_distribution_plot
-Array[File] mg_distribution_plot = downstream.mg_distribution_plot
-Array[String] stat_num_reads = downstream.stat_num_reads
-Array[String] stat_read_length_mean = downstream.stat_read_length_mean
-Array[String] stat_read_length_median = downstream.stat_read_length_median
-Array[String] stat_read_length_n50 = downstream.stat_read_length_n50
-Array[String] stat_read_quality_mean = downstream.stat_read_quality_mean
-Array[String] stat_read_quality_median = downstream.stat_read_quality_median
-Array[String] stat_mapped_read_count = downstream.stat_mapped_read_count
-Array[String] stat_mapped_percent = downstream.stat_mapped_percent
-Array[String] stat_mean_gap_compressed_identity = downstream.stat_mean_gap_compressed_identity
+Array[File] bam_statistics = downstream.bam_statistics
+Array[File] read_length_plot = downstream.read_length_plot
+Array[File?] read_quality_plot = downstream.read_quality_plot
+Array[File] mapq_distribution_plot = downstream.mapq_distribution_plot
+Array[File] mg_distribution_plot = downstream.mg_distribution_plot
+Array[String] stat_read_count = downstream.stat_read_count
+Array[String] stat_read_length_mean = downstream.stat_read_length_mean
+Array[String] stat_read_length_median = downstream.stat_read_length_median
+Array[String] stat_read_length_n50 = downstream.stat_read_length_n50
+Array[String] stat_read_quality_mean = downstream.stat_read_quality_mean
+Array[String] stat_read_quality_median = downstream.stat_read_quality_median
+Array[String] stat_mapped_read_count = downstream.stat_mapped_read_count
+Array[String] stat_mapped_read_percent = downstream.stat_mapped_read_percent
+Array[String] stat_gap_compressed_identity_mean = downstream.stat_gap_compressed_identity_mean
+Array[String] stat_gap_compressed_identity_median = downstream.stat_gap_compressed_identity_median

 # merged, haplotagged alignments
 Array[File] merged_haplotagged_bam = downstream.merged_haplotagged_bam
@@ -308,7 +310,7 @@ workflow humanwgs_family {
 Array[File] mosdepth_region_bed = upstream.mosdepth_region_bed
 Array[File] mosdepth_region_bed_index = upstream.mosdepth_region_bed_index
 Array[File] mosdepth_depth_distribution_plot = upstream.mosdepth_depth_distribution_plot
-Array[String] stat_mean_depth = upstream.stat_mean_depth
+Array[String] stat_depth_mean = upstream.stat_depth_mean
 Array[String] inferred_sex = upstream.inferred_sex

 # phasing stats

workflows/singleton.wdl

Lines changed: 21 additions & 19 deletions
@@ -186,16 +186,17 @@ workflow humanwgs_singleton {

 Map[String, Array[String]] stats = {
 'sample_id': [sample_id],
-'num_reads': [downstream.stat_num_reads],
+'read_count': [downstream.stat_read_count],
 'read_length_mean': [downstream.stat_read_length_mean],
 'read_length_median': [downstream.stat_read_length_median],
 'read_length_n50': [downstream.stat_read_length_n50],
 'read_quality_mean': [downstream.stat_read_quality_mean],
 'read_quality_median': [downstream.stat_read_quality_median],
 'mapped_read_count': [downstream.stat_mapped_read_count],
-'mapped_percent': [downstream.stat_mapped_percent],
-'mean_gap_compressed_identity': [downstream.stat_mean_gap_compressed_identity],
-'mean_depth': [upstream.stat_mean_depth],
+'mapped_read_percent': [downstream.stat_mapped_read_percent],
+'gap_compressed_identity_mean': [downstream.stat_gap_compressed_identity_mean],
+'gap_compressed_identity_median': [downstream.stat_gap_compressed_identity_median],
+'depth_mean': [upstream.stat_depth_mean],
 'inferred_sex': [upstream.inferred_sex],
 'stat_phased_basepairs': [downstream.stat_phased_basepairs],
 'phase_block_ng50': [downstream.stat_phase_block_ng50],
@@ -233,20 +234,21 @@ workflow humanwgs_singleton {
 File msg_file = consolidate_stats.messages

 # bam stats
-File bam_statistics = downstream.bam_statistics
-File read_length_plot = downstream.read_length_plot
-File? read_quality_plot = downstream.read_quality_plot
-File mapq_distribution_plot = downstream.mapq_distribution_plot
-File mg_distribution_plot = downstream.mg_distribution_plot
-String stat_num_reads = downstream.stat_num_reads
-String stat_read_length_mean = downstream.stat_read_length_mean
-String stat_read_length_median = downstream.stat_read_length_median
-String stat_read_length_n50 = downstream.stat_read_length_n50
-String stat_read_quality_mean = downstream.stat_read_quality_mean
-String stat_read_quality_median = downstream.stat_read_quality_median
-String stat_mapped_read_count = downstream.stat_mapped_read_count
-String stat_mapped_percent = downstream.stat_mapped_percent
-String stat_mean_gap_compressed_identity = downstream.stat_mean_gap_compressed_identity
+File bam_statistics = downstream.bam_statistics
+File read_length_plot = downstream.read_length_plot
+File? read_quality_plot = downstream.read_quality_plot
+File mapq_distribution_plot = downstream.mapq_distribution_plot
+File mg_distribution_plot = downstream.mg_distribution_plot
+String stat_read_count = downstream.stat_read_count
+String stat_read_length_mean = downstream.stat_read_length_mean
+String stat_read_length_median = downstream.stat_read_length_median
+String stat_read_length_n50 = downstream.stat_read_length_n50
+String stat_read_quality_mean = downstream.stat_read_quality_mean
+String stat_read_quality_median = downstream.stat_read_quality_median
+String stat_mapped_read_count = downstream.stat_mapped_read_count
+String stat_mapped_read_percent = downstream.stat_mapped_read_percent
+String stat_gap_compressed_identity_mean = downstream.stat_gap_compressed_identity_mean
+String stat_gap_compressed_identity_median = downstream.stat_gap_compressed_identity_median

 # merged, haplotagged alignments
 File merged_haplotagged_bam = downstream.merged_haplotagged_bam
@@ -257,7 +259,7 @@ workflow humanwgs_singleton {
 File mosdepth_region_bed = upstream.mosdepth_region_bed
 File mosdepth_region_bed_index = upstream.mosdepth_region_bed_index
 File mosdepth_depth_distribution_plot = upstream.mosdepth_depth_distribution_plot
-String stat_mean_depth = upstream.stat_mean_depth
+String stat_depth_mean = upstream.stat_depth_mean
 String inferred_sex = upstream.inferred_sex

 # phasing stats

workflows/upstream/upstream.wdl

Lines changed: 1 addition & 1 deletion
@@ -272,7 +272,7 @@ workflow upstream {
 File mosdepth_region_bed_index = mosdepth.region_bed_index
 File mosdepth_depth_distribution_plot = mosdepth.depth_distribution_plot
 String inferred_sex = mosdepth.inferred_sex
-String stat_mean_depth = mosdepth.stat_mean_depth
+String stat_depth_mean = mosdepth.stat_depth_mean

 # per sample sv signatures
 File discover_tar = sawfish_discover.discover_tar
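
Anything that reads these workflow outputs by name will see different keys after this commit. The following is a minimal Python sketch of one way to translate pre-commit output names to the new ones. It assumes the outputs are available as a flat dict whose keys may carry a workflow prefix such as `humanwgs_singleton.`; the helper name `migrate_stat_names` and that dict shape are hypothetical and not part of this repository. Only the four renames visible in the diff are mapped. `stat_gap_compressed_identity_median` is new and has no predecessor, and the un-prefixed keys of the `stats` map (for example `num_reads` to `read_count`) are not handled here.

```python
# Old -> new workflow output names, taken from the renames in this commit.
RENAMED_STATS = {
    "stat_num_reads": "stat_read_count",
    "stat_mapped_percent": "stat_mapped_read_percent",
    "stat_mean_gap_compressed_identity": "stat_gap_compressed_identity_mean",
    "stat_mean_depth": "stat_depth_mean",
}


def migrate_stat_names(outputs: dict) -> dict:
    """Return a copy of `outputs` with pre-commit stat names replaced.

    Keys may be bare ("stat_num_reads") or workflow-qualified
    ("humanwgs_singleton.stat_num_reads"); only the part after the
    last dot is looked up, and unknown names pass through unchanged.
    """
    migrated = {}
    for key, value in outputs.items():
        prefix, dot, name = key.rpartition(".")
        migrated[prefix + dot + RENAMED_STATS.get(name, name)] = value
    return migrated
```

Keeping the mapping in a single table makes it easy to check against the diff, and names that were already in the new style (such as `stat_read_length_mean`) are left untouched.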
