Commit 46f3aeb

PatrickTCorbett, github-actions[bot], pre-commit-ci[bot], micahwiesner67, and natemcintosh authored
344 create dynamic low case count thresholds (#353)
* add command line and config options for low case count thresholds
* incorporate thresholds into diagnostic file
* precommit
* Update R/pipeline.R Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
* Update R/diagnostics.R Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
* Update R/diagnostics.R Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
* Update R/diagnostics.R Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
* change to use thresholds from constants file
* pre-commit
* edit low case count tests
* add aconfigs to the config.R file
* test Config.Rd
* add .Rd items
* edit to roxygen documentation
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* precommit
* edit test config files
* edit to test-diagnostics.R
* account for tests in diagnostics.R
* precommit tests
* tests edit
* triage test errors
* double brackets to resolve error
* add low case count threshold to metadata
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* edit tests
* spacing
* undo dockerfile test
* updating documentation
* Update R/find_low_count_threshold.R Co-authored-by: Nate McIntosh <NMcIntosh@cdc.gov>
* Update R/diagnostics.R Co-authored-by: Nate McIntosh <NMcIntosh@cdc.gov>
* Update R/diagnostics.R Co-authored-by: Nate McIntosh <NMcIntosh@cdc.gov>
* Update R/diagnostics.R Co-authored-by: Nate McIntosh <NMcIntosh@cdc.gov>
* fix specifying disease
* roxygen edits
* additional document edits
* precommit
* additional roxygen edit
* Update R/diagnostics.R Co-authored-by: Katie Gostic <uep6@cdc.gov>
* add example for creating thresholds value in R + Roxygen edits
* roxygen

---------

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Micah Wiesner <micahwiesner67@gmail.com>
Co-authored-by: Micah Wiesner <33739832+micahwiesner67@users.noreply.github.com>
Co-authored-by: Nate McIntosh <NMcIntosh@cdc.gov>
Co-authored-by: Katie Gostic <uep6@cdc.gov>
1 parent 69f62ab commit 46f3aeb

16 files changed

Lines changed: 155 additions & 21 deletions

NAMESPACE

Lines changed: 1 addition & 0 deletions
@@ -17,6 +17,7 @@ export(format_generation_interval)
 export(format_right_truncation)
 export(format_stan_opts)
 export(low_case_count_diagnostic)
+export(low_case_count_threshold)
 export(orchestrate_pipeline)
 export(process_quantiles)
 export(process_samples)

NEWS.md

Lines changed: 2 additions & 0 deletions
@@ -1,6 +1,8 @@
 # CFAEpiNow2Pipeline v0.2.0
 
 ## Features
+
+* add pathogen-specific low case count threshold options
 * Updating make commands to be called from docker image locally
 * Updating pre-commit to block specific test.parquet file
 * Scheduling `make run-prod` on Github Actions Wednesdays at 8 AM ET

R/config.R

Lines changed: 4 additions & 0 deletions
@@ -132,6 +132,9 @@ Data <- S7::new_class(
 #' Formatted as "YYYY-MM-DD".
 #' @param disease A string specifying the disease being modeled. One of
 #' `"COVID-19"` or `"Influenza"` or `"RSV"`.
+#' @param low_case_count_thresholds A named list of thresholds to use for
+#' determining n_low_case_count in the diagnostic file.
+#' Example: list(`COVID-19` = 10, `Influenza` = 10, `RSV` = 5)
 #' @param geo_value An uppercase, two-character string specifying the geographic
 #' value, usually a state or `"US"` for national data.
 #' @param geo_type A string specifying the geographic type, usually "state".
@@ -168,6 +171,7 @@ Config <- S7::new_class(
     report_date = S7::class_character,
     production_date = S7::class_character,
     disease = S7::class_character,
+    low_case_count_thresholds = S7::class_list,
     geo_value = S7::class_character,
     geo_type = S7::class_character,
     seed = S7::class_integer,

R/diagnostics.R

Lines changed: 25 additions & 8 deletions
@@ -9,6 +9,12 @@
 #'
 #' @param fit The model fit object from `EpiNow2`
 #' @param data A data frame containing the input data used in the model fit.
+#' @param low_count_threshold an integer that determines the cutoff for
+#' the low_case_count flag. If the jurisdiction has fewer than
+#' X DDI counts for the respective pathogen, it will be considered
+#' as having too few cases and later on in post-processing the
+#' Rt estimate and growth category will be edited to NA and
+#' "Not Estimated", respectively, in release
 #' @inheritParams Config
 #'
 #' @return A \code{data.frame} containing the extracted diagnostic metrics. The
@@ -18,7 +24,9 @@
 #' \item \code{value}: The value of the diagnostic metric.
 #' \item \code{job_id}: The unique identifier for the job.
 #' \item \code{task_id}: The unique identifier for the task.
-#' \item \code{disease,geo_value,model}: Metadata for downstream processing.
+#' \item \code{disease,geo_value,model}: Metadata
+#' for downstream processing.
+
 #' }
 #'
 #' @details
@@ -49,13 +57,14 @@
 extract_diagnostics <- function(
   fit,
   data,
+  low_count_threshold,
   job_id,
   task_id,
   disease,
   geo_value,
   model
 ) {
-  low_case_count <- low_case_count_diagnostic(data)
+  low_case_count <- low_case_count_diagnostic(data, low_count_threshold)
 
   epinow2_diagnostics <- rstan::get_sampler_params(
     fit$estimates$fit,
@@ -127,22 +136,30 @@ extract_diagnostics <- function(
 
 #' Calculate low case count diagnostic flag
 #'
-#' The diagnostic flag is TRUE if either of the _last_ two weeks of the dataset
-#' have fewer than an aggregate 10 cases per week. This aggregation excludes the
-#' count from confirmed outliers, which have been set to NA in the data.
+#' The diagnostic flag is TRUE if either of the _last_ two weeks
+#' of the dataset have fewer than an aggregate X cases per week.
+#' See the low_count_threshold parameter for what the value
+#' of X is. This aggregation excludes the count from confirmed
+#' outliers, which have been set to NA in the data.
 #'
 #' This function assumes that the `df` input dataset has been
 #' "completed": that any implicit missingness has been made explicit.
 #'
 #' @param df A dataframe as returned by [read_data()]. The dataframe must
 #' include columns such as `reference_date` (a date vector) and `confirm`
 #' (the number of confirmed cases per day).
+#' @param low_count_threshold an integer that determines the cutoff for
+#' the low_case_count flag. If the jurisdiction has fewer than
+#' X ED visits for the respective pathogen, it will be considered
+#' as having too few cases and later on in post-processing the
+#' Rt estimate and growth category will be edited to NA and
+#' "Not Estimated", respectively
 #'
 #' @return A logical value (TRUE or FALSE) indicating whether either of the last
 #' two weeks in the dataset had fewer than 10 cases per week.
 #' @family diagnostics
 #' @export
-low_case_count_diagnostic <- function(df) {
+low_case_count_diagnostic <- function(df, low_count_threshold) {
   cli::cli_alert_info("Calculating low case count diagnostic")
   # Get the dates in the last and second-to-last weeks
   last_date <- as.Date(max(df[["reference_date"]], na.rm = TRUE))
@@ -190,7 +207,7 @@ low_case_count_diagnostic <- function(df) {
   ))
 
   any(
-    ultimate_week_count < 10,
-    penultimate_week_count < 10
+    ultimate_week_count < low_count_threshold,
+    penultimate_week_count < low_count_threshold
   )
 }
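
The hunks above replace the hard-coded cutoff of 10 with the `low_count_threshold` argument. The diff shows only part of the function body, so the following is a simplified, hypothetical stand-in (`low_case_count_sketch` is not a package function, and its week boundaries are an assumption) that illustrates the comparison: weekly totals are summed with NA outliers dropped, and the flag is TRUE if either week falls below the threshold.

```r
# Hypothetical sketch, NOT the package implementation: the real date
# handling in low_case_count_diagnostic() is only partly visible in the diff.
low_case_count_sketch <- function(df, low_count_threshold) {
  last_date <- as.Date(max(df[["reference_date"]], na.rm = TRUE))

  # Assumption: "last week" is the 7 days ending on last_date, and the
  # "prior week" is the 7 days before that.
  in_last_week <- df[["reference_date"]] > last_date - 7
  in_prior_week <- df[["reference_date"]] > last_date - 14 & !in_last_week

  # na.rm = TRUE excludes confirmed outliers, which are stored as NA.
  ultimate_week_count <- sum(df[["confirm"]][in_last_week], na.rm = TRUE)
  penultimate_week_count <- sum(df[["confirm"]][in_prior_week], na.rm = TRUE)

  any(
    ultimate_week_count < low_count_threshold,
    penultimate_week_count < low_count_threshold
  )
}

# Made-up data: 14 counts in the prior week, 6 in the last week
# (one day is an NA outlier and is ignored).
df <- data.frame(
  reference_date = seq(as.Date("2025-01-01"), by = "day", length.out = 14),
  confirm = c(rep(2, 7), 1, NA, 1, 1, 1, 1, 1)
)

flag_at_10 <- low_case_count_sketch(df, 10) # last week (6) is below 10
flag_at_5 <- low_case_count_sketch(df, 5)   # neither week is below 5
```

With these made-up counts the same data is flagged at a threshold of 10 but not at 5, which is the behavior the pathogen-specific thresholds are meant to allow.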

R/find_low_count_threshold.R

Lines changed: 17 additions & 0 deletions
@@ -0,0 +1,17 @@
+#' Determine Low Case Count Threshold Based on Pathogen
+#'
+#' @inheritParams Config
+#'
+#' @return low_count_threshold An integer that reflects the value X where
+#' a count of ED visits < X in either the past week or the week prior
+#' results in an n_low_case_count flag for that pathogen-state pair
+#' @family diagnostics
+#' @export
+low_case_count_threshold <- function(low_case_count_thresholds, disease) {
+  if (disease == "test") {
+    low_count_threshold <- 10
+  } else {
+    low_count_threshold <- low_case_count_thresholds[[disease]]
+  }
+  return(low_count_threshold)
+}
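
As a usage sketch, the helper can be exercised on its own with a named list. The function body below is copied from the diff so the example is self-contained; the threshold values are the illustrative ones from the roxygen example in R/config.R, and `"Measles"` is a made-up key used only to show what happens for a disease that is not in the list.

```r
# Copy of the helper added in this commit, so the example runs standalone.
low_case_count_threshold <- function(low_case_count_thresholds, disease) {
  if (disease == "test") {
    low_count_threshold <- 10
  } else {
    low_count_threshold <- low_case_count_thresholds[[disease]]
  }
  return(low_count_threshold)
}

# Illustrative values only (from the roxygen example in R/config.R).
thresholds <- list(`COVID-19` = 10, `Influenza` = 10, `RSV` = 5)

rsv_threshold <- low_case_count_threshold(thresholds, "RSV")
test_threshold <- low_case_count_threshold(thresholds, "test")

# "Measles" is a hypothetical key that is absent from the list. In base R,
# `[[` on a list returns NULL for a missing name rather than raising an error.
missing_threshold <- low_case_count_threshold(thresholds, "Measles")
```

Because a missing name yields `NULL`, a caller that wants a hard failure for an unconfigured disease would need its own check; nothing in the diff shown here adds one.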

R/pipeline.R

Lines changed: 8 additions & 0 deletions
@@ -241,9 +241,16 @@ execute_model_logic <- function(config, input_dir, output_dir) {
     priors = config@priors,
     sampler_opts = config@sampler_opts
   )
+
+  low_count_threshold <- low_case_count_threshold(
+    disease = config@disease,
+    low_case_count_thresholds = config@low_case_count_thresholds
+  )
+
   diagnostics <- extract_diagnostics(
     fit = fit,
     data = cases_df,
+    low_count_threshold = low_count_threshold,
     job_id = config@job_id,
     task_id = config@task_id,
     disease = config@disease,
disease = config@disease,
@@ -271,6 +278,7 @@ execute_model_logic <- function(config, input_dir, output_dir) {
271278
data_path = empty_str_if_non_existent(config@data@path),
272279
model = config@model,
273280
disease = config@disease,
281+
low_case_count_threshold = low_count_threshold,
274282
geo_value = config@geo_value,
275283
report_date = config@report_date,
276284
production_date = config@production_date,

man/Config.Rd

Lines changed: 5 additions & 0 deletions
Generated file; diff not rendered by default.

man/extract_diagnostics.Rd

Lines changed: 21 additions & 3 deletions
Generated file; diff not rendered by default.

man/low_case_count_diagnostic.Rd

Lines changed: 15 additions & 5 deletions
Generated file; diff not rendered by default.

man/low_case_count_threshold.Rd

Lines changed: 30 additions & 0 deletions
Generated file; diff not rendered by default.