@tinghf alerted me that this block of code breaks on the simulated data because `sample` isn't a valid column:

    # filter out nested PCR targets to retain high-level target only
    # Flu A
    keepTargetList <- unique(db$sample[db$pathogen %in% c("Flu_A_H1","Flu_A_H3")])
    dropTargetList <- unique(db$sample[db$pathogen %in% c("Flu_A_pan")])

    dropSampleList <- intersect(dropTargetList,keepTargetList)

    db <- db %>% filter( !(sample %in% dropSampleList & db$pathogen %in% c("Flu_A_pan")))

    # enterovirus
    keepTargetList <- unique(db$sample[db$pathogen %in% c("EV_D68")])
    dropTargetList <- unique(db$sample[db$pathogen %in% c("EV_pan")])

    dropSampleList <- intersect(dropTargetList,keepTargetList)

    db <- db %>% filter( !(sample %in% dropSampleList & db$pathogen %in% c("EV_pan")))

The short-term fix is to wrap this block in an `if(source == 'production')` check, as in:

    if(source == 'production'){
      # filter out nested PCR targets to retain high-level target only
      # Flu A
      keepTargetList <- unique(db$sample[db$pathogen %in% c("Flu_A_H1","Flu_A_H3")])
      dropTargetList <- unique(db$sample[db$pathogen %in% c("Flu_A_pan")])
      dropSampleList <- intersect(dropTargetList,keepTargetList)
      db <- db %>% filter( !(sample %in% dropSampleList & db$pathogen %in% c("Flu_A_pan")))

      # enterovirus
      keepTargetList <- unique(db$sample[db$pathogen %in% c("EV_D68")])
      dropTargetList <- unique(db$sample[db$pathogen %in% c("EV_pan")])
      dropSampleList <- intersect(dropTargetList,keepTargetList)
      db <- db %>% filter( !(sample %in% dropSampleList & db$pathogen %in% c("EV_pan")))
    }

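
As a possible alternative to keying on `source`, the guard could check for the columns the filter actually needs, and the two near-identical blocks could share one helper. The sketch below is only an illustration: `dropNestedTarget` is a hypothetical name that does not exist in dbViewR, the toy data frame is invented, and it uses base R subsetting rather than `dplyr::filter` so that it runs on its own.

```r
# Hypothetical helper (not part of dbViewR): drop a pan-level target for any
# sample that also has one of the more specific nested targets.
dropNestedTarget <- function(db, panTarget, specificTargets) {
  # Guard on the columns the filter needs, so data without `sample`
  # (such as the current simulated data) passes through unchanged.
  if (!all(c("sample", "pathogen") %in% names(db))) {
    return(db)
  }

  keepTargetList <- unique(db$sample[db$pathogen %in% specificTargets])
  dropTargetList <- unique(db$sample[db$pathogen %in% panTarget])
  dropSampleList <- intersect(dropTargetList, keepTargetList)

  # Keep every row except pan-level rows for samples with a specific target
  db[!(db$sample %in% dropSampleList & db$pathogen %in% panTarget), , drop = FALSE]
}

# Toy data, invented for illustration only
db <- data.frame(
  sample   = c("s1", "s1", "s2", "s3", "s3"),
  pathogen = c("Flu_A_pan", "Flu_A_H1", "Flu_A_pan", "EV_pan", "EV_D68"),
  stringsAsFactors = FALSE
)

db <- dropNestedTarget(db, "Flu_A_pan", c("Flu_A_H1", "Flu_A_H3"))
db <- dropNestedTarget(db, "EV_pan", "EV_D68")
```

With this toy input, the `Flu_A_pan` row for `s1` and the `EV_pan` row for `s3` are dropped, while the `Flu_A_pan` row for `s2` stays because `s2` has no subtype result. Whether a column check is preferable to the `source` check depends on whether the simulated data is ever expected to carry a `sample` column.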
Long term, we should keep the simulated data synchronized with the necessary test cases. You can see the workflow pattern to do that in commits to the simulated-data repo: https://github.com/seattleflu/simulated-data/commits/master.
- introduce a script that makes a specific format change to the data without breaking other columns (unless the breakage is intentional!)
- run the script to change the data
- commit the script and the changed data together, with a message explaining the change.
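
The first step of that pattern could look roughly like the sketch below, which adds a `sample` column and checks that the pre-existing columns are untouched. Everything specific here is an assumption made for illustration: the function name `addSampleColumn`, the `sim_` ID scheme, and the toy columns are not taken from the simulated-data repo, and a real script would read and write the actual data files.

```r
# Hypothetical format change: add a `sample` column to simulated data.
# Column names and the ID scheme below are assumptions for illustration.
addSampleColumn <- function(simulated) {
  original <- simulated

  # One synthetic sample ID per row; real data may share IDs across rows
  simulated$sample <- sprintf("sim_%05d", seq_len(nrow(simulated)))

  # Verify that none of the pre-existing columns were altered
  stopifnot(isTRUE(all.equal(simulated[names(original)], original)))

  simulated
}

# Toy stand-in for the simulated data, invented for illustration only
simulated <- data.frame(
  pathogen = c("Flu_A_pan", "Flu_A_H1", "EV_pan"),
  encountered_date = c("2019-01-02", "2019-01-02", "2019-01-05"),
  stringsAsFactors = FALSE
)

updated <- addSampleColumn(simulated)
```

Giving every row its own ID means the nested-target filter would never find a pan-level and a specific target on the same sample, so a useful test case would need some rows to share IDs deliberately.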

The affected block is incidence-mapper/dbViewR/R/selectFromDB.R, lines 138 to 153 in 97ad7e2.