Simulated data and real data workflows have diverged too far and that affects testing #114

@tinghf alerted me that this block of code breaks on the simulated data because `sample` isn't a valid column there.

https://github.com/seattleflu/incidence-mapper/blob/97ad7e23fc0d2b6a7db760fa3e29f1996e782721/dbViewR/R/selectFromDB.R#L138-L153

The short-term fix is to wrap this block in `if(source == 'production')`, as in:
```
if(source == 'production'){
  # filter out nested PCR targets to retain high-level target only
  # (skipped for simulated data, which has no `sample` column)

  # Flu A
  keepTargetList <- unique(db$sample[db$pathogen %in% c("Flu_A_H1","Flu_A_H3")])
  dropTargetList <- unique(db$sample[db$pathogen %in% c("Flu_A_pan")])

  # samples that have both a pan result and a subtype result
  dropSampleList <- intersect(dropTargetList,keepTargetList)

  db <- db %>% filter( !(sample %in% dropSampleList & pathogen %in% c("Flu_A_pan")))

  # enterovirus
  keepTargetList <- unique(db$sample[db$pathogen %in% c("EV_D68")])
  dropTargetList <- unique(db$sample[db$pathogen %in% c("EV_pan")])

  dropSampleList <- intersect(dropTargetList,keepTargetList)

  db <- db %>% filter( !(sample %in% dropSampleList & pathogen %in% c("EV_pan")))
}
```
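
To make the filtering step easier to test against both data sources, here is a minimal sketch of the same logic as a standalone function, run on toy data. It is not code from dbViewR: `dropNestedTargets` and the toy data frame are hypothetical, it uses base R instead of dplyr so it runs without extra packages, and it guards on whether the `sample` column exists rather than on `source`, which is a different choice from the fix above.

```r
# Hypothetical helper (not part of dbViewR): drop a pan-target row when the
# same sample also has a result for one of the more specific targets.
dropNestedTargets <- function(db, panTarget, subTargets) {
  # Assumption: data without `sample`/`pathogen` columns (e.g. the current
  # simulated data) is returned unchanged instead of raising an error.
  if (!all(c("sample", "pathogen") %in% names(db))) {
    return(db)
  }
  keepTargetList <- unique(db$sample[db$pathogen %in% subTargets])
  dropTargetList <- unique(db$sample[db$pathogen %in% panTarget])
  dropSampleList <- intersect(dropTargetList, keepTargetList)
  db[!(db$sample %in% dropSampleList & db$pathogen %in% panTarget), , drop = FALSE]
}

# Toy data: s1 has Flu_A_pan and Flu_A_H1, s2 has only Flu_A_pan,
# s3 has EV_pan and EV_D68.
toy <- data.frame(
  sample   = c("s1", "s1", "s2", "s3", "s3"),
  pathogen = c("Flu_A_pan", "Flu_A_H1", "Flu_A_pan", "EV_pan", "EV_D68"),
  stringsAsFactors = FALSE
)

filtered <- dropNestedTargets(toy, "Flu_A_pan", c("Flu_A_H1", "Flu_A_H3"))
filtered <- dropNestedTargets(filtered, "EV_pan", "EV_D68")
print(filtered)

# A data frame without a `sample` column passes through untouched.
noSample <- data.frame(pathogen = c("Flu_A_pan", "Flu_A_H1"),
                       stringsAsFactors = FALSE)
unchanged <- dropNestedTargets(noSample, "Flu_A_pan", c("Flu_A_H1", "Flu_A_H3"))
```

On the toy data, the pan rows for s1 and s3 are dropped and the lone `Flu_A_pan` row for s2 is kept, since s2 has no subtype result.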

Long term, we should keep the simulated data synchronized with the necessary test cases. You can see the workflow pattern for doing that in the commits to the simulated-data repo: https://github.com/seattleflu/simulated-data/commits/master
- introduce a script that makes a specific format change to the data without breaking other columns (unless breaking them is on purpose!)
- change the data
- commit both together, explaining the change
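
As a rough illustration of the first two steps, a format-change script could look like the sketch below. Everything specific here is an assumption: the function name `addSampleColumn`, the `sim_0001`-style IDs, the toy columns, and the temp file standing in for a file in the simulated-data repo. The point is only the pattern of making one change and checking that the existing columns are untouched.

```r
# Hypothetical format change: add a `sample` column to simulated data.
addSampleColumn <- function(simulated) {
  updated <- simulated
  # Assumed ID scheme, for illustration only.
  updated$sample <- sprintf("sim_%04d", seq_len(nrow(simulated)))
  # Guard: every pre-existing column must be unchanged by this script.
  stopifnot(isTRUE(all.equal(updated[names(simulated)], simulated)))
  updated
}

# A temp file stands in for a data file in the simulated-data repo.
path <- tempfile(fileext = ".csv")
write.csv(
  data.frame(pathogen = c("Flu_A_pan", "EV_D68"), age = c(4, 31)),
  path,
  row.names = FALSE
)

before <- read.csv(path, stringsAsFactors = FALSE)
after  <- addSampleColumn(before)
write.csv(after, path, row.names = FALSE)

reloaded <- read.csv(path, stringsAsFactors = FALSE)
```

The script and the regenerated file would then go into a single commit, so the data change can always be traced back to the code that produced it.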
