Current Behavior
When using naniar's special missing values (via bind_shadow() and recode_shadow()), the missing value types are not propagated through dplyr operations like mutate(). This means we lose information about why a result is NA when performing calculations.
I am aware that this is probably outside the scope of naniar, but do you have a suggested approach here?
Example
Here's a minimal reproducible example in which the result carries no information about why it is NA.
library(naniar)
library(tidyverse)

tbl <- tibble(val1 = c('12', '*', 'x'), val2 = c('1', '2', 'x'))
cols <- c('val1', 'val2')

tbl_shadow <- tbl %>%
  # add the shadow columns val1_NA and val2_NA
  bind_shadow() %>%
  # record which special code each value carried
  recode_shadow(val1 = .where(val1 == '*' ~ "*", val1 == 'x' ~ 'x')) %>%
  recode_shadow(val2 = .where(val2 == '*' ~ "*", val2 == 'x' ~ 'x')) %>%
  # replace the special codes with NA, then convert to numeric
  replace_with_na(replace = setNames(lapply(cols, function(x) c('*', 'x')), cols)) %>%
  mutate(across(all_of(cols), as.numeric))

# the shadow information is not carried over to `result`
tbl_shadow %>%
  mutate(result = val1 / val2)
#> # A tibble: 3 × 5
#>    val1  val2 val1_NA val2_NA result
#>   <dbl> <dbl> <fct>   <fct>    <dbl>
#> 1    12     1 !NA     !NA         12
#> 2    NA     2 NA_*    !NA         NA
#> 3    NA    NA NA_x    NA_x        NA
Created on 2024-12-16 with reprex v2.1.1
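To make the desired behavior concrete, here is a rough sketch of propagating the reason by hand with plain dplyr. It is not a naniar feature. The result_NA column name and the rule "take the reason from val1 first, then from val2" are assumptions made purely for illustration, and tbl_shadow is rebuilt here as an ordinary tibble with plain factor columns (a stand-in for the real shadow columns) so the snippet runs on its own.

```r
library(dplyr)

# Hand-built stand-in for tbl_shadow from the reprex (assumption: plain
# factors are used here instead of naniar's shadow columns)
shadow_levels <- c("!NA", "NA", "NA_*", "NA_x")
tbl_shadow <- tibble(
  val1    = c(12, NA, NA),
  val2    = c(1, 2, NA),
  val1_NA = factor(c("!NA", "NA_*", "NA_x"), levels = shadow_levels),
  val2_NA = factor(c("!NA", "!NA", "NA_x"), levels = shadow_levels)
)

res <- tbl_shadow %>%
  mutate(
    result = val1 / val2,
    # hypothetical result_NA column: copy the first non-"!NA" reason
    # found among the inputs (val1 checked before val2)
    result_NA = case_when(
      !is.na(result)   ~ "!NA",
      val1_NA != "!NA" ~ as.character(val1_NA),
      val2_NA != "!NA" ~ as.character(val2_NA),
      TRUE             ~ "NA"
    )
  )

res
```

This works for a single calculation, but it has to be repeated for every derived column and it silently picks one reason when both inputs are missing for different reasons, which is why a more general approach would be helpful.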