An Unexpected Meeting
New Feature
- Add custom label support for missings and not missings with functions `add_label_missings()`, `add_label_shadow()`, and `add_any_miss()`. So you can now do `add_label_missings(data, missing = "custom_missing_label", complete = "custom_complete_label")`.
- `impute_median()` and scoped variants
- `any_shade()` returns a logical TRUE or FALSE depending on if there are any `shade` values
- `nabular()`, an alias for `bind_shadow()`, to tie the `nabular` term into the work.
- `is_nabular()` checks if input is nabular.
- `geom_miss_point()` now gains the arguments from `shadow_shift()`/`impute_below()` for altering the amount of `jitter` and proportion below (`prop_below`).
- Added two new vignettes, "Exploring Imputed Values", and "Special Missing Values"
- `miss_var_summary` and `miss_case_summary` now no longer provide the cumulative sum of missingness in the summaries - this summary can be added back to the data with the option `add_cumsum = TRUE`. #186
- Added `gg_miss_upset` to replace workflow of: `data %>% as_shadow_upset() %>% UpSetR::upset()`
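
As a rough illustration of two of the features listed above, the sketch below uses the built-in `airquality` data. The label strings `"has_missing"` and `"all_present"` are made up for the example, and the output column names (`any_missing`, `n_miss_cumsum`) are what naniar is understood to produce, so treat them as assumptions if your version differs.

```r
library(naniar)

# Custom labels for rows with / without missing values.
# "has_missing" and "all_present" are illustrative strings, not defaults.
labelled <- add_label_missings(
  airquality,
  missing = "has_missing",
  complete = "all_present"
)

# The labels are assumed to land in a column named `any_missing`.
table(labelled$any_missing)

# The cumulative sum is no longer returned by default; opt back in.
var_summary <- miss_var_summary(airquality, add_cumsum = TRUE)
var_summary
```

Leaving `add_cumsum` at its default should return the same summary without the cumulative column.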
Major Change
- `recode_shadow` now works! This function allows you to recode your missing values into special missing values. These special missing values are stored in the shadow part of the dataframe, which ends in `_NA`.
- implemented `shade` where appropriate throughout naniar, and also added verifiers, `is_shade`, `are_shade`, `which_are_shade`, and removed `which_are_shadow`.
- `as_shadow` and `bind_shadow` now return data of class `shadow`. This will feed into `recode_shadow` methods for flexibly adding new types of missing data.
- Note that in the future `shadow` might be changed to `nabble` or something similar.
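
A minimal sketch of how `recode_shadow` might be used, assuming the `.where(condition ~ "suffix")` form described in the package documentation. The data frame, the sentinel value `-99`, and the suffix `"broken_sensor"` are all invented for illustration.

```r
library(naniar)

# Toy data: -99 in `wind` is a hypothetical code for a failed sensor.
df <- data.frame(
  wind = c(-99, 68, 72),
  temp = c(45, NA, 25)
)

# Bind the shadow columns (wind_NA, temp_NA) so there is somewhere
# to store the special missing value.
df_shadow <- bind_shadow(df)

# Record a special missing value in the shadow of `temp`
# wherever `wind` carries the sentinel code.
recoded <- recode_shadow(
  df_shadow,
  temp = .where(wind == -99 ~ "broken_sensor")
)

levels(recoded$temp_NA)
```

The original `temp` column is left alone; only its shadow column `temp_NA` is expected to gain the new level.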
Minor feature
- Functions `add_label_shadow()` and `add_label_missings()` gain arguments so you can only label according to the missingness / shadowy-ness of given variables.
- new function `which_are_shadow()`, to tell you which values are shadows.
- new function `long_shadow()`, which converts data in shadow/nabular form into a long format suitable for plotting. Related to #165
- Added tests for `miss_scan_count`
Minor Changes
- `gg_miss_upset` gets a better default presentation by ordering by the largest intersections, and also an improved error message when data with only 1 or no variables have missing values.
- `shadow_shift` gains a more informative error message when it doesn't know the class.
- Changed `common_na_string` to include escape characters for "?", "`*`", "." so that if they are used in replacement or searching functions they don't return the wildcard results from the characters "?", "`*`", and ".".
- `miss_case_table` and `miss_var_table` now have final column names `pct_vars` and `pct_cases` instead of `pct_miss` - fixes #178.
Breaking Changes
- Deprecated old names of the scalar missingness summaries, in favour of a more
consistent syntax #171. The old and new names are:
| old_names | new_names |
|---|---|
| `miss_case_pct` | `pct_miss_case` |
| `miss_case_prop` | `prop_miss_case` |
| `miss_var_pct` | `pct_miss_var` |
| `miss_var_prop` | `prop_miss_var` |
| `complete_case_pct` | `pct_complete_case` |
| `complete_case_prop` | `prop_complete_case` |
| `complete_var_pct` | `pct_complete_var` |
| `complete_var_prop` | `prop_complete_var` |
These old names will be made defunct in 0.5.0, and removed completely in 0.6.0.
- `impute_below` has changed to be an alias of `shadow_shift` - that is it operates on a single vector. `impute_below_all` operates on all columns in a dataframe (as specified in #159)
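
The renamed summaries and the vector / data frame split for `impute_below` could be exercised roughly as follows. This is a sketch on the built-in `airquality` data, not a statement of exact output.

```r
library(naniar)

# New-style names for the scalar missingness summaries.
pct_cases  <- pct_miss_case(airquality)   # percentage of rows with any missing value
prop_cases <- prop_miss_case(airquality)  # the same quantity as a proportion

# impute_below() works on a single vector: missing values are
# shifted below the observed range.
ozone_imputed <- impute_below(airquality$Ozone)

# impute_below_all() applies the same idea to every column.
airquality_imputed <- impute_below_all(airquality)
```

Replacing missing values with numbers below the observed range keeps them visible in plots, which is the idea behind `geom_miss_point()`.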
Bug fix
- Ensured that `miss_scan_count` actually `return`'d something.
- `gg_miss_var(airquality)` now prints the ggplot - a typo meant that this did not print the plot