Releases: njtierney/naniar
Digory and his Uncle Are Both in Trouble
New Features
- Added `all_miss()` / `all_na()`, equivalent to `all(is.na(x))`.
- Added `any_complete()`, equivalent to `all(complete.cases(x))`.
- Added `any_miss()`, equivalent to `anyNA(x)`.
- Added `common_na_numbers` and finalised `common_na_strings`, to provide a
  list of commonly used NA values (#168).
- Added `miss_var_which`, to list the variable names with missings.
- Added `as_shadow_upset`, which gets the data into a format suitable for
  plotting as an `UpSetR` plot: `airquality %>% as_shadow_upset() %>% UpSetR::upset()`
- Added some imputation functions to assist with exploring missingness
  structure and visualisation:
  - `impute_below` performs as for `shadow_shift`, but operates on all columns.
    This means that it imputes missing values 10% below the range of the
    data (powered by `shadow_shift`), to facilitate graphical exploration of
    the data. Closes #145.
    There are also scoped variants that work for specific named columns,
    `impute_below_at`, and for columns that satisfy some predicate function,
    `impute_below_if`.
  - `impute_mean` imputes the mean value, with scoped variants
    `impute_mean_at` and `impute_mean_if`.
- `impute_below` and `shadow_shift` gain arguments `prop_below` and `jitter`
  to control the degree of shift, and also the extent of jitter.
- Added `complete_{case/var}_{pct/prop}`, which complement
  `miss_{var/case}_{pct/prop}` (#150).
- Added `unbind_shadow` and `unbind_data` as helpers to remove shadow columns
  from data, and data from shadows, respectively.
- Added `is_shadow` and `are_shadow` to determine if something contains a
  shadow column. Similar to `rlang::is_na` and `rlang::are_na`, `is_shadow`
  returns a logical vector of length 1, and `are_shadow` returns a logical
  vector with one element per name of a data.frame. This might be
  revisited at a later point (see `any_shade` in `add_label_shadow`).
- Aesthetics now map as expected in `geom_miss_point()`. This means you can write
  things like `geom_miss_point(aes(colour = Month))` and it works appropriately.
  Fixed by Luke Smith in pull request #144, fixing #137.
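The `impute_below` and `shadow_shift` entries above describe placing missing values a fixed proportion below the observed range, with optional jitter. A minimal base-R sketch of that idea follows. `shift_below()` is a hypothetical helper written for illustration, not naniar's implementation; its default `prop_below = 0.1` mirrors the 10% figure quoted above, and the fallback for zero-spread data is an assumption.

```r
# Hypothetical helper (not part of naniar): replace NAs in a numeric
# vector with values placed `prop_below` of the range beneath the minimum.
shift_below <- function(x, prop_below = 0.1, jitter = 0) {
  stopifnot(is.numeric(x))
  observed <- x[!is.na(x)]
  if (length(observed) == 0) {
    return(x)  # nothing to anchor the shift on
  }
  x_range <- max(observed) - min(observed)
  # Assumption: fall back to a unit range when the data have no spread.
  if (x_range == 0) {
    x_range <- 1
  }
  base <- min(observed) - prop_below * x_range
  # Uniform noise centred on zero, scaled by the range; zero when jitter = 0.
  noise <- (stats::runif(length(x)) - 0.5) * x_range * jitter
  ifelse(is.na(x), base + noise, x)
}

ozone <- airquality$Ozone
shifted <- shift_below(ozone)
range(ozone, na.rm = TRUE)
min(shifted)
```

Because every imputed value sits below the observed minimum, the previously missing cases appear as a separate band when the result is plotted against another variable.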
Minor Changes
- `miss_var_summary` and `miss_case_summary` now use `order = TRUE` by
  default, so cases and variables with the most missings are presented in
  descending order. Fixes #163.
- Changes for visualisation:
  - Changed the default colours used in `gg_miss_case` and `gg_miss_var` to
    lorikeet purple (from the ochRe package: https://github.com/ropenscilabs/ochRe).
  - `gg_miss_case`:
    - The y axis label is now ...
    - Default presentation is with `order_cases = TRUE`.
    - Gains a `show_pct` option to be consistent with `gg_miss_var` (#153).
  - `gg_miss_which` is rotated 90 degrees so it is easier to read variable names.
  - `gg_miss_fct` uses a minimal theme and tilts the axis labels (#118).
- Imported `is_na` and `are_na` from `rlang`.
- Added `common_na_strings`, a list of common `NA` values (#168).
- Added some detail on alternative methods for replacing with NA in the
  vignette "replacing values with NA".
CRAN 0.1.0 Release "The Founding of naniar"
"The Founding of naniar" is the first version on CRAN! The name is taken from Chapter 9 of The Magician's Nephew. Below is the updated NEWS file.
naniar 0.1.0 (2017/08/09) "The Founding of naniar"
=========================
- This is the first release of `naniar` onto CRAN. Updates to `naniar` will happen reasonably regularly after this, approximately every 1-2 months.
naniar 0.0.9.9995 (2017/08/07)
=========================
Name change
- After careful consideration, I have changed back to `naniar`
Major Change
- three new functions: `miss_case_cumsum` / `miss_var_cumsum` / `replace_to_na`
- two new visualisations: `gg_var_cumsum` & `gg_case_cumsum`
New Feature
- `group_by` is now respected by the following functions:
  - `miss_case_cumsum()`
  - `miss_case_summary()`
  - `miss_case_table()`
  - `miss_prop_summary()`
  - `miss_var_cumsum()`
  - `miss_var_run()`
  - `miss_var_span()`
  - `miss_var_summary()`
  - `miss_var_table()`
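As a rough picture of what a grouped missingness summary computes, here is a base-R sketch on the built-in `airquality` data. `miss_by_group()` is a hypothetical helper for illustration only; it is not part of naniar and does not reproduce the output format of the functions listed above.

```r
# Count missing values per variable within each level of a grouping column.
# Base-R illustration only; this is not how naniar implements group support.
miss_by_group <- function(data, group) {
  vars <- setdiff(names(data), group)
  pieces <- split(data[vars], data[[group]])
  # vapply gives variables in rows and groups in columns; transpose so that
  # each row is a group and each column a variable.
  t(vapply(pieces, function(d) colSums(is.na(d)), numeric(length(vars))))
}

by_month <- miss_by_group(airquality, "Month")
by_month
```

Each row of `by_month` corresponds to one value of `Month`, so the per-group counts add up to the ungrouped totals for every variable.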
Minor changes
- Reviewed documentation for all functions and improved wording, grammar, and
  style.
- Converted roxygen to roxygen markdown
- updated vignettes and readme
- added a new vignette "naniar-visualisation", to give a quick overview of the visualisations provided with naniar.
- changed `label_missing*` to `label_miss` to be more consistent with the rest
  of naniar
- Add `pct` and `prop` helpers (#78)
- removed `miss_df_pct` - this was literally the same as `pct_miss` or `prop_miss`.
- break larger files into smaller, more manageable files (#83)
- `gg_miss_var` gets a `show_pct` argument to show the percentage of missing values (Thanks Jennifer for the helpful feedback! :))
Minor changes
- `miss_var_summary` & `miss_case_summary` now have consistent output (one was ordered by n_missing, not the other).
- prevent error in `miss_case_pct`
- `enquo_x` is now `x` (as advised by Hadley)
- Now has ByteCompile to TRUE
- add Colin to auth
narnia 0.0.9.9400 (2017/07/24)
=========================
new features
- `replace_to_na` is a complement to `tidyr::replace_na` and replaces a specified
  value from a variable to NA.
- `gg_miss_fct` returns a heatmap of the number of missings per variable for
  each level of a factor. This feature was very kindly contributed by
  Colin Fay.
- `gg_miss_` functions now return a ggplot object, which behaves as such.
  `gg_miss_` basic themes can be overridden with ggplot functions. This fix
  was very kindly contributed by Colin Fay.
- removed defunct functions as per #63
- made `add_*` functions handle bare unquoted names where appropriate as per #61
- added tests for the `add_*` family
- got the svgs generated from vdiffr, thanks @karawoo!
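The `replace_to_na` entry above describes the reverse of `tidyr::replace_na`: instead of filling `NA` with a value, a sentinel value is turned into `NA`. The base-R sketch below shows that operation. `replace_value_with_na()` is a hypothetical helper, and its arguments are assumptions made for illustration rather than the signature of `replace_to_na`.

```r
# Hypothetical helper (not naniar's replace_to_na): set every occurrence of
# `value` in one column to NA and return the modified data frame.
replace_value_with_na <- function(data, column, value) {
  # Guard against existing NAs so the logical index never contains NA.
  hit <- !is.na(data[[column]]) & data[[column]] == value
  data[[column]][hit] <- NA
  data
}

df <- data.frame(x = c(1, -99, 3), y = c("a", "b", "N/A"),
                 stringsAsFactors = FALSE)
df <- replace_value_with_na(df, "x", -99)
df <- replace_value_with_na(df, "y", "N/A")
df
```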
breaking changes
- changed `geom_missing_point()` to `geom_miss_point()`, to keep consistent with the rest of the functions in `naniar`.
narnia 0.0.8.9100 (2017/06/23)
=========================
new features
- updated datasets `brfss` and `tao` as per #59
narnia 0.0.7.9992 (2017/06/22)
=========================
new features
- `add_label_missings()`
- `add_label_shadow()`
- `cast_shadow()`
- `cast_shadow_shift()`
- `cast_shadow_shift_label()`
- added github issue / contribution / pull request guides
- `ts` generic functions are now `miss_var_span` and `miss_var_run`, and `gg_miss_span`, and work on `data.frame`s, as opposed to just `ts` objects.
- `add_shadow_shift()` adds a column of shadow_shifted values to the current dataframe, adding "_shift" as a suffix
- `cast_shadow()` - acts like `bind_shadow()` but allows for specifying which columns to add
- `shadow_shift` now has a method for factors - powered by `forcats::fct_explicit_na()` #3
bug fixes
- shadow_shift.numeric works when there is no variance (#37)
name changes
- changed `is_na` function to `label_na`
- renamed most files to have `tidy-miss-[topic]`
- `gg_missing_*` is changed to `gg_miss_*` to fit with other syntax
Removed functions
- Removed old functions `miss_cat`, `shadow_df` and `shadow_cat`, as they are no longer needed, and have been superseded by `label_missing_2d`, `as_shadow`, and `is_na`.
minor changes
- drastically reduced the size of the pedestrian dataset, consider 4 sensor locations, just for 2016.
New features
- New dataset, `pedestrian` - contains hourly counts of pedestrians
- First pass at time series missing data summaries and plots:
  - `miss_ts_run()`: return the number of missings / complete in a single run
  - `miss_ts_summary()`: return the number of missings in a given time period
  - `gg_miss_ts()`: plot the number of missings in a given time period
Name changes
- renamed package from `naniar` to `narnia` - I had to explain the spelling a few times when I was introducing the package and I realised that I should change the name. Fortunately it isn't on CRAN yet.
naniar 0.0.6.9100 (2017/03/21)
=========================
- Added `prop_miss` and the complement `prop_complete`. Where `n_miss` returns the number of missing values, `prop_miss` returns the proportion of missing values. Likewise, `prop_complete` returns the proportion of complete values.
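The relationship between the count and the two proportions can be written in a few lines of base R. The `*_sketch` names below are hypothetical, chosen to avoid clashing with naniar's own functions; they illustrate the quantities described above rather than the package's implementation.

```r
# Base-R sketches of the quantities described above.
n_miss_sketch <- function(x) sum(is.na(x))
prop_miss_sketch <- function(x) mean(is.na(x))
prop_complete_sketch <- function(x) 1 - prop_miss_sketch(x)

x <- c(1, NA, 3, NA)
n_miss_sketch(x)         # 2
prop_miss_sketch(x)      # 0.5
prop_complete_sketch(x)  # 0.5
```

Since `is.na()` also accepts matrices and data frames, the same three sketches work on those inputs and summarise over every cell.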
Defunct functions
- As stated in 0.0.5.9000, to address Issue #38, I am moving towards the format miss_type_value/fun, because it makes more sense to me when tabbing through functions.
The left hand side functions have been made defunct in favour of the right hand side.
- percent_missing_case() --> miss_case_pct()
- percent_missing_var() --> miss_var_pct()
- percent_missing_df() --> miss_df_pct()
- summary_missing_case() --> miss_case_summary()
- summary_missing_var() --> miss_var_summary()
- table_missing_case() --> miss_case_table()
- table_missing_var() --> miss_var_table()
naniar 0.0.5.9000 (2017/01/08)
=========================
Deprecated functions
- To address Issue #38, I am moving towards the format miss_type_value/fun, because it makes more sense to me when tabbing through functions.
  - `miss_*` = I want to explore missing values
  - `miss_case_*` = I want to explore missing cases
  - `miss_case_pct` = I want to find the percentage of cases containing a missing value
  - `miss_case_summary` = I want to find the number / percentage of missings in each case
  - `miss_case_table` = I want a tabulation of the number / percentage of cases missing
This is more consistent and easier to reason with.
Thus, I have renamed the following functions:
- percent_missing_case() --> miss_case_pct()
- percent_missing_var() --> miss_var_pct()
- percent_missing_df() --> miss_df_pct()
- summary_missing_case() --> miss_case_summary()
- summary_missing_var() --> miss_var_summary()
- table_missing_case() --> miss_case_table()
- table_missing_var() --> miss_var_table()
These will be made defunct in the next release, 0.0.6.9000 ("The Wood Between Worlds").
naniar 0.0.4.9000 (2016/12/31)
=========================
New features
- `n_complete` is a complement to `n_miss`, and counts the number of complete values in a vector, matrix, or dataframe.
Bug fixes
- `shadow_shift` now handles cases where there is only 1 complete value in a vector.
Other changes
- added much more comprehensive testing with `testthat`.
naniar 0.0.3.9901 (2016/12/18)
=========================
After a burst of effort on this package I have done some refactoring and thought hard about where this package is going to go. This meant that I had to make the decision to rename the package from ggmissing to naniar. The name may strike you as strange but it reflects the fact that there are many changes happening, and that we will be working on creating a nice utopia (like Narnia by CS Lewis) that helps us make it easier to work with missing data
New Features (under development)
- `add_n_miss` and `add_prop_miss` are helpers that add columns to a dataframe containing the number and proportion of missing values. An example has been provided to use decision trees to explore missing data structure as in Tierney et al
- `geom_miss_point()` now supports transparency, thanks to @seasmith (Luke Smith)
- more shadows. These are mainly around `bind_shadow` and `gather_shadow`, which are helper functions to assist with creating
Bug fixes
- `geom_missing_point()` broke after the new release of ggplot2 2.2.0, but this is now fixed by ensuring that it inherits from GeomPoint, rather than just a new Geom. Thanks to Mitchell O'hara-Wild for his help with this.
- missing data summaries `table_missing_var` and `table_missing_case` also now return more sensible numbers and variable names. It is possible these function names will change in the future, as these are kind of verbose.
- semantic versioning was incorrectly entered in the DESCRIPTION file as 0.2.9000, so I changed it to 0.0.2.9000, and then to 0.0.3.9000 now to indicate the new changes, hopefully this won't come back to bite me later. I think I accidentally did this with visdat at some point as well. Live and learn.
Other changes
- gathered related functions into single R files rather than leaving them in
  their own.
- correctly imported the `%>%` operator from magrittr, and removed a lot of chaff around `@importFrom` - really don't need to use `@importFrom` that often.
ggmissing 0.0.2.9000 (2016/07/29)
=========================
New Feature (und...
The Wrong Door
naniar 0.0.4.9000 (2016/12/31)
New features
- `n_complete` is a complement to `n_miss`, and counts the number of complete values in a vector, matrix, or dataframe.
Bug fixes
- `shadow_shift` now handles cases where there is only 1 complete value in a vector.
Other changes
- added much more comprehensive testing with `testthat`.
naniar 0.0.3.9901 (2016/12/18)
New features
- `add_n_miss` and `add_prop_miss` are helpers that add columns to a dataframe containing the number and proportion of missing values. An example has been provided to use decision trees to explore missing data structure as in Tierney et al
- `geom_miss_point()` now supports transparency, thanks to @seasmith (Luke Smith)
naniar 0.0.3.9000 (2016/12/18)
After a burst of effort on this package I have done some refactoring and thought hard about where this package is going to go. This meant that I had to make the decision to rename the package from ggmissing to naniar. The name may strike you as strange but it reflects the fact that there are many changes happening, and that we will be working on creating a nice utopia (like Narnia by CS Lewis) that helps us make it easier to work with missing data
New Features (under development)
- more shadows. These are mainly around `bind_shadow` and `gather_shadow`, which are helper functions to assist with creating
Bug fixes
- `geom_missing_point()` broke after the new release of ggplot2 2.2.0, but this is now fixed by ensuring that it inherits from GeomPoint, rather than just a new Geom. Thanks to Mitchell O'hara-Wild for his help with this.
- missing data summaries `table_missing_var` and `table_missing_case` also now return more sensible numbers and variable names. It is possible these function names will change in the future, as these are kind of verbose.
- semantic versioning was incorrectly entered in the DESCRIPTION file as 0.2.9000, so I changed it to 0.0.2.9000, and then to 0.0.3.9000 now to indicate the new changes, hopefully this won't come back to bite me later. I think I accidentally did this with visdat at some point as well. Live and learn.
Other changes
- gathered related functions into single R files rather than leaving them in
  their own.
- correctly imported the `%>%` operator from magrittr, and removed a lot of chaff around `@importFrom` - really don't need to use `@importFrom` that often.
ggmissing 0.0.2.9000 (2016/07/29)
New Feature (under development)
- `geom_missing_point()` now works in a way that we expect! Thanks to Miles McBain for working out how to get this to work.
ggmissing 0.0.1.9000 (2016/07/29)
New Feature (under development)
- tidy summaries for missing data:
  - `percent_missing_df` returns the percentage of missing data for a data.frame
  - `percent_missing_var` the percentage of variables that contain missing values
  - `percent_missing_case` the percentage of cases that contain missing values.
  - `table_missing_var` table of missing information for variables
  - `table_missing_case` table of missing information for cases
  - `summary_missing_var` summary of missing information for variables (counts, percentages)
  - `summary_missing_case` summary of missing information for cases (counts, percentages)
- `gg_missing_col`: plot the missingness in each variable
- `gg_missing_row`: plot the missingness in each case
- `gg_missing_which`: plot which columns contain missing data.
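The three percentages in the tidy summaries above differ only in what is being counted: cells, variables, or cases. The base-R sketch below makes that distinction concrete on the built-in `airquality` data. The `pct_*` names are hypothetical and the code is an illustration of the descriptions above, not the package's implementation.

```r
# Percentage of all cells that are missing.
pct_missing_df <- function(data) mean(is.na(data)) * 100
# Percentage of variables (columns) that contain at least one missing value.
pct_missing_var <- function(data) mean(colSums(is.na(data)) > 0) * 100
# Percentage of cases (rows) that contain at least one missing value.
pct_missing_case <- function(data) mean(!complete.cases(data)) * 100

pct_missing_df(airquality)
pct_missing_var(airquality)
pct_missing_case(airquality)
```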