11 Jul 12:30

etiennebacher

v0.12.0

8266d3f

datawizard 0.12.0

BREAKING CHANGES

The argument include_na in data_tabulate() and data_summary() has been
renamed to remove_na. Consequently, to reproduce the former behaviour, FALSE
and TRUE need to be switched (i.e. remove_na = TRUE is equivalent to the former
include_na = FALSE).
Class names for objects returned by data_tabulate() have been changed to
datawizard_table and datawizard_crosstable (and their plural forms,
*_tables), to provide a clearer and more consistent naming scheme.
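
As a rough illustration of the rename, the sketch below assumes datawizard 0.12.0 or later is installed. The vector x is made up for the example.

```r
library(datawizard)

# Hypothetical input with one missing value
x <- c(1, 2, 2, NA)

# Before 0.12.0 this would have been: data_tabulate(x, include_na = FALSE)
tab <- data_tabulate(x, remove_na = TRUE)
tab

# Class name introduced in 0.12.0, as described in the note above
inherits(tab, "datawizard_table")
```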

CHANGES

data_select() can directly rename selected variables when a named vector
is provided in select, e.g. data_select(mtcars, c(new1 = "mpg", new2 = "cyl")).
data_tabulate() gains an as.data.frame() method, to return the frequency
table as a data frame. The structure of the returned object is a nested data
frame, where the first column contains the name of the variable for which
frequencies were calculated, and the second column contains the frequency table.
demean() (and degroup()) now also work for cross-classified designs, or
more generally, for data with multiple grouping or cluster variables (i.e.
by can now specify more than one variable).
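
The renaming behaviour of data_select() can be tried with the call quoted in the note above; this sketch assumes datawizard 0.12.0 or later.

```r
library(datawizard)

# Select mpg and cyl and rename them in one step via a named vector
out <- data_select(mtcars, c(new1 = "mpg", new2 = "cyl"))
colnames(out)
```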

05 Jun 19:41

etiennebacher

v0.11.0

416fd42

datawizard 0.11.0

BREAKING CHANGES

Arguments named group or group_by are deprecated and will be removed
in a future release. Please use by instead. This affects the following
functions in datawizard (#502):
- data_partition()
- demean() and degroup()
- means_by_group()
- rescale_weights()
The following aliases are deprecated and will be removed in a future release (#504):
- get_columns(), use data_select() instead.
- data_find() and find_columns(), use extract_column_names() instead.
- format_text(), use text_format() instead.
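
A minimal migration sketch for the deprecations listed above, assuming datawizard 0.11.0 or later. The choice of the iris data and of the selected columns is only for illustration.

```r
library(datawizard)

# `by` replaces the deprecated `group` / `group_by` arguments
dm <- demean(iris, select = "Sepal.Length", by = "Species")
colnames(dm)

# data_select() replaces get_columns()
sel <- data_select(iris, c("Sepal.Length", "Species"))

# extract_column_names() replaces data_find() and find_columns()
nms <- extract_column_names(iris, starts_with("Sepal"))
nms
```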

CHANGES

recode_into() is more relaxed regarding checking the type of NA values.
If you recode into a numeric variable, and one of the recode values is NA,
you no longer need to use NA_real_ for numeric NA values.
Improved documentation for some functions.
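
The relaxed NA check in recode_into() might look like this in practice. This is a sketch assuming datawizard 0.11.0 or later; the vector and the recode rules are invented for the example.

```r
library(datawizard)

x <- c(1, 2, 3, 4)

# Plain NA is accepted as a recode value for a numeric result;
# NA_real_ is no longer required.
out <- recode_into(
  x > 2 ~ 10,
  x <= 2 ~ NA,
  default = 0
)
out
```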

BUG FIXES

data_to_long() did not work for data frames where columns had attributes
(like labelled data).

26 Mar 14:28

etiennebacher

v0.10.0

bf51817

datawizard 0.10.0

BREAKING CHANGES

The following arguments were deprecated in 0.5.0 and are now removed:
- in data_to_wide(): colnames_from, rows_from, sep
- in data_to_long(): colnames_to
- in data_partition(): training_proportion

NEW FUNCTIONS

data_summary(), to compute summary statistics of (grouped) data frames.
data_replicate(), to expand a data frame by replicating rows based on another
variable that contains the counts of replications per row.
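
A short sketch of the two new functions, assuming datawizard 0.10.0 or later. The small data frame d and its column names are hypothetical.

```r
library(datawizard)

# Summary statistics of mpg, grouped by number of cylinders
s <- data_summary(mtcars, MW = mean(mpg), SD = sd(mpg), by = "cyl")
s

# Replicate each row as often as given in column n
d <- data.frame(id = c("a", "b"), n = c(2, 3))
r <- data_replicate(d, expand = "n")
r
```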

CHANGES

data_modify() gets three new arguments, .at, .if and .modify, to modify
variables at specific positions or based on logical conditions.
data_tabulate() was revised and gets several new arguments: a weights
argument, to compute weighted frequency tables; an include_na argument, to
include or omit missing values from the table; and a by argument, to compute
crosstables (#479, #481).
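
The new arguments could be used roughly as follows, assuming datawizard 0.10.0 or later. The datasets are chosen only for illustration.

```r
library(datawizard)

# Apply round() to every numeric column; the factor column is left untouched
rounded <- data_modify(iris, .if = is.numeric, .modify = round)
head(rounded)

# Crosstable of cylinders by transmission type
ct <- data_tabulate(mtcars$cyl, by = mtcars$am)
ct
```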

21 Dec 17:19

etiennebacher

v0.9.1

bae0ae6

datawizard 0.9.1

CHANGES

rescale() gains multiply and add arguments, to expand ranges by a given
factor or value.
to_factor() and to_numeric() now support class haven_labelled.
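
To illustrate the multiply and add arguments of rescale(), here is a small sketch assuming datawizard 0.9.1 or later. The input vector is arbitrary.

```r
library(datawizard)

x <- 5:15

# Expand the observed range by a factor
wide_mult <- rescale(x, multiply = 1.1)
range(wide_mult)

# Expand the observed range by a fixed value
wide_add <- rescale(x, add = 0.5)
range(wide_add)
```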

BUG FIXES

to_numeric() now correctly deals with inverted factor levels when
preserve_levels = TRUE.
Fixed an issue where to_numeric() inverted the order of value labels when
dummy_factors = FALSE.
convert_to_na() now preserves attributes for factors when drop_levels = TRUE.

15 Sep 10:46

etiennebacher

v0.9.0

faa2000

datawizard 0.9.0

NEW FUNCTIONS

row_means(), to compute row means, optionally only for the rows with at
least min_valid non-missing values.
contr.deviation() for sum-deviation contrast coding of factors.
means_by_group(), to compute mean values of variables, grouped by levels
of specified factors.
data_seek(), to search for variables in a data frame, based on their
column names, variable labels, value labels or factor levels. Searching for
labels only works for "labelled" data, i.e. when variables have a label or
labels attribute.
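
The effect of min_valid in row_means() can be seen in a small sketch like the one below, assuming datawizard 0.9.0 or later. The data frame dat is made up.

```r
library(datawizard)

dat <- data.frame(
  c1 = c(1, 2, NA, 4),
  c2 = c(NA, 2, NA, 5),
  c3 = c(NA, 4, NA, NA),
  c4 = c(2, 3, 7, 8)
)

# A row mean is only computed for rows with at least 3 non-missing values;
# all other rows yield NA
rm3 <- row_means(dat, min_valid = 3)
rm3
```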

CHANGES

recode_into() gains an overwrite argument to skip overwriting already
recoded cases when multiple recode patterns apply to the same case.
recode_into() gains a preserve_na argument to preserve NA values
when recoding.
data_read() now passes the encoding argument to data.table::fread().
This allows reading files with non-ASCII characters.
datawizard moves from the GPL-3 license to the MIT license.
unnormalize() and unstandardize() now work with grouped data (#415).
unnormalize() now errors instead of emitting a warning if it doesn't have the
necessary info (#415).
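
The overwrite argument of recode_into() matters when several patterns match the same case. A sketch, assuming datawizard 0.9.0 or later, with an arbitrary vector and overlapping patterns:

```r
library(datawizard)

x <- 1:10

# Default: the second pattern overwrites cases already recoded by the first
a <- recode_into(x >= 3 & x <= 7 ~ 1, x > 5 ~ 2, default = 0, verbose = FALSE)
a

# overwrite = FALSE: cases recoded by an earlier pattern are left alone
b <- recode_into(
  x >= 3 & x <= 7 ~ 1,
  x > 5 ~ 2,
  default = 0,
  overwrite = FALSE,
  verbose = FALSE
)
b
```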

BUG FIXES

Fixed issue in labels_to_levels() when values of labels were not in sorted
order and values were not sequentially numbered.
Fixed issues in data_write() when writing labelled data into SPSS format
and vectors were of a different type than their value labels.
Fixed issue in recode_into() where a possibly wrong case number was printed
in the warning when several recode patterns matched one case.
Fixed issue in recode_into() when original data contained NA values and
NA was not included in the recode pattern.
Fixed issue in data_filter() where functions containing a = (e.g. when
naming arguments, like grepl(pattern, x = a)) were mistakenly seen as
faulty syntax.
Fixed issue in empty_columns() for data containing invalid multibyte strings.
For such data frames or files, empty_columns() or data_read() no longer
fails.

18 Jun 14:20

etiennebacher

v0.8.0

4876060

datawizard 0.8.0

BREAKING CHANGES

The following re-exported functions from {insight} have now been removed:
object_has_names(), object_has_rownames(), is_empty_object(),
compact_list(), compact_character().
Argument na.rm was renamed to remove_na throughout {datawizard} functions.
na.rm is kept for backward compatibility, but will be deprecated and later
removed in future updates.
The way expressions are defined in data_filter() was revised. The filter
argument was replaced by ..., so that multiple expressions can be separated
by commas (they are then combined with &). Furthermore, expressions can now
also be defined as strings, or be provided as character vectors, to allow
string-friendly programming.
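
The revised data_filter() interface might be used as in the following sketch, which assumes datawizard 0.8.0 or later. The filter conditions are arbitrary.

```r
library(datawizard)

# Several expressions separated by commas are combined with &
f1 <- data_filter(mtcars, mpg > 25, cyl == 4)

# The same filter, defined as a string
f2 <- data_filter(mtcars, "mpg > 25 & cyl == 4")

nrow(f1)
nrow(f2)
```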

CHANGES

Weighted functions (weighted_sd(), weighted_mean(), ...) gain a remove_na
argument, to remove or keep missing and infinite values. By default,
remove_na = TRUE, i.e. missing and infinite values are removed.
reverse_scale(), normalize() and rescale() gain an append argument
(similar to other data frame methods of transformation functions), to append
recoded variables to the input data frame instead of overwriting existing
variables.
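
A brief sketch of the remove_na and append arguments, assuming datawizard 0.8.0 or later. The values and weights are invented for the example.

```r
library(datawizard)

x <- c(1, 2, NA)
w <- c(1, 1, 1)

# Missing values are dropped by default (remove_na = TRUE)
m <- weighted_mean(x, weights = w)
m

# append = TRUE adds the normalized variable as a new column
# instead of overwriting Sepal.Length
n <- normalize(iris, select = "Sepal.Length", append = TRUE)
colnames(n)
```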

NEW FUNCTIONS

rowid_as_column() to complement rownames_as_column() (and to mimic
tibble::rowid_to_column()). Note that its behavior is different from
tibble::rowid_to_column() for grouped data. See the Details section in the
docs.
data_unite(), to merge values of multiple variables into one new variable.
data_separate(), as counterpart to data_unite(), to separate a single
variable into multiple new variables.
data_modify(), to create new variables, or modify or remove existing
variables in a data frame.
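
The new functions can be combined as in this sketch, assuming datawizard 0.8.0 or later. The data frame d and all column names are hypothetical.

```r
library(datawizard)

d <- data.frame(x = c("a", "b"), y = c("1", "2"))

# Merge x and y into one new variable, using "_" as separator
u <- data_unite(d, new_column = "xy", select = c("x", "y"), separator = "_")
u

# Split the united variable back into two variables
s <- data_separate(u, select = "xy", new_columns = c("x", "y"), separator = "_")
s

# Create a new variable from existing ones
m <- data_modify(iris, ratio = Sepal.Length / Sepal.Width)
head(m$ratio)
```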

MINOR CHANGES

to_numeric() for variables of type Date, POSIXct and POSIXlt now
includes the class name in the warning message.
Added a print() method for center(), standardize(), normalize() and
rescale().

BUG FIXES

standardize_parameters() now works when the package namespace is in the model
formula (#401).
data_merge() no longer yields a warning for tibbles when join = "bind".
center() and standardize() did not work for grouped data frames (of class
grouped_df) when force = TRUE.
The data.frame method of describe_distribution() now returns NULL instead of
raising an error if no valid variables were passed (for example a factor
variable with include_factors = FALSE) (#421).

03 Apr 16:06

etiennebacher

v0.7.1

79d85e4

datawizard 0.7.1

BREAKING CHANGES

add_labs() was renamed to assign_labels(). Since add_labs() existed
only for a few days, there will be no alias for backwards compatibility.

NEW FUNCTIONS

labels_to_levels(), to use value labels of factors as their levels.
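
A sketch of how assign_labels() and labels_to_levels() fit together, assuming datawizard 0.7.1 or later. The factor and its labels are made up.

```r
library(datawizard)

# A factor whose levels are plain codes
x <- factor(c(1, 2, 3, 2))

# Attach a variable label and value labels; the levels are still "1", "2", "3"
x <- assign_labels(
  x,
  variable = "Education",
  values = c(`1` = "low", `2` = "mid", `3` = "high")
)
levels(x)

# Use the value labels as factor levels
y <- labels_to_levels(x)
levels(y)
```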

MINOR CHANGES

data_read() now checks if the imported object actually is a data frame (or
coercible to a data frame), and if not, no longer errors, but gives an
informative warning of the type of object that was imported.

BUG FIXES

Fixed a test for the CRAN check on macOS arm64.

22 Mar 17:04

etiennebacher

v0.7.0

6395a1d

datawizard 0.7.0

BREAKING CHANGES

In selection patterns, expressions like -var1:var3 to exclude all variables
between var1 and var3 are no longer accepted. The correct expression is
-(var1:var3). This is for 2 reasons:
- to be consistent with the behavior for numerics (-1:2 is not accepted but
  -(1:2) is);
- to be consistent with dplyr::select(), which throws a warning and only
  uses the first variable in the first expression.
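
The accepted form of the exclusion pattern looks like this; the sketch assumes datawizard 0.7.0 or later and uses the iris data only as an example.

```r
library(datawizard)

# Exclude all variables from Sepal.Length to Petal.Length.
# The parentheses are required; -Sepal.Length:Petal.Length is not accepted.
out <- data_select(iris, -(Sepal.Length:Petal.Length))
colnames(out)
```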

NEW FUNCTIONS

recode_into(), similar to dplyr::case_when(), to recode values from one
or more variables into a new variable.
mean_sd() and median_mad() for summarizing vectors to their mean (or
median) and a range of one SD (or MAD) above and below.
data_write() as counterpart to data_read(), to write data frames into
CSV, SPSS, SAS, Stata files and many other file types. One advantage over
existing functions to write data in other packages is that labelled (numeric)
data can be converted into factors (with value labels used as factor levels)
even for text formats like CSV and similar. This allows exporting "labelled"
data into those file formats, too.
add_labs(), to manually add value and variable labels as attributes to
variables. These attributes are stored as "label" and "labels" attributes,
similar to the labelled class from the haven package.
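
As a quick illustration of mean_sd() and median_mad(), assuming datawizard 0.7.0 or later:

```r
library(datawizard)

# Mean of mpg together with one SD below and above it
ms <- mean_sd(mtcars$mpg)
ms

# Median of mpg together with one MAD below and above it
mm <- median_mad(mtcars$mpg)
mm
```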

MINOR CHANGES

data_rename() gets a verbose argument.
winsorize() now errors if the threshold is incorrect (previously, it gave
a warning and returned the unchanged data). The argument verbose no longer
has any effect but is kept for backward compatibility. The documentation now
contains details about the valid values for threshold (#357).
In all functions that have arguments select and/or exclude, there is now
one warning per misspelled variable. The previous behavior was to have only one
warning.
Fixed inconsistent behaviour in standardize() when only one of the arguments
center or scale was provided (#365).
unstandardize() and replace_nan_inf() now work with select helpers (#376).
Added informative warning and error messages to reverse(). Furthermore, the
docs now describe the range argument more clearly (#380).
unnormalize() errors with unexpected inputs (#383).

BUG FIXES

empty_columns() (and therefore remove_empty_columns()) now correctly detects
columns containing only NA_character_ (#349).
Select helpers now work in custom functions when argument is called select
(#356).
Fix unexpected warning in convert_na_to() when select is a list (#352).
Fixed issue with correct labelling of numeric variables with more than nine
unique values and associated value labels.

14 Dec 17:48

etiennebacher

v0.6.5

33e96b8

datawizard 0.6.5

MAJOR CHANGES

Etienne Bacher is the new maintainer.

MINOR CHANGES

standardize(), center(), normalize() and rescale() can be used in
model formulas, similar to base::scale().
data_codebook() now includes the proportion for each category/value, in
addition to the counts. Furthermore, if data contains tagged NA values,
these are included in the frequency table.
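
Using a transformation inside a model formula, as mentioned for standardize() above, could look like this sketch. It assumes datawizard 0.6.5 or later; the model itself is arbitrary.

```r
library(datawizard)

# standardize() used directly in the formula
m1 <- lm(mpg ~ standardize(hp), data = mtcars)

# The base R equivalent, for comparison
m2 <- lm(mpg ~ scale(hp), data = mtcars)

coef(m1)
coef(m2)
```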

BUG FIXES

center(x) now works correctly when x is a single value and either
reference or center is specified (#324).
Fixed issue in data_codebook(), which failed for labelled vectors when
values of labels were not in sorted order.

20 Nov 07:47

IndrajeetPatil

0.6.4

fb2e94b

datawizard 0.6.4

NEW FUNCTIONS

data_codebook(): to generate codebooks of data frames.
New functions to deal with duplicates: data_duplicated() (keep all duplicates,
including the first occurrence) and data_unique() (returns the data, excluding
all duplicates except one instance of each, based on the selected method).
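
The two duplicate helpers can be compared on a small, made-up data frame. The sketch assumes datawizard 0.6.4 or later; the column names are hypothetical.

```r
library(datawizard)

df1 <- data.frame(
  id = c(1, 2, 3, 1, 3),
  item1 = c(NA, 1, 1, 2, 3),
  item2 = c(NA, 1, 1, 2, 3)
)

# All rows whose id occurs more than once, including the first occurrence
dup <- data_duplicated(df1, select = "id")
dup

# One row per id; which duplicate is kept depends on the selected method
uni <- data_unique(df1, select = "id")
uni
```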

MINOR CHANGES

.data.frame methods should now preserve custom attributes.
The include_bounds argument in normalize() can now also be a numeric
value, defining the limit to the upper and lower bound (i.e. the distance
to 1 and 0).
data_filter() now works with grouped data.
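
A numeric include_bounds in normalize() can be sketched as follows, assuming datawizard 0.6.4 or later. The value 0.05 is an arbitrary choice.

```r
library(datawizard)

# Normalize mpg, keeping a distance of 0.05 to the bounds 0 and 1
nb <- normalize(mtcars$mpg, include_bounds = 0.05)
range(nb)
```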

BUG FIXES

data_read() no longer prints message for empty columns when the data
actually had no empty columns.
data_to_wide() now drops columns that are not in id_cols (if specified),
names_from, or values_from. This is the behaviour observed in tidyr::pivot_wider().

Releases: easystats/datawizard
