Commit 11079f0

committed
release janitor 2.2.0 🎉
1 parent b54f9e5 commit 11079f0

3 files changed

+24
-23
lines changed

DESCRIPTION

Lines changed: 3 additions & 5 deletions
@@ -1,6 +1,6 @@
 Package: janitor
 Title: Simple Tools for Examining and Cleaning Dirty Data
-Version: 2.1.0.9000
+Version: 2.2.0
 Authors@R: c(person("Sam", "Firke", email = "[email protected]", role = c("aut", "cre")),
 person("Bill", "Denney", email = "[email protected]", role = "ctb"),
 person("Chris", "Haid", email = "[email protected]", role = "ctb"),
@@ -9,14 +9,12 @@ Authors@R: c(person("Sam", "Firke", email = "[email protected]", role = c("
 person("Jonathan", "Zadra", email = "[email protected]", role = "ctb"))
 Description: The main janitor functions can: perfectly format data.frame column
 names; provide quick counts of variable combinations (i.e., frequency
-tables and crosstabs); and isolate duplicate records. Other janitor functions
+tables and crosstabs); and explore duplicate records. Other janitor functions
 nicely format the tabulation results. These tabulate-and-report functions
 approximate popular features of SPSS and Microsoft Excel. This package
 follows the principles of the "tidyverse" and works well with the pipe function
 %>%. janitor was built with beginning-to-intermediate R users in mind and is
-optimized for user-friendliness. Advanced R users can already do everything
-covered here, but with janitor they can do it faster and save their thinking for
-the fun stuff.
+optimized for user-friendliness.
 URL: https://github.com/sfirke/janitor,
 https://sfirke.github.io/janitor/
 BugReports: https://github.com/sfirke/janitor/issues

NEWS.md

Lines changed: 9 additions & 10 deletions
@@ -1,27 +1,27 @@
-# janitor 2.1.0.9000 (unreleased, under development)
+# janitor 2.2.0 (2023-02-01)

 ## Breaking changes

+These are all minor breaking changes resulting from enhancements and are not expected to affect the vast majority of users.
+
 * A new `...` argument was added to `row_to_names()`, preceding the `remove_row` argument, as part of the new `find_header()` functionality. If code previously used `remove_row` as an unnamed argument, it will now error. If code previously used the unsupported behavior of passing anything other than `TRUE` or `FALSE` to `remove_row`, unexpected results may occur.

 * Microsoft Excel incorrectly has a leap day on 29 February 1900 (see https://docs.microsoft.com/en-us/office/troubleshoot/excel/wrongly-assumes-1900-is-leap-year). `excel_numeric_to_date()` did not account for this error, and now it does. Dates returned from `excel_numeric_to_date()` that precede 1 March 1900 will now be one day later compared to previous versions (i.e. what was 1 Feb 1900 is now 2 Feb 1900), and dates that Excel presents as 29 Feb 1900 will become `as.POSIXct(NA)`. (#423, thanks **@billdenney** for fixing)

 * A minor breaking change is that the time zone is now always set for `excel_numeric_to_date()` and `convert_date()`. The default timezone is `Sys.timezone()`, previously it was an empty string (`""`). (#422, thanks **@billdenney** for fixing)

-* A minor breaking change affects `make_clean_names()` (and therefore `clean_names()`). `make_clean_names()` now uses the `unique_sep` argument from `snakecase::to_any_case()` to handle de-duplication of names. The incremental suffix counter is now one less than in the past (i.e., a de-duplicated variable with a suffix of `_2` becomes `_1`, `_3` becomes `_2`, etc.). This change also results in a new feature and the ability to allow for duplicate names by setting `unique_sep = NULL`. (#495, thanks **@JasonAizkalns** for fixing and **@billdenney** and **@sfirke** for the guidance)
-
 * `get_dupes()` results are now sorted first by descending order of `dupe_count`, then alphabetically by sorting variables. (#493)

 * There are several minor breaking changes resulting from enhancements to `adorn_ns()`:
-* The addition of the new argument `format_func` means that previous calls relying on `,,,` as shorthand to get to the `...` column selection argument may now require an extra comma
+* The addition of the new argument `format_func` means that previous calls relying on `,,,` as shorthand to get to the `...` column selection argument may now require an extra comma.
 * `adorn_ns()` now defaults to displaying numbers of >3 digits with `big.mark = ","`, as part of the default value of the new `format_func` argument. E.g., `1234` is now `1,234`.
 * `adorn_ns()` no longer prints leading whitespace when `position = "front"` - this is not a visible change in the printed result and it would be rare that this affects any code.

 * When the first column of the data.frame input to `adorn_totals()` is a factor and a totals row is added to the bottom, that column now remains a factor, with "Total" or other user-specified totals name added to its factor levels (#494).

 ## New features

-* `row_to_names()` now has a new helper function, `find_header()` to help find the row that contains the names. It can be used by passing `row_number="find_header"`, and see the documentation of `row_to_names()` and `find_header()` for more examples. (fix #429)
+* `row_to_names()` now has a new helper function, `find_header()` to help find the row that contains the names. It can be used by passing `row_number="find_header"`. See the documentation of `row_to_names()` and `find_header()` for more examples. (fix #429)

 * `remove_empty()` has a new argument, `cutoff` which allows rows or columns to be removed if at least the `cutoff` fraction of the data are missing. (fix #446, thanks to **@jzadra** for suggesting the feature and **@billdenney** for fixing)

@@ -37,13 +37,10 @@

 ## Minor features

-* Some warning messages now have classes so that they can be specifically suppressed with suppressWarnings(..., class="the_class_to_suppress"). To find the class of a warning you typically must look at the code where the error is occurring. (#452, thanks to **@mgacc0** for suggesting and **@billdenney** for fixing)
-
-* `make_clean_names()` (and therefore `clean_names()`) issues a warning if the mu or micro symbol is in the names and it is not or may not be handled by a `replace` argument value. (#448, thanks **@IndrajeetPatil** for reporting and **@billdenney** for fixing) The rationale is that standard transliteration would convert "[mu]g" to "mg" when it would be more typically be converted to "ug" for use as a unit. A new, unexported constant (janitor:::mu_to_u) was added to help with mu to "u" replacements.
+* `make_clean_names()` (and therefore `clean_names()`) issues a warning if the mu or micro symbol is in the names and it is not or may not be handled by a `replace` argument value. (#448, thanks **@IndrajeetPatil** for reporting and **@billdenney** for fixing) The rationale is that standard transliteration would convert `"[mu]g"` to `"mg"` when it would be more typically be converted to `"ug"` for use as a unit. A new, unexported constant (janitor:::mu_to_u) was added to help with mu to "u" replacements.

 * `excel_numeric_to_date()` now warns when times are converted to `NA` due to hours that do not exist because of daylight savings time (fix #420, thanks **@Geomorph2** for reporting and **@billdenney** for fixing). It also warns when inputs are not positive, since Excel only supports values down to 1 (#423).

-
 * If a `tabyl()` or similar data.frame is sorted (e.g., with `dplyr::arrange()`), then has `adorn_totals()` and/or `adorn_percentages()` called on it, followed by `adorn_ns()`, the Ns will be sorted correctly to match the tabyl they're being adorned on. (fix #407)

 * `clean_names()` now supports all object types that have either names or dimnames (#481, @DanChaltiel).
@@ -54,9 +51,11 @@

 * `make_clean_names()` now allows for duplicate names to be returned by specifying `TRUE` to the new `allow_dupes` argument (#495, @JasonAizkalns).

+* Some warning messages now have classes so that they can be specifically suppressed with `suppressWarnings(..., class="the_class_to_suppress")`. To find the class of a warning you typically must look at the code where the error is occurring. (#452, thanks to **@mgacc0** for suggesting and **@billdenney** for fixing)
+
 ## Bug fixes

-* `adorn_percentages()` was refactored for compatibility with `dplyr` package versions > 1.0.99 (#490)
+* `adorn_percentages()` was refactored for compatibility with `dplyr` package versions >= 1.1.0 (#490)

 * When a numeric variable is supplied as the 2nd variable (column) or 3rd variable (list) of a `tabyl`, the resulting columns or list are now sorted in numeric order, not alphabetic. (#438, thanks **@daaronr** for reporting and **@mattroumaya** for fixing)
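
The `excel_numeric_to_date()` entry in the NEWS.md diff above describes a correction for Excel's nonexistent 29 February 1900. The rule can be sketched in base R roughly as follows. This is not janitor's implementation: `excel_serial_to_date` is a hypothetical helper, and the sketch assumes Excel's 1900 date system and ignores times and time zones.

```r
# Hypothetical helper for illustration only; not part of janitor.
# Assumes the Excel 1900 date system, where serial 1 is 1 January 1900.
excel_serial_to_date <- function(serial) {
  day <- floor(as.numeric(serial))
  # Excel counts a nonexistent 29 February 1900 as serial 60, so every
  # serial from 61 onward is one higher than the true day count.
  offset <- ifelse(day >= 61, 1, 0)
  out <- as.Date("1899-12-31") + (day - offset)
  # Serial 60 has no real calendar date, so it becomes NA.
  out[which(day == 60)] <- NA
  out
}

# Serial 33 maps to 2 February 1900, serial 60 to NA,
# and serial 61 to 1 March 1900.
excel_serial_to_date(c(33, 60, 61))
```

Under the older behaviour described in the entry, applying a single origin of 1899-12-30 to every serial would give 1 February 1900 for serial 33, which is the one-day shift the breaking-change note refers to.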

cran-comments.md

Lines changed: 12 additions & 8 deletions
Original file line numberDiff line numberDiff line change
@@ -1,26 +1,30 @@
11
# Submission
2-
2020-12-28
2+
2023-02-01
33

44
## Submission summary
5-
An accumulation of small enhancements and bug fixes. No breaking changes.
5+
An accumulation of enhancements and bug fixes. Breaking changes only for edge cases.
6+
7+
Notably, this fixes the current test failures on CRAN for this package, resulting from
8+
changes introduced in the latest version of the dplyr package.
69

710
### Test environments
811

912
#### Windows
10-
* Windows 10 with R-release 4.0.3 (local)
11-
* Windows 10 with R Under development (unstable) (2020-12-19 r79650) via win-builder, checked 2020-12-28
13+
* Windows 10 with R-release 4.2.2 (local)
14+
* Windows Server 2022 x64 (build 20348) with R Under development (unstable) (2023-01-31 r83741 ucrt) via win-builder, checked 2023-02-01
1215

1316
#### Linux
14-
* ubuntu 20.04 R-release 4.0.3 (Github CI)
15-
* ubuntu 20.04 R-devel R Under development (unstable) (2020-12-28) (Github CI)
17+
* ubuntu 22.04 R-release 4.2.2 (Github CI)
18+
* ubuntu 22.04 R-devel R Under development (unstable) (2023-02-01) (Github CI)
19+
* ubuntu 22.04 R-oldrel 4.1.3 (Github CI)
1620

1721
#### Mac
18-
* Mac OS with R-release (Github CI)
22+
* Mac OS 12.6.2 with R-release 4.2.2 (Github CI)
1923

2024
### R CMD check results
2125
0 errors | 0 warnings | 0 notes
2226

2327
### Downstream dependencies
2428
This does not negatively affect downstream dependencies.
2529

26-
I ran a revdepcheck, it succeeded for 30 packages and I manually investigated the others to verify that errors were the result of time-outs and that the janitor changes do not affect those packages.
30+
revdepcheck passed for 101 of 101 packages (98 from CRAN, 3 from bioconductor).
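
The NEWS.md entry on classed warnings in this commit can be illustrated with a small base-R sketch. The function `warn_if_negative` and the class name `demo_negative_values` are hypothetical and are not taken from janitor. Base R's `suppressWarnings()` takes a `classes` argument, which the sketch uses.

```r
# Hypothetical function and warning class, for illustration only.
warn_if_negative <- function(x) {
  if (any(x < 0, na.rm = TRUE)) {
    # Attach a custom class so callers can target this warning specifically.
    warning(warningCondition(
      "negative values found",
      class = "demo_negative_values"
    ))
  }
  x
}

# Muffle only warnings carrying the custom class; any other warning
# raised inside the call would still surface.
result <- suppressWarnings(
  warn_if_negative(c(-1, 2)),
  classes = "demo_negative_values"
)

# One way to inspect a warning's classes without reading the source:
# catch it and return its class vector.
caught <- tryCatch(
  warn_if_negative(-1),
  warning = function(w) class(w)
)
```

Catching the condition with `tryCatch()` is offered here as a possible shortcut for finding the class name. The NEWS entry itself only mentions looking at the code that raises the warning.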
