Releases: rstudio/pointblank
v0.12.3
- We now support validation checks of Oracle tables via ODBC (#462). (#644, @pachadotdev)
- Redshift table support has now been fixed (#538). (#623, #643, @pachadotdev and @hfrick)
- Added the `na_pass` argument to `col_vals_expr()` for finer control of `NA` values. Previously, `NA`s were ignored, but now they are caught as failures with the default `na_pass = FALSE`. As a safeguard, if an expression generates `NA` values while `na_pass` is not explicitly supplied, a warning is thrown (#616). (#617)
- Added an infix resolution for `log4r_step()`'s `message` argument (#654). (#656, @alexpaynter)
- Fixed the issue where an agent would auto-generate a table label that was too long (truncation occurs now) (#613). (#614)
- Fixed problem where agents would not search the formula environment when materializing `~ tbl` (#598). (#599)
- `info_columns()` now warns more informatively when no columns are selected (#344). (#589)
- `write_yaml()` errors more informatively now when a `tbl` value is incompatible for YAML-writing (#596). (#597)
- The `yaml_agent_string()` function now returns the YAML string (#609). (#610)
- The `col_vals_regex()` function now supports Perl regexes (#606). (#608)
- We now use a safeguard to fall back to the original string as-is if glue interpolation fails (useful in cases where a user-provided regex like `{1,2}` is to be preserved in the autobrief but triggers an error in {glue}) (#600). (#601)
- Data extracts for `rows_distinct()`/`rows_complete()` now preserve all columns, not just the ones tested (#475). (#588, #591)
- The `brief` argument of validation functions now also supports `{glue}` syntax. (#587)
- Validation step `brief`s correctly recycle to match expanded steps. (#564)
- Performed extensive refactoring of internal dplyr and ggplot2 code. (#579, #582, #583, @olivroy)
- Rebuilt translation table and added support for ordered factors in `scan_data()`. (#580, @olivroy)
- Modernized test infrastructure and removed deprecated testthat features. (#577, @olivroy)
- The README.md has undergone some refinements. (#618)
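As an illustration of the `na_pass` change in `col_vals_expr()`, here is a minimal sketch. The three-row table `tbl` and the agent names are made up for this example; it assumes pointblank 0.12.3 or later and R 4.1 or later (for the native pipe).

```r
library(pointblank)

# Hypothetical table with one missing value in column `a`
tbl <- data.frame(a = c(1, 2, NA))

# na_pass = FALSE (the default): the NA row counts as a failing test unit
agent_strict <- create_agent(tbl = tbl) |>
  col_vals_expr(expr = ~ a > 0, na_pass = FALSE) |>
  interrogate()

# na_pass = TRUE: the NA row counts as a passing test unit
agent_lenient <- create_agent(tbl = tbl) |>
  col_vals_expr(expr = ~ a > 0, na_pass = TRUE) |>
  interrogate()

all_passed(agent_strict)
all_passed(agent_lenient)
```

Supplying `na_pass` explicitly, as done in both calls, also avoids the warning described in the release note.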
v0.12.2
This release provides a few minor improvements along with many bug fixes.
- New argument `extract_tbl_checked` added to `interrogate()`. When `FALSE`, the `$tbl_checked` column from the validation set will be dropped before returning the agent. This may be helpful in reducing object size for large agents (#542). (#554)
- The new argument `na_rm` in `snip_list()` suppresses any `NA` values so that they won't be included in the snippet's list of items (#547). (#556)
- Improved readability of error messages rendered as tooltips in the agent report. (#543)
- `col_vals_expr()` shows used columns in the agent report when interrogated. (#570)
- Improved the matching of rows between `agent$validation_step` and the rows of the agent report (#563). (#565)
- Functions accepting `...` now use `rlang::list2()`, enabling dynamic dots. For example, a multiagent can now be constructed from a `list()` of agents using `create_multiagent(!!!list_of_agents)` (#552). (#553)
- Fixed bug with non-standard column names in some validation functions (#545, #546). (#555)
- Fixed a regression in `col_vals_*()` functions, where `vars("col")` was evaluating to the string `"col"`. Behavior of `vars("col")` is now aligned back with `vars(col)`; both evaluate to the column name `col`. (#535)
- Problems arising from comparing `columns` to a `value` of different class (for example, comparing a datetime column to a date value `Sys.Date()` instead of another datetime value `Sys.time()`) are now signalled appropriately at `interrogate()` (#536, #537). (#539)
- Fixed bug in `has_columns()` failing to detect non-existing columns when supplied as a character vector. (#540)
- Replace uses of `crayon::make_style()` with `cli::make_ansi_style()`, removing the `crayon` dependency. (#559, thanks @olivroy!)
- Use `rlang::check_installed()` to perform checks of optional package installs. (#559, @olivroy)
- Modernized CI workflows with dedicated linting action. (#560, @olivroy)
- Avoid unwanted equation formatting in agent report arising from arbitrary `"$"` characters (#561). (#562)
v0.12.1
- Ensured that the column string is a symbol before constructing the expression for the `col_vals_*()` functions.
- No longer resolve columns with tidyselect when the target table cannot be materialized.
- Relaxed tests on tidyselect error messages.
v0.12.0
New features
- Complete `{tidyselect}` support for the `columns` argument of all validation functions, as well as in `has_columns()` and `info_columns()`. The `columns` argument can now take familiar column-selection expressions as one would use inside `dplyr::select()`. This also begins a process of deprecation:
  - `columns = vars(...)` will continue to work, but `c()` now supersedes `vars()`.
  - If passing an external vector of column names, it should be wrapped in `all_of()`.
- The `label` argument of validation functions now exposes the following string variables via `{glue}` syntax:
  - `"{.step}"`: The validation step name
  - `"{.col}"`: The current column name
  - `"{.seg_col}"`: The current segment's column name
  - `"{.seg_val}"`: The current segment's value/group
  These dynamic values may be useful for validations that get expanded into multiple steps.
- `interrogate()` gains two new options for printing progress in the console output:
  - `progress`: Whether interrogation progress should be printed to the console (`TRUE` for interactive sessions, same as before)
  - `show_step_label`: Whether each validation step's label value should be printed alongside the progress.
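The three features above can be combined in one short pipeline. The following is a sketch under a few assumptions: the vector `cols_to_check` and the label text are invented for illustration, `tidyselect::all_of()` is called with its namespace to avoid relying on what pointblank re-exports, and pointblank 0.12.0 or later is installed.

```r
library(pointblank)

# External character vector of column names (hypothetical)
cols_to_check <- c("a", "b")

agent <- create_agent(tbl = small_table) |>
  col_vals_not_null(
    # Wrap external vectors in all_of(), as the release note recommends
    columns = tidyselect::all_of(cols_to_check),
    # `{.step}` and `{.col}` are filled in for each expanded step
    label = "Step '{.step}' on column '{.col}'"
  ) |>
  # Print progress and each step's label while interrogating
  interrogate(progress = TRUE, show_step_label = TRUE)
```

Because two columns are selected, the single `col_vals_not_null()` call expands into two validation steps, each with its own label.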
Minor improvements and bug fixes
- Fixes issue with rendering reports in Quarto HTML documents.
- When no columns are returned from a `{tidyselect}` expression in `columns`, the agent's report now displays the originally supplied expression instead of simply blank (e.g., in `create_agent(small_table) |> col_vals_null(matches("z"))`).
- Fixes issue with the hashing implementation to improve performance and alignment of validation steps in the multiagent.
v0.11.4
- Fixes issue with gt `0.9.0` compatibility.
v0.11.3
- Fixes issue with tables not rendering due to interaction with the gt package.
v0.11.2
- Internal changes were made to ensure compatibility with an in-development version of R.
v0.11.1
- Updated all help files to pass HTML validation.
v0.11.0
New features
- The `row_count_match()` function can now match the count of rows in the target table to a literal value (in addition to comparing row counts to a secondary table).
- The analogous `col_count_match()` function was added to compare column counts in the target table to a secondary table, or to match on a literal value.
- Substitution syntax has been added to the `tbl_store()` function with `{{ <name> }}`. This is a great way to make table-prep more concise, readable, and less prone to errors.
- The `get_informant_report()` function has been enhanced with more `width` options. Aside from the `"standard"` and `"small"` sizes we can now supply any pixel- or percent-based width to precisely size the reporting.
- Added support for validating data in BigQuery tables.
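Matching row and column counts against literal values might look like the sketch below. It assumes the bundled `small_table` dataset and derives the expected counts with `nrow()` and `ncol()` purely for illustration; in practice these would be known constants.

```r
library(pointblank)

# Expected dimensions, computed here only to keep the example self-contained
expected_rows <- nrow(small_table)
expected_cols <- ncol(small_table)

agent <- create_agent(tbl = small_table) |>
  # `count` accepts a literal number as well as a comparison table
  row_count_match(count = expected_rows) |>
  col_count_match(count = expected_cols) |>
  interrogate()

all_passed(agent)
```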
Documentation
- All functions in the package now have better usage examples.
v0.10.0
New features
- The new function `row_count_match()` (plus `expect_row_count_match()` and `test_row_count_match()`) checks for exact matching of rows across two tables (the target table and a comparison table of your choosing). Works equally well for local tables and for database and Spark tables.
- The new `tbl_match()` function (along with `expect_tbl_match()` and `test_tbl_match()`) checks for an exact matching of the target table with a comparison table. It will check for a strict match on table schemas, on equivalent row counts, and then exact matches on cell values across the two tables.
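A small sketch of both functions, comparing the bundled `small_table` dataset against itself so that the checks trivially pass. The comparison table is an arbitrary choice for illustration; any table of your choosing could be supplied instead.

```r
library(pointblank)

# Agent-based workflow: compare the target table with a comparison table
agent <- create_agent(tbl = small_table) |>
  row_count_match(count = small_table) |>
  tbl_match(tbl_compare = small_table) |>
  interrogate()

all_passed(agent)

# The test_*() variant returns a single logical value
tables_identical <- test_tbl_match(small_table, tbl_compare = small_table)
```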
Minor improvements and bug fixes
- The `set_tbl()` function was given the `tbl_name` and `label` arguments to provide an opportunity to set metadata on the new target table.
- Support for `mssql` tables has been restored and works exceedingly well for the majority of validation functions (the few that are incompatible provide messaging about not being supported).
Documentation
- All functions in the package now have usage examples.
- An RStudio Cloud project has been prepared with .Rmd files that contain explainers and runnable examples for each function in the package. Look at the project README for a link to the project.
Breaking changes
- The `read_fn` argument in `create_agent()` and `create_informant()` has been deprecated in favor of an enhanced `tbl` argument. Now, we can supply a variety of inputs to `tbl` for associating a target table to an agent or an informant. With `tbl`, it's now possible to provide a table (e.g., `data.frame`, `tbl_df`, `tbl_dbi`, `tbl_spark`, etc.), an expression (a table-prep formula or a function) to read in the table only at interrogation time, or a table source expression to get table preparations from a table store (as an in-memory object or as defined in a YAML file).
- The `set_read_fn()`, `remove_read_fn()`, and `remove_tbl()` functions were removed since the `read_fn` argument has been deprecated (and there's virtually no need to remove a table from an object with `remove_tbl()` now).
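For code that previously used `read_fn`, a table-prep formula passed to `tbl` is one possible replacement. The sketch below uses the bundled `small_table` dataset as a stand-in for a table that would normally be read from a file or database; the `tbl_name` and `label` values are illustrative.

```r
library(pointblank)

# The right-hand side of the formula is only evaluated at interrogation time
agent <- create_agent(
  tbl = ~ pointblank::small_table,
  tbl_name = "small_table",
  label = "Table supplied as a table-prep formula"
) |>
  col_vals_gt(columns = a, value = 0) |>
  interrogate()
```

Deferring the read this way keeps the agent definition lightweight, which is useful when the agent is written to YAML or defined before the data source is available.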