Releases: rstudio/pointblank

v0.12.3

28 Nov 19:33
3690ac2

  • We now support validation checks of Oracle tables via ODBC (#462). (#644, @pachadotdev)

  • Redshift table support has now been fixed (#538). (#623, #643, @pachadotdev and @hfrick)

  • Added the na_pass argument to col_vals_expr() for finer control of NA values. Previously, NAs were ignored, but now they are caught as failures with the default na_pass = FALSE. As a safeguard, if an expression generates NA values while na_pass is not explicitly supplied, a warning is thrown (#616). (#617)

  • Added an infix resolution for log4r_step()'s message argument (#654). (#656, @alexpaynter)

  • Fixed the issue where an agent would auto-generate a table label that was too long (truncation occurs now) (#613). (#614)

  • Fixed problem where agents would not search the formula environment when materializing ~ tbl (#598). (#599)

  • info_columns() now warns more informatively when no columns are selected (#344). (#589)

  • write_yaml() errors more informatively now when a tbl value is incompatible for YAML-writing (#596). (#597)

  • The yaml_agent_string() function now returns the yaml string (#609). (#610)

  • The col_vals_regex() function now supports Perl regexes (#606). (#608)

  • We now fall back to the original string as-is if glue interpolation fails. This is useful when a user-provided regex like {1,2} should be preserved in the autobrief but triggers an error in {glue} (#600). (#601)

  • Data extracts for rows_distinct()/rows_complete() now preserve all columns, not just the ones tested (#475). (#588, #591)

  • The brief argument of validation functions now also supports {glue} syntax. (#587)

  • Validation step briefs correctly recycle to match expanded steps. (#564)

  • Performed extensive refactoring of internal dplyr and ggplot2 code. (#579, #582, #583, @olivroy)

  • Rebuilt translation table and added support for ordered factors in scan_data(). (#580, @olivroy)

  • Modernized test infrastructure and removed deprecated testthat features. (#577, @olivroy)

  • Added a new Get Started vignette. (#605, @hfrick)

  • The README.md has undergone some refinements. (#618)
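
The na_pass change to col_vals_expr() described above can be sketched as follows. This is a minimal illustration, not taken from the release itself: the data frame `tbl` and the object names `strict` and `lenient` are hypothetical. Going by the note above, the first agent should count the NA row as a failure and the second should let it pass.

```r
library(pointblank)

# Hypothetical table: `a` is NA in the third row, so `a < b` evaluates to NA there
tbl <- data.frame(a = c(1, 2, NA), b = c(2, 3, 4))

# na_pass = FALSE is supplied explicitly, which also avoids the safeguard warning
strict <-
  create_agent(tbl = tbl) |>
  col_vals_expr(expr = ~ a < b, na_pass = FALSE) |>
  interrogate(progress = FALSE)

# na_pass = TRUE treats NA results as passing test units
lenient <-
  create_agent(tbl = tbl) |>
  col_vals_expr(expr = ~ a < b, na_pass = TRUE) |>
  interrogate(progress = FALSE)

# Compare the overall outcome of the two interrogations
all_passed(strict)
all_passed(lenient)
```

Setting na_pass explicitly in both calls makes the intent visible to readers of the validation plan, whichever behavior is wanted.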

v0.12.2

23 Oct 13:58
a37333e

This release provides a few minor improvements along with many bug fixes.

  • New argument extract_tbl_checked added to interrogate(). When FALSE, the $tbl_checked column from the validation set will be dropped before returning the agent. This may be helpful in reducing object size for large agents (#542). (#554)

  • The new argument na_rm in snip_list() suppresses any NA values so that they won't be included in the snippet's list of items (#547). (#556)

  • Improved readability of error messages rendered as tooltips in the agent report. (#543)

  • col_vals_expr() shows used columns in the agent report when interrogated. (#570)

  • Improved the matching of rows between agent$validation_step and the rows of the agent report (#563). (#565)

  • Functions accepting ... now use rlang::list2(), enabling dynamic dots. For example, a multiagent can now be constructed from a list() of agents using create_multiagent(!!!list_of_agents) (#552). (#553)

  • Fixed bug with non-standard column names in some validation functions (#545, #546). (#555)

  • Fixed a regression in col_vals_*() functions, where vars("col") was evaluating to the string "col". Behavior of vars("col") is now aligned back with vars(col) - both evaluate to the column name col. (#535)

  • Problems arising from comparing columns to a value of different class (for example, comparing a datetime column to a date value Sys.Date() instead of another datetime value Sys.time()) are now signalled appropriately at interrogate() (#536, #537). (#539)

  • Fixed bug in has_columns() failing to detect non-existing columns when supplied as a character vector. (#540)

  • Replace uses of crayon::make_style() with cli::make_ansi_style(), removing the crayon dependency. (#559, thanks @olivroy!)

  • Use rlang::check_installed() to perform checks of optional package installs. (#559, @olivroy)

  • Modernized CI workflows with dedicated linting action. (#560, @olivroy)

  • Avoid unwanted equation formatting in agent report arising from arbitrary "$" characters (#561). (#562)
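
The dynamic dots support mentioned above can be sketched like this, assuming the built-in small_table dataset; the two validation steps and the names `agent_1`, `agent_2`, and `multiagent` are chosen only for illustration.

```r
library(pointblank)

# Two interrogated agents on the built-in `small_table` dataset
agent_1 <-
  create_agent(tbl = small_table) |>
  col_vals_gt(columns = a, value = 0) |>
  interrogate(progress = FALSE)

agent_2 <-
  create_agent(tbl = small_table) |>
  col_vals_not_null(columns = d) |>
  interrogate(progress = FALSE)

# Agents collected in a plain list, e.g. as produced by lapply()
list_of_agents <- list(agent_1, agent_2)

# `!!!` splices the list elements into `...`
multiagent <- create_multiagent(!!!list_of_agents)
```

Splicing is mainly useful when the number of agents is not known in advance, since the call no longer has to name each agent individually.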

v0.12.1

25 Mar 19:26
46c9ff5

  • Ensured that the column string is a symbol before constructing the expression for the col_vals_*() functions.

  • No longer resolve columns with tidyselect when the target table cannot be materialized.

  • Relaxed tests on tidyselect error messages.

v0.12.0

01 Mar 13:26
e131437

New features

  • Complete {tidyselect} support for the columns argument of all validation functions, as well as in has_columns() and info_columns(). The columns argument can now take familiar column-selection expressions as one would use inside dplyr::select(). This also begins a process of deprecation:

    • columns = vars(...) will continue to work, but c() now supersedes vars().
    • If passing an external vector of column names, it should be wrapped in all_of().
  • The label argument of validation functions now exposes the following string variables via {glue} syntax:

    • "{.step}": The validation step name
    • "{.col}": The current column name
    • "{.seg_col}": The current segment's column name
    • "{.seg_val}": The current segment's value/group

    These dynamic values may be useful for validations that get expanded into multiple steps.

  • interrogate() gains two new options for printing progress in the console output:

    • progress: Whether interrogation progress should be printed to the console (TRUE for interactive sessions, same as before)
    • show_step_label: Whether each validation step's label value should be printed alongside the progress.
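
The three features above can be combined in one short sketch. It assumes the built-in small_table dataset; the vector `cols_to_check` and the label text are hypothetical. all_of() is called with an explicit tidyselect:: prefix so the example does not depend on which helpers are attached.

```r
library(pointblank)

# External character vector of column names (hypothetical selection)
cols_to_check <- c("a", "d")

agent <-
  create_agent(tbl = small_table) |>
  col_vals_not_null(
    # External vectors are wrapped in all_of()
    columns = tidyselect::all_of(cols_to_check),
    # {glue} variables are filled in for each expanded step
    label = "Step `{.step}` on column `{.col}`"
  ) |>
  interrogate(progress = TRUE, show_step_label = TRUE)

# The single call expands into one validation step per selected column
agent$validation_set$label
```

Because the step is expanded per column, a dynamic label keeps each row of the report distinguishable without writing one call per column.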

Minor improvements and bug fixes

  • Fixes issue with rendering reports in Quarto HTML documents.

  • When no columns are returned from a {tidyselect} expression in columns, the agent's report now displays the originally supplied expression instead of a blank (e.g., in create_agent(small_table) |> col_vals_null(matches("z"))).

  • Fixes issue with the hashing implementation to improve performance and alignment of validation steps in the multiagent.

v0.11.4

25 Apr 15:15
47834e5

  • Fixes issue with gt 0.9.0 compatibility.

v0.11.3

09 Feb 21:56
5e3e60a

  • Fixes issue with tables not rendering due to interaction with the gt package.

v0.11.2

09 Oct 17:00
6d328d3

  • Internal changes were made to ensure compatibility with an in-development version of R.

v0.11.1

06 Sep 15:40
d87d55b

  • Updated all help files to pass HTML validation.

v0.11.0

14 Jul 02:50
b056ce3

New features

  • The row_count_match() function can now match the count of rows in the target table to a literal value (in addition to comparing row counts to a secondary table).

  • The analogous col_count_match() function was added to compare column counts in the target table to those of a secondary table, or to match on a literal value.

  • Substitution syntax has been added to the tbl_store() function with {{ <name> }}. This is a great way to make table-prep more concise, readable, and less prone to errors.

  • The get_informant_report() function has been enhanced with more width options. Aside from the "standard" and "small" sizes, we can now supply any pixel- or percent-based width to precisely size the report.

  • Added support for validating data in BigQuery tables.
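
Matching row and column counts against literal values, as described above, might look like the following sketch. It assumes the built-in small_table dataset, which has 13 rows and 8 columns; the object name `agent` is illustrative.

```r
library(pointblank)

agent <-
  create_agent(tbl = small_table) |>
  # Literal row count expected in the target table
  row_count_match(count = 13) |>
  # Literal column count expected in the target table
  col_count_match(count = 8) |>
  interrogate()

# TRUE only if every validation step had no failing test units
all_passed(agent)
```

Literal counts suit tables whose shape is fixed by contract; a comparison table remains the better fit when the expected shape is defined by another dataset.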

Documentation

  • All functions in the package now have better usage examples.

v0.10.0

23 Jan 22:09
4ef8d6b

New features

  • The new function row_count_match() (plus expect_row_count_match() and test_row_count_match()) checks that row counts match exactly across two tables (the target table and a comparison table of your choosing). It works equally well for local tables and for database and Spark tables.

  • The new tbl_match() function (along with expect_tbl_match() and test_tbl_match()) checks for an exact matching of the target table with a comparison table. It will check for a strict match on table schemas, on equivalent row counts, and then exact matches on cell values across the two tables.

Minor improvements and bug fixes

  • The set_tbl() function was given the tbl_name and label arguments to provide an opportunity to set metadata on the new target table.

  • Support for mssql tables has been restored and works exceedingly well for the majority of validation functions (the few that are incompatible provide messaging about not being supported).

Documentation

  • All functions in the package now have usage examples.

  • An RStudio Cloud project has been prepared with .Rmd files that contain explainers and runnable examples for each function in the package. Look at the project README for a link to the project.

Breaking changes

  • The read_fn argument in create_agent() and create_informant() has been deprecated in favor of an enhanced tbl argument. Now, we can supply a variety of inputs to tbl for associating a target table to an agent or an informant. With tbl, it's now possible to provide a table (e.g., data.frame, tbl_df, tbl_dbi, tbl_spark, etc.), an expression (a table-prep formula or a function) to read in the table only at interrogation time, or a table source expression to get table preparations from a table store (as an in-memory object or as defined in a YAML file).

  • The set_read_fn(), remove_read_fn(), and remove_tbl() functions were removed since the read_fn argument has been deprecated (and there's virtually no need to remove a table from an object with remove_tbl() now).
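
As a rough sketch of the enhanced tbl argument described above, a table-prep formula can stand in for what read_fn used to do. The example assumes the built-in small_table dataset; the tbl_name and label values are hypothetical.

```r
library(pointblank)

# Table-prep formula: the right-hand side is only evaluated to obtain
# the table when interrogate() runs
agent <-
  create_agent(
    tbl = ~ small_table,
    tbl_name = "small_table",
    label = "Table read at interrogation time"
  ) |>
  col_vals_gt(columns = a, value = 0) |>
  interrogate()

all_passed(agent)
```

In practice the right-hand side of the formula would usually be a call that reads from a file or database, so the agent can be stored and re-run against fresh data.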