ard_compare draft #437 #527

Open
malanbos wants to merge 24 commits into insightsengineering:main from malanbos:main

Collaborator

@malanbos malanbos commented Dec 1, 2025

What changes are proposed in this pull request?

New function to compare two ARDs: ard_compare

Fixes #437, @malanbos

Adds the top-level ard_compare function, along with two supporting scripts, ard_compare_helpers.R and check_environment.R, that modularize it.


Pre-review Checklist (if item does not apply, mark it as complete)

  • All GitHub Action workflows pass with a ✅
  • PR branch has pulled the most recent updates from master branch: usethis::pr_merge_main()
  • If a bug was fixed, a unit test was added.
  • Code coverage is suitable for any new functions/features (generally, 100% coverage for new code): devtools::test_coverage()
  • Request a reviewer

Reviewer Checklist (if item does not apply, mark it as complete)

  • If a bug was fixed, a unit test was added.
  • Run pkgdown::build_site(). Check the R console for errors, and review the rendered website.
  • Code coverage is suitable for any new functions/features: devtools::test_coverage()

When the branch is ready to be merged:

  • Update NEWS.md with the changes from this pull request under the heading "# cards (development version)". If there is an issue associated with the pull request, reference it in parentheses at the end of the update (see NEWS.md for examples).
  • All GitHub Action workflows pass with a ✅
  • Approve Pull Request
  • Merge the PR. Please use "Squash and merge" or "Rebase and merge".

Optional Reverse Dependency Checks:

Install checked with pak::pak("Genentech/checked") or pak::pak("checked")

# Check dev versions of `cardx`, `gtsummary`, and `tfrmt` which are in the `ddsjoberg` R Universe
Rscript -e "options(checked.check_envvars = c(NOT_CRAN = TRUE)); checked::check_rev_deps(path = '.', n = parallel::detectCores() - 2L, repos = c('https://ddsjoberg.r-universe.dev', 'https://cloud.r-project.org'))"

# Check CRAN reverse dependencies but run tests skipped on CRAN
Rscript -e "options(checked.check_envvars = c(NOT_CRAN = TRUE)); checked::check_rev_deps(path = '.', n = parallel::detectCores() - 2, repos = 'https://cloud.r-project.org')"

# Check CRAN reverse dependencies in a CRAN-like environment
Rscript -e "options(checked.check_envvars = c(NOT_CRAN = FALSE), checked.check_build_args = '--as-cran'); checked::check_rev_deps(path = '.', n = parallel::detectCores() - 2, repos = 'https://cloud.r-project.org')"

@malanbos malanbos requested a review from ddsjoberg December 1, 2025 15:24
Collaborator

@ddsjoberg ddsjoberg left a comment

Thank you @malanbos for this!

For now, let's skip env handling. I think it's more complex than we need. We can revisit it in the future. Let me know if you'd like to discuss.

#' (based on key columns)
#' - `rows_in_y_not_x`: data frame of rows present in `y` but not in `x`
#' (based on key columns)
#' - `compare`: a named list where each element is a data frame containing
Contributor

Maybe I would call this diff?

Collaborator Author

Do you mean renaming "compare" to "diff"? Yes, that's fine with me.
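
For readers following this thread, the rows_in_x_not_y and rows_in_y_not_x elements described in the documentation above amount to an anti-join on the key columns. A minimal base R sketch, assuming made-up data and a hypothetical helper rows_not_in() (neither is taken from the PR, which may implement this differently):

```r
# Hypothetical helper (not from the PR): rows of `x` whose key combination
# does not occur in `y`.
rows_not_in <- function(x, y, keys) {
  # Build one string per row from the key columns so rows can be matched.
  key_x <- do.call(paste, c(unname(as.list(x[keys])), sep = "\r"))
  key_y <- do.call(paste, c(unname(as.list(y[keys])), sep = "\r"))
  x[!key_x %in% key_y, , drop = FALSE]
}

# Made-up ARD-like data frames, for illustration only.
x <- data.frame(
  variable = c("AGE", "AGE", "AGE"),
  stat_name = c("mean", "median", "sd"),
  stat = c(75.1, 77, 8.2)
)
y <- data.frame(
  variable = c("AGE", "AGE", "AGE"),
  stat_name = c("mean", "median", "max"),
  stat = c(76.1, 78, 89)
)

keys <- c("variable", "stat_name")
rows_in_x_not_y <- rows_not_in(x, y, keys) # the "sd" row
rows_in_y_not_x <- rows_not_in(y, x, keys) # the "max" row
```

Only the key columns decide membership here; the stat values themselves are compared separately, on the rows that match.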

#'
#' compare_ard(ard_base, ard_modified)$compare$stat
#'
compare_ard <- function(x,
Contributor

I really like the construction and concept of the main function. The helper functions are a bit of a Daedalus (a labyrinth), but I will take a closer look during the week ;)

Collaborator Author

Sure! And I learnt a new word here :)

R/compare_ard.R Outdated
Comment on lines +32 to +36
#' @examples
#' ard_base <- ard_summary(ADSL, variables = AGE)
#' ard_modified <- ard_summary(dplyr::mutate(ADSL, AGE = AGE + 1), variables = AGE)
#'
#' compare_ard(ard_base, ard_modified)$compare$stat
Contributor

Maybe I would make the examples more complex (to show groups, etc.).

Collaborator Author

I'll add some more complex ones.

Comment on lines +217 to +223
# perform inner join to compare only matching rows
comparison <- dplyr::inner_join(
x_selected,
y_selected,
by = keys,
suffix = c(".x", ".y")
)
Contributor

For some reason this does not drop the "card" class, producing a truncated output (some columns are not shown). Should we keep this behavior, or should we drop the card class? Maybe we could define a new card_diff class?

$stat
{cards} data frame: 6 x 4
  variable stat_name
1      AGE      mean
2      AGE    median
3      AGE       p25
4      AGE       p75
5      AGE       min
6      AGE       max
ℹ 2 more variables: stat.x, stat.y

Collaborator Author

Good point, I accepted the suggestion below.
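
A rough base R sketch of the two ideas in this thread: an inner join on the key columns with .x/.y suffixes, followed by dropping the "card" class so that the default data frame print method shows every column. The data are made up, base merge() stands in for dplyr::inner_join(), and the class is attached by hand purely for illustration; this is not necessarily the suggestion that was accepted in the PR.

```r
# Made-up ARD-like inputs, for illustration only.
x <- data.frame(variable = "AGE", stat_name = c("mean", "median"), stat = c(75.1, 77))
y <- data.frame(variable = "AGE", stat_name = c("mean", "median"), stat = c(76.1, 78))
keys <- c("variable", "stat_name")

# Inner join: only key combinations present in both inputs are kept, and the
# clashing `stat` columns become `stat.x` and `stat.y`.
comparison <- merge(x, y, by = keys, suffixes = c(".x", ".y"))

# Assumption for this sketch: simulate a result that still carries the "card" class.
class(comparison) <- c("card", class(comparison))

# Drop only the "card" class and keep the remaining classes untouched.
class(comparison) <- setdiff(class(comparison), "card")

# With the class removed, the plain data frame print method is used.
print(comparison)
```

Whether to strip the class or introduce a dedicated class such as card_diff is the design question raised above; the sketch only shows the first option.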

Contributor

@Melkiades Melkiades left a comment

Amazing work!! I really like it; the output is very intuitive and efficient in showing differences. I added a minor comment on the final compare output, which is hiding columns, and I would add more examples, but for me the code is almost complete. I was wondering whether we might need a +/- diff, like testthat snapshots, as an optional diff output... Don't know; for now the output is pretty easy to understand.

I still need to test this with more complex differences (e.g. I wonder whether you resolve the fmt before getting into the comparison; if not, we should), but I will wait for the additional examples before trying ;)

malanbos and others added 6 commits February 9, 2026 13:51
Co-authored-by: Davide Garolini <dgarolini@gmail.com>
Signed-off-by: Malan <64360731+malanbos@users.noreply.github.com>
Co-authored-by: Davide Garolini <dgarolini@gmail.com>
Signed-off-by: Malan <64360731+malanbos@users.noreply.github.com>
Co-authored-by: Davide Garolini <dgarolini@gmail.com>
Signed-off-by: Malan <64360731+malanbos@users.noreply.github.com>
Comment on lines +466 to +467
matches <- mapply(
identical,
Contributor

As discussed -> all.equal

malanbos and others added 6 commits February 23, 2026 15:37
…ibutes

Replaces identical() with all.equal() for element-wise value comparison,
exposing tolerance and check.attributes parameters. Mismatch data frames
now include a difference column with all.equal() descriptions. Example
updated to use by = ARM grouping.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@malanbos
Collaborator Author

As discussed on 9 Feb, I made two updates:

  1. Replaced "identical" with "all.equal". Also added parameters to compare_ard:

    • tolerance
    • check.attributes (default = TRUE)
  2. Added a more complex example (with grouping by treatment).
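
To illustrate why the switch matters, here is a minimal base R sketch of an element-wise comparison with all.equal() that exposes tolerance and check.attributes and records a description for each mismatch. The helper compare_values() and the data are hypothetical, written for this note; they are not the PR's implementation.

```r
# Hypothetical helper (not the PR's code): compare two list columns element by
# element with all.equal() and keep its description for every mismatch.
compare_values <- function(x, y,
                           tolerance = sqrt(.Machine$double.eps),
                           check.attributes = TRUE) {
  result <- mapply(
    function(a, b) {
      all.equal(a, b, tolerance = tolerance, check.attributes = check.attributes)
    },
    x, y,
    SIMPLIFY = FALSE
  )
  # all.equal() returns TRUE on a match and a character description otherwise.
  is_match <- vapply(result, isTRUE, logical(1))
  data.frame(
    match = is_match,
    difference = vapply(
      result,
      function(r) if (isTRUE(r)) NA_character_ else paste(r, collapse = "; "),
      character(1)
    )
  )
}

# Made-up statistics: a floating-point near-match, a real difference, and a
# value that differs only by a names attribute.
stat_x <- list(0.1 + 0.2, 75.1, c(a = 1))
stat_y <- list(0.3, 76.1, 1)

identical(stat_x[[1]], stat_y[[1]]) # FALSE: exact comparison trips on rounding

out <- compare_values(stat_x, stat_y)
out_no_attr <- compare_values(stat_x, stat_y, check.attributes = FALSE)
out_loose <- compare_values(stat_x, stat_y, tolerance = 0.05)
```

With the defaults, only the first pair matches; turning off check.attributes also accepts the third pair, and a looser tolerance accepts the second.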

3 participants