Skip to content

Migrate to per-operation grouping with .by and add copilot instructions#24

Merged
d-morrison merged 7 commits into
more-plotfrom
copilot/sub-pr-11-again
Dec 23, 2025
Merged

Migrate to per-operation grouping with .by and add copilot instructions#24
d-morrison merged 7 commits into
more-plotfrom
copilot/sub-pr-11-again

Conversation

Copilot AI commented Dec 23, 2025

Copy link
Copy Markdown
Contributor

Addresses review feedback to switch from group_by() + ungroup() pattern to per-operation grouping with .by parameter throughout the codebase, and document this preference.

Changes

  • Fixed bug in update_possible_subj_data.R: group_by("ID") used string literal instead of column reference
  • Converted 6 instances in fit_joint_model.R: Replaced group_by() |> summarize(.groups = "drop") and group_by() |> mutate() |> ungroup() with .by parameter
  • Corrected .by syntax: All .by parameters use tidy selection with quoted strings (.by = c("ID", "Group")) to avoid R CMD check warnings about undefined global variables
  • Removed unused imports: group_by, ungroup, group_by_at no longer needed
  • Created .github/copilot-instructions.md: Documents preference for per-operation grouping with examples and comprehensive explanation of tidy selection vs. data masking
  • Merged with more-plot branch: Resolved conflicts by combining comprehensive project documentation from more-plot with new dplyr grouping conventions

Before/After

# Before
data |>
  dplyr::group_by("ID") |>  # Bug: string literal
  dplyr::mutate(centered = .data$value - mean(.data$value)) |>
  dplyr::ungroup()

# After  
data |>
  dplyr::mutate(
    .by = "ID",  # Tidy selection: quoted string (avoids R CMD check warnings)
    centered = .data$value - mean(.data$value)  # Data masking: .data$ pronoun
  )

Tidy Selection vs. Data Masking

The .by parameter uses tidy selection (like select(), rename()) which supports both bare column names and quoted strings. Quoted strings are preferred to avoid R CMD check warnings about undefined global variables. The .data$ pronoun is for data masking contexts (like mutate(), filter() expressions), not tidy selection. The copilot instructions now include a comprehensive section explaining this distinction.

Conflict Resolution

The .github/copilot-instructions.md file now includes both the comprehensive project documentation from the more-plot branch (project overview, technology stack, development setup, build/test commands) and the new dplyr grouping conventions, creating a unified instruction file for the repository.

References:


✨ Let Copilot coding agent set things up for you — coding agent works faster and does higher quality work when set up for your repo.

Copilot AI and others added 2 commits December 23, 2025 00:31
Co-authored-by: d-morrison <2474437+d-morrison@users.noreply.github.com>
Co-authored-by: d-morrison <2474437+d-morrison@users.noreply.github.com>
Copilot AI changed the title [WIP] Update plot improvements and grouping preferences Migrate to per-operation grouping with .by and add copilot instructions Dec 23, 2025
Copilot AI requested a review from d-morrison December 23, 2025 00:35
@d-morrison

Copy link
Copy Markdown
Owner

@copilot resolve conflicts

Co-authored-by: d-morrison <2474437+d-morrison@users.noreply.github.com>

Copilot AI commented Dec 23, 2025

Copy link
Copy Markdown
Contributor Author

@copilot resolve conflicts

Conflicts resolved. Merged the more-plot branch into this PR, combining the comprehensive project documentation with the new dplyr grouping preferences. The .github/copilot-instructions.md file now includes both the full project overview and the per-operation grouping conventions.

Resolved in commit 8e4d860.

@d-morrison

Copy link
Copy Markdown
Owner

@copilot R/update_possible_subj_data.R:62 this .by statement should use tidy selection, not data masking; please add a note in the copilot instructions to differentiate between these two; see https://dplyr.tidyverse.org/articles/programming.html for details

Co-authored-by: d-morrison <2474437+d-morrison@users.noreply.github.com>

Copilot AI commented Dec 23, 2025

Copy link
Copy Markdown
Contributor Author

@copilot R/update_possible_subj_data.R:62 this .by statement should use tidy selection, not data masking; please add a note in the copilot instructions to differentiate between these two; see https://...

Fixed. The .by parameter uses tidy selection (bare column names) instead of data masking. Changed all instances from .by = c(.data$ID, .data$S) to .by = c(ID, S) throughout the codebase.

Also added a comprehensive "Tidy Selection vs. Data Masking" section to the copilot instructions explaining:

  • Tidy selection: used in select(), rename(), .by parameter → bare names
  • Data masking: used in mutate(), filter(), expression arguments → .data$ pronoun

Addressed in commit 74fe60f.

@d-morrison

Copy link
Copy Markdown
Owner

@copilot resolve merge conflicts and use quoted strings for tidy-select

Co-authored-by: d-morrison <2474437+d-morrison@users.noreply.github.com>

Copilot AI commented Dec 23, 2025

Copy link
Copy Markdown
Contributor Author

@copilot resolve merge conflicts and use quoted strings for tidy-select

Resolved. Merged the more-plot branch and updated all .by parameters to use quoted strings for tidy selection:

Changes:

  • .by = c(ID, S).by = c("ID", "S")
  • .by = ID.by = "ID"
  • .by = Stratum.by = "Stratum"

This avoids R CMD check warnings about undefined global variables while maintaining the correct tidy selection syntax (not data masking).

Updated copilot instructions to recommend quoted strings as the preferred approach for tidy selection in .by parameters.

Addressed in commit 1c35bff.

@d-morrison d-morrison marked this pull request as ready for review December 23, 2025 01:30
@d-morrison d-morrison merged commit 2751db6 into more-plot Dec 23, 2025
8 of 11 checks passed
@codecov

codecov Bot commented Dec 23, 2025

Copy link
Copy Markdown

Codecov Report

❌ Patch coverage is 0% with 11 lines in your changes missing coverage. Please review.

Files with missing lines Patch % Lines
R/fit_joint_model.R 0.00% 10 Missing ⚠️
R/update_possible_subj_data.R 0.00% 1 Missing ⚠️
Files with missing lines Coverage Δ
R/update_possible_subj_data.R 0.00% <0.00%> (ø)
R/fit_joint_model.R 0.00% <0.00%> (ø)
🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants