Skip to content

fix: NULL in mutate() may cause error and incorrect columns order#356

Open
Yousa-Mirage wants to merge 11 commits into
etiennebacher:mainfrom
Yousa-Mirage:fix/mutate-null-355
Open

fix: NULL in mutate() may cause error and incorrect columns order#356
Yousa-Mirage wants to merge 11 commits into
etiennebacher:mainfrom
Yousa-Mirage:fix/mutate-null-355

Conversation

@Yousa-Mirage
Copy link
Copy Markdown
Contributor

fix #355

Copy link
Copy Markdown
Owner

@etiennebacher etiennebacher left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still need to review the code in itself because it becomes really complicated to understand, but I already have a few comments to address below.

Comment thread tests/testthat/test-mutate.R
Comment on lines +213 to +226
expect_equal(
test_pl |>
mutate(
Species = Sepal.Width + 1,
foo = mean(Sepal.Length),
.by = Species,
),
test_df |>
mutate(
Species = Sepal.Width + 1,
foo = mean(Sepal.Length),
.by = Species,
)
)
Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does this test add compared to the existing tests?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is just a regression prevention test that tests the behavior when the grouping variable is modified. You can delete it if you feel it is not necessary. The test in that comment below was really unnecessary and I've deleted it.

Comment thread tests/testthat/test-mutate.R Outdated
Comment on lines +286 to +295
expect_equal(
test_pl |>
group_by(Species) |>
mutate(Species = Sepal.Width + 1, foo = mean(Sepal.Length)) |>
ungroup(),
test_df |>
group_by(Species) |>
mutate(Species = Sepal.Width + 1, foo = mean(Sepal.Length)) |>
ungroup()
)
Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here

Comment thread NEWS.md Outdated
@Yousa-Mirage
Copy link
Copy Markdown
Contributor Author

Yousa-Mirage commented May 17, 2026

I agree with you, expecially for the group part. I spent time to understand the code generated by Codex. It uses a temp col to record the grouping columns. Maybe there are better methods to aovid the modify from affecting grouping.

Yousa-Mirage and others added 4 commits May 17, 2026 20:30
Co-authored-by: Etienne Bacher <52219252+etiennebacher@users.noreply.github.com>
Co-authored-by: Etienne Bacher <52219252+etiennebacher@users.noreply.github.com>
@Yousa-Mirage
Copy link
Copy Markdown
Contributor Author

Yousa-Mirage commented May 17, 2026

Well, the test failure just now was becase, names(.data) triggers the collect_schema() of LazyFrame. I switched to maintaining a list of column names in R to avoid this issue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

NULL in mutate() may be not consistent with dplyr

2 participants