Skip to content

GH-39138: [R] Fix implicit conversion warnings #39250

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 13 commits into from
Dec 22, 2023

Conversation

paleolimbot
Copy link
Member

@paleolimbot paleolimbot commented Dec 16, 2023

Rationale for this change

We have failing CRAN checks because this warning occurs on one check machine.

What changes are included in this PR?

Implicit integer casts are made explicit and/or variable declarations were fixed so that fewer implicit integer casts were performed. Fully solving the warnings also requires r-lib/cpp11#349 since some errors occur in those headers.

Are these changes tested?

This particular test we can't do on CI because the MacOS runner we have doesn't have a new enough clang to support the requisite -W flags. I tested this locally by adding PKG_CXXFLAGS=-Wconversion -Wno-sign-conversion -Wsign-compare -Werror to Makevars.in.

Are there any user-facing changes?

No

Copy link

⚠️ GitHub issue #39138 has been automatically assigned in GitHub to PR creator.

@paleolimbot
Copy link
Member Author

@github-actions crossbow submit test-r-install-local

Copy link

Revision: 814e712

Submitted crossbow builds: ursacomputing/crossbow @ actions-ba1f272e1d

Task Status
test-r-install-local GitHub Actions

@paleolimbot
Copy link
Member Author

@github-actions crossbow submit test-r-install-local

Copy link

Revision: 97d3713

Submitted crossbow builds: ursacomputing/crossbow @ actions-523036ada2

Task Status
test-r-install-local GitHub Actions

@paleolimbot
Copy link
Member Author

@github-actions crossbow submit --group r

Copy link

Revision: d3b8acc

Submitted crossbow builds: ursacomputing/crossbow @ actions-bfb08e2c71

Task Status
conda-linux-aarch64-cpu-r42 Azure
conda-linux-aarch64-cpu-r43 Azure
conda-linux-x64-cpu-r42 Azure
conda-linux-x64-cpu-r43 Azure
conda-osx-arm64-cpu-r42 Azure
conda-osx-arm64-cpu-r43 Azure
conda-osx-x64-cpu-r42 Azure
conda-osx-x64-cpu-r43 Azure
conda-win-x64-cpu-r41 Azure
r-binary-packages GitHub Actions
test-fedora-r-clang-sanitizer Azure
test-r-arrow-backwards-compatibility GitHub Actions
test-r-depsource-bundled Azure
test-r-depsource-system GitHub Actions
test-r-dev-duckdb GitHub Actions
test-r-devdocs GitHub Actions
test-r-gcc-11 GitHub Actions
test-r-gcc-12 GitHub Actions
test-r-install-local GitHub Actions
test-r-install-local-minsizerel GitHub Actions
test-r-library-r-base-latest Azure
test-r-linux-as-cran GitHub Actions
test-r-linux-rchk GitHub Actions
test-r-linux-valgrind Azure
test-r-minimal-build Azure
test-r-offline-maximal GitHub Actions
test-r-offline-minimal Azure
test-r-rhub-debian-gcc-devel-lto-latest Azure
test-r-rhub-debian-gcc-release-custom-ccache Azure
test-r-rhub-ubuntu-gcc-release-latest Azure
test-r-rstudio-r-base-4.1-opensuse153 Azure
test-r-rstudio-r-base-4.2-centos7-devtoolset-8 Azure
test-r-rstudio-r-base-4.2-focal Azure
test-r-ubuntu-22.04 GitHub Actions
test-r-versions GitHub Actions
test-ubuntu-r-sanitizer Azure

@assignUser
Copy link
Member

@ursabot please benchmark

@ursabot
Copy link

ursabot commented Dec 19, 2023

Benchmark runs are scheduled for commit d3b8acc. Watch https://buildkite.com/apache-arrow and https://conbench.ursa.dev for updates. A comment will be posted here when the runs are complete.

@assignUser
Copy link
Member

The benchmark that is through seems to show some regressions? But I also don't have practice in interpreting these: https://conbench.ursa.dev/compare/runs/e13bb0a5533349c094f88b39316be8e3...0d58238a60974b3facaf13de17fad7f7/

Copy link

Thanks for your patience. Conbench analyzed the 6 benchmarking runs that have been run so far on PR commit d3b8acc.

There was 1 benchmark result indicating a performance regression:

The full Conbench report has more details.

@paleolimbot
Copy link
Member Author

I don't see any R-related regressions? (Even if there were, it would have been because a check was added that increases safety, although I don't think I added any additional out-of-bounds checks in places with tight loops).

@danepitkin
Copy link
Member

Overall LGTM! (Caveat: I haven't reviewed much C++ code recently)

@danepitkin
Copy link
Member

There is a lot of downcasting taking place in Arrow R. Any idea if this was an intentional decision due to some 32bit limitation?

@paleolimbot
Copy link
Member Author

There is a lot of downcasting taking place in Arrow R

By default, I think you would be hard pressed to find loss of precision happening. The main place that this could happen without explicit user intervention is (1) when converting timestamps to R with subsecond precision (limitation of R, we store timestamps as seconds with double precision) and (2) factors with more then INT_MAX elements in a dictionary (limitation of R, factors are int under the hood). Other downcasting that happens requires intervention from the user (i.e., Array$create(1.2345, int32()) will truncate 1.234).

I think we can do better in all cases, but we still wouldn't do that checking or erroring at the element level (i.e., where the static_cast<>() happens)...we'd want to do something like if (AnyConversionMayBeLossy(<big long thing>)) return Status::Invalid(...) before we get to that point.

@assignUser
Copy link
Member

I don't see any R-related regressions?

Sorry I forgot which PR this is 🤦‍♂️

r/src/altrep.cpp Outdated
@@ -613,11 +615,14 @@ struct AltrepFactor : public AltrepVectorBase<AltrepFactor> {
case Type::INT32:
return indices->data()->GetValues<int32_t>(1)[j] + 1;
case Type::UINT32:
return indices->data()->GetValues<uint32_t>(1)[j] + 1;
// TODO: check index?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should consider changing the return type of this function to int64_t instead.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And you can postpone all the refactoring by renaming this function and adding a wrapper with this signature with the sole purpose of converting the int64_t returned by this modified function to int. Then you have to write the casts and check in a single place.

r/src/altrep.cpp Outdated
@@ -718,7 +723,8 @@ struct AltrepFactor : public AltrepVectorBase<AltrepFactor> {

VisitArraySpanInline<Type>(
*array->data(),
/*valid_func=*/[&](index_type index) { *out++ = transpose(index) + 1; },
/*valid_func=*/
[&](index_type index) { *out++ = static_cast<int>(transpose(index) + 1); },
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could out be changed to be an int64_t?

@@ -802,7 +808,8 @@ struct AltrepVectorString : public AltrepVectorBase<AltrepVectorString<Type>> {
}

nul_was_stripped_ = true;
return Rf_mkCharLenCE(stripped_string_.data(), stripped_len, CE_UTF8);
return Rf_mkCharLenCE(stripped_string_.data(), static_cast<int>(stripped_len),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here, a DCHECK_LE(stripped_len, std::numeric_limits<int>::max()) would be useful since there is no way to change what Rf_mkCharLenCE receives.

@paleolimbot
Copy link
Member Author

Thank you @felipecrv and @danepitkin for the feedback (and sorry to everybody for taking a few days to circle back here...I wanted to make sure I addressed the comments properly!).

I added a ARROW_R_DCHECK() guarded by a special ARROW_R_DEBUG...we don't have any precedent in the package so far for using the existing DCHECK macros and I'm a little worried it will add cerr or abort symbols. I'm probably the only person that will have it enabled but at least it gives us a place to put some of these checks.

@paleolimbot paleolimbot merged commit 7b71156 into apache:main Dec 22, 2023
@paleolimbot paleolimbot removed the awaiting committer review Awaiting committer review label Dec 22, 2023
assignUser pushed a commit that referenced this pull request Dec 23, 2023
### Rationale for this change

We have failing CRAN checks because this warning occurs on one check machine.

### What changes are included in this PR?

Implicit integer casts are made explicit and/or variable declarations were fixed so that fewer implicit integer casts were performed. Fully solving the warnings also requires r-lib/cpp11#349 since some errors occur in those headers.

### Are these changes tested?

This particular test we can't do on CI because the MacOS runner we have doesn't have a new enough `clang` to support the requisite `-W` flags. I tested this locally by adding `PKG_CXXFLAGS=-Wconversion -Wno-sign-conversion -Wsign-compare -Werror` to `Makevars.in`.

### Are there any user-facing changes?

No
* Closes: #39138

Authored-by: Dewey Dunnington <[email protected]>
Signed-off-by: Dewey Dunnington <[email protected]>
Copy link

After merging your PR, Conbench analyzed the 6 benchmarking runs that have been run so far on merge-commit 7b71156.

There were no benchmark performance regressions. 🎉

The full Conbench report has more details. It also includes information about 10 possible false positives for unstable benchmarks that are known to sometimes produce them.

clayburn pushed a commit to clayburn/arrow that referenced this pull request Jan 23, 2024
### Rationale for this change

We have failing CRAN checks because this warning occurs on one check machine.

### What changes are included in this PR?

Implicit integer casts are made explicit and/or variable declarations were fixed so that fewer implicit integer casts were performed. Fully solving the warnings also requires r-lib/cpp11#349 since some errors occur in those headers.

### Are these changes tested?

This particular test we can't do on CI because the MacOS runner we have doesn't have a new enough `clang` to support the requisite `-W` flags. I tested this locally by adding `PKG_CXXFLAGS=-Wconversion -Wno-sign-conversion -Wsign-compare -Werror` to `Makevars.in`.

### Are there any user-facing changes?

No
* Closes: apache#39138

Authored-by: Dewey Dunnington <[email protected]>
Signed-off-by: Dewey Dunnington <[email protected]>
dgreiss pushed a commit to dgreiss/arrow that referenced this pull request Feb 19, 2024
### Rationale for this change

We have failing CRAN checks because this warning occurs on one check machine.

### What changes are included in this PR?

Implicit integer casts are made explicit and/or variable declarations were fixed so that fewer implicit integer casts were performed. Fully solving the warnings also requires r-lib/cpp11#349 since some errors occur in those headers.

### Are these changes tested?

This particular test we can't do on CI because the MacOS runner we have doesn't have a new enough `clang` to support the requisite `-W` flags. I tested this locally by adding `PKG_CXXFLAGS=-Wconversion -Wno-sign-conversion -Wsign-compare -Werror` to `Makevars.in`.

### Are there any user-facing changes?

No
* Closes: apache#39138

Authored-by: Dewey Dunnington <[email protected]>
Signed-off-by: Dewey Dunnington <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[R] Compile with -Wconversion on clang15 results in compiler warnings
5 participants