-
Notifications
You must be signed in to change notification settings - Fork 57
Reorganize Tests against R #906
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
…st stops complaining
pre-commit.ci autofix |
for more information, see https://pre-commit.ci
Hi @shapiromh, thanks so much for this! Looks good at first sight but I have to / will spend some more time on this tomorrow morning. Re your comments:
Yes, some of the docs do - the "fixest vs pyfixest" vignette depends on the R core dependencies. So we'd have to add all of these to the docs deps.
Yeah, I think it was an util I set up because R deps needed to be installed in the global R env and this should ensure that python could talk to the global R env? Though not 100% sure anymore. Let's see if we can make things run without (I would think so) and then I'd be happy to delete the function from the code base.
Hm, I think for the ivDiag tests, could we simply run them once, store results in a csv file, and then drop it as a dependency? I think this would be the general strategy for all other non-conda packages as well - we could provide a script that calls R and produces a csv with "R results", which we store and test against. This way, all results would be reproducible (though not perfectly) and users / testers would not have to install ivDiag, fwildclusterboot, wildrwolf, ritest etc? This should then also solve the issue you mention in your last point?
|
Sorry, I was asking if non core R dependencies (those not in conda) are used in the docs. I did up the docs environment to install R in the toml, but I probably messed up the github actions if any changes around that were needed.
ivDiag is the only one that caused me problems, but generally this strategy makes sense to me. I saw this was the approach already taken with some of the tests (and why some of the test codes never went down paths that called the R packages). I would think the only argument against is if any of these non-conda packages are under active development with known bugs , but then you or other core maintainers should probably hold the single source of truth on what other contributors should be matching. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me, except for two small comments - one data set is not available at runtime in the CI & the car package can be installed from conda directly. Though, maybe it is cleaner to ask users to download it manually as it is not part of the "core" R test dependencies?
@@ -17,6 +16,7 @@ def data(local=False): | |||
|
|||
|
|||
# function retrieved from Harvard Dataverse |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure if it is clear from the description in the pyproject toml, but the extended tests are generally turned off as they run for way too long (some of them more than 10 minutes).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, this was pulled because of a warning from pytest that marks on fixtures do nothing. If I didn't move the mark to the functions that may call "data", I should have.
r_test_requirements.R
Outdated
install.packages( | ||
c('fixest', 'broom','clubSandwich', 'did2s', 'wildrwolf', 'reticulate', 'ivDiag', 'stats', 'base', 'car'), | ||
c('did2s', 'reticulate', 'ivDiag', 'car'), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should also be able to install r-car via conda-forge.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, this is my miss. We may as well include this as a "core r" package too then, no?
@@ -33,6 +34,8 @@ def data(): | |||
return df_het | |||
|
|||
|
|||
@pytest.mark.skipif(import_check is False, reason="R package did2s not installed.") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is an excellent solution, big fan =)
@@ -355,6 +361,7 @@ def test_fully_interacted(unit, cluster): | |||
) | |||
|
|||
|
|||
@pytest.mark.against_r_core |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This might fail in the CI as I don't think the data is available on github? It's from the did R repo: https://github.com/bcallaway11/did/tree/master/data When I wrote the test, I did not want to think about how to load dta files into pandas, and I did not simply want to copy over the file as I wasn't sure about the license. did
uses GPL-2 which I think (?) does not allow reuse under any other license then GPL-2.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could add a check on the data and skip if decorator here as well?
The CI currently fails because we have to rebuild the pixi lock file after updating the pyproject toml. Will do this, one moment =) |
Now let's see if the CI runs - I recall that I struggled in the past: I've been installing r and packages from CRAN via r2u, which provides ubuntu binaries of R packages (so no need for manual installation!). But I recall that the conda env and the R installation provided by r2u had problems talking to each other; that's why I eventually settled for the solution to "install all R depcs via r2u and none via conda". I think that all "r core" tests will run, as all required deps can be found in the conda environment, but would assume that all r-extended tests will not find the dependencies installed into the global env (via r2u) and therefore be skipped. The solution here would be to ask r2u to install into the conda env, and I recall that I tried and failed 😅 |
Codecov ReportAttention: Patch coverage is
Flags with carried forward coverage won't be shown. Click here to find out more.
... and 16 files with indirect coverage changes 🚀 New features to boost your workflow:
|
@all-contributors please add @shapiromh for code |
I've put up a pull request to add @shapiromh! 🎉 |
Sorry, I got it wrong then! But I think no changes are needed, only fixest and broom need to be available (and we could drop this requirement as well).
Yes, I agree - imo there are two arguments for running code via rpy2 - the first is to make sure that any bugs fixed via dependencies are eventually caught; I'd also want to know in case For the tests where we currently follow this strategy, I think the main reason was time - checking |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think there are some changes here I could make and resubmit:
- (Maybe) Add a check on whether the data is available and skip the test if not
- Add cars as a core package
- Add back "extended" test marks I may have inadvertently removed.
I'm not sure what the conclusion was on the failing CI because of R...
@@ -17,6 +16,7 @@ def data(local=False): | |||
|
|||
|
|||
# function retrieved from Harvard Dataverse |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, this was pulled because of a warning from pytest that marks on fixtures do nothing. If I didn't move the mark to the functions that may call "data", I should have.
r_test_requirements.R
Outdated
install.packages( | ||
c('fixest', 'broom','clubSandwich', 'did2s', 'wildrwolf', 'reticulate', 'ivDiag', 'stats', 'base', 'car'), | ||
c('did2s', 'reticulate', 'ivDiag', 'car'), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, this is my miss. We may as well include this as a "core r" package too then, no?
@@ -355,6 +361,7 @@ def test_fully_interacted(unit, cluster): | |||
) | |||
|
|||
|
|||
@pytest.mark.against_r_core |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could add a check on the data and skip if decorator here as well?
All makes sense! If you address all three we can merge. On the CI- I think it's just a CI failure, these sometimes happen. I think it will go away by itself. |
On the 3:
|
pre-commit.ci autofix |
for more information, see https://pre-commit.ci
Looks very good and I can merge now. Thanks for your first PR to pyfixest @shapiromh! |
* Update project toml with conda-forge available R packages * Added new pytest marks for r_against_core and r_against_extended * Updated packages to install for extended R environment * Added test markers in pytest init and related pixi dev tasks * Moved the extended mark of a fixture to the relevant function so pytest stops complaining * Adjusted extended R test scripts to skip over modules not properly installed * Updated R requirements to correct install issues. * Added skip summary on tasks that may cover R tests * Updated the documentation around changes to R tests. * Added R as dependency to docs as well to avoid need for global install * UNTESTED: Updated git workflow actions to reflect new R install? * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * pixi lock * Made changes to make car a core R package * Added check on mpdata availability * Fix: forgot to label car tests as core instead of extended * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Co-authored-by: Matthew Shapiro <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Alexander Fischer <[email protected]>
This is a PR related to #828.
Summary of Code Changes:
Some Notes:
utils/set_rpy2_path.py
could be deprecated. The updates actually remove all references to this code, but then there is no built in way to extend the check R module libraries beyond the project install.