fix: noxfile.py & better align it against GitHub Actions CI #2358

Open
wants to merge 2 commits into
base: main
from

Conversation

camriddell
Member

What type of PR is this? (check all applicable)

  • 💾 Refactor
  • ✨ Feature
  • 🐛 Bug Fix
  • 🔧 Optimization
  • 📝 Documentation
  • ✅ Test
  • 🐳 Other

Checklist

  • Code follows style guide (ruff)
  • Tests added
  • Documented the changes

If you have comments or can explain your changes, please do so below

  1. fixes noxfile.py, which quoted arguments passed to a subprocess improperly, leading to errors.
  2. aligns the contents of noxfile.py with the current CI in GitHub Actions (.github/workflows/*)
    • pytest for coverage tests
    • minimum/pretty_old/not_so_old
    • nightly (isn't a perfect replication due to our current Polars nightly build process)
    • random (slightly refactored utils/generate_random_versions.py)
      • Probably not incredibly useful for local work
      • side note: this should probably be seed dependent for reproducibility, or print out its installed requirements within nox.
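
For readers unfamiliar with the class of bug in item 1, here is a minimal sketch. It assumes the problem was shell-style quotes embedded in arguments that are handed to a subprocess as a list; the original noxfile.py lines are not shown in this thread, so the specifier below is only illustrative.

```python
import subprocess
import sys

# Tiny child program that echoes back the first argument it receives.
ECHO = "import sys; print(sys.argv[1])"


def child_sees(arg: str) -> str:
    """Return the argument exactly as the child process receives it."""
    result = subprocess.run(
        [sys.executable, "-c", ECHO, arg],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


# No shell is involved, so the specifier needs no extra quoting.
plain = child_sees("pyarrow-stubs<17")
# Shell-style quotes inside the string are not stripped; they reach the tool.
over_quoted = child_sees('"pyarrow-stubs<17"')
print(plain)
print(over_quoted)
```

When arguments are passed as a list, each element arrives verbatim, so quotes that would be needed on a shell command line become part of the value instead.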

@MarcoGorelli I wasn't able to replicate your nightly release for Polars (which is fine) just wanted to ensure that was explicitly pointed out during review.
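
On the side note about making the random session reproducible, a rough sketch of seed-dependent selection could look like the following. The candidate versions and the `pick_versions` helper are hypothetical and do not reflect the actual contents of utils/generate_random_versions.py.

```python
import random

# Hypothetical candidate versions; not the real lists from
# utils/generate_random_versions.py.
CANDIDATES = {
    "pandas": ["1.1.5", "1.3.5", "2.0.3"],
    "polars": ["0.20.3", "0.20.31", "1.0.0"],
}


def pick_versions(seed: int) -> list[str]:
    """Pick one version per package, deterministically for a given seed."""
    rng = random.Random(seed)  # local RNG, leaves the global random state alone
    return [
        f"{name}=={rng.choice(versions)}"
        for name, versions in sorted(CANDIDATES.items())
    ]


requirements = pick_versions(seed=2358)
# Printing the pins covers the alternative of reporting what was installed.
print(requirements)
```

Calling `pick_versions` twice with the same seed returns the same pins, so a failing run can be replayed by reusing the seed it printed.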

Collaborator

@EdAbati EdAbati left a comment

Thank you for doing this!

It looks good to me, I just left a couple of questions below.

Unfortunately, on my Mac, when I run nox, tests/series_only/hist_test.py fails:

FAILED tests/series_only/hist_test.py::test_hist_count_hypothesis[pandas[pyarrow]] - ExceptionGroup: Hypothesis found 2 distinct failures. (2 sub-exceptions)
FAILED tests/series_only/hist_test.py::test_hist_count_hypothesis[pandas[nullable]] - ExceptionGroup: Hypothesis found 2 distinct failures. (2 sub-exceptions)
FAILED tests/series_only/hist_test.py::test_hist_count_hypothesis[pyarrow] - AssertionError: Mismatch at index 0: None != 1.0

It doesn't happen if I simply run pytest in the virtual env. (all library versions look the same)
Did you see something similar?


PYTHON_VERSIONS = ["3.8", "3.9", "3.10", "3.11", "3.12"]
PYTHON_VERSIONS = {
"pytest": ["3.8", "3.10", "3.11", "3.12", "3.13"],
Collaborator

I noticed this too. In CI we test on python 3.9 only in the random test.

Is it something deliberate?

Member Author

Tracked back to this commit: e718d1f

My guess is that it mainly reduces the time it takes for tests to run without sacrificing too much validity.
Though I'll kick this one to @MarcoGorelli as well if he remembers.

noxfile.py Outdated
Comment on lines 62 to 64
with NamedTemporaryFile() as f:
    f.write(b"setuptools<78\n")
    session.install("-b", f.name, "-e", ".[pyspark]")
Collaborator

Out of curiosity, why do we need to create a file for setuptools?

Maybe we could add the reason in a comment too?

Member Author

@camriddell camriddell Apr 8, 2025

This was (naively) moved over from the current CI, as I didn't think one could pipe to stdin within a subprocess like this. I also don't think nox provides a stdin channel to write to, though I would love to be mistaken on that front:

- name: install pyspark
  run: echo "setuptools<78" | uv pip install -b - -e ".[pyspark]" --system
  # PySpark is not yet available on Python3.12+
  if: matrix.python-version != '3.13'

Perhaps @MarcoGorelli remembers something here?
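
One thing worth checking with the temporary-file approach: writes to a `NamedTemporaryFile` are buffered, so a second process that opens the file by name may see it empty unless the buffer is flushed first. The sketch below demonstrates this with a plain reader standing in for the installer; the commented-out `session.install` call is only a placeholder for where the nox call would go.

```python
from tempfile import NamedTemporaryFile

with NamedTemporaryFile(mode="w", suffix=".txt") as f:
    f.write("setuptools<78\n")
    f.flush()  # push buffered data to disk before anything else reads the file
    # In the noxfile this is where the install call would go, e.g.
    # session.install("-b", f.name, "-e", ".[pyspark]")
    # Reopening by name works on POSIX; on Windows an open temp file
    # generally cannot be opened a second time.
    with open(f.name) as reader:
        seen_by_reader = reader.read()

print(repr(seen_by_reader))
```

Without the `flush()` call, a write this small would normally still be sitting in the buffer when the reader opens the file.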

Comment on lines -67 to -70
content = content.replace(
    'filterwarnings = [\n "error",\n]',
    "filterwarnings = [\n \"error\",\n 'ignore:distutils Version classes are deprecated:DeprecationWarning',\n]",
)
Collaborator

is this not needed anymore?

Member Author

Ah thank you! I was going to leave a comment for @MarcoGorelli about this.

The matching string doesn't exist in the pyproject.toml file any more (note the position of the closing square bracket ]) so this doesn't replace anything.

narwhals/pyproject.toml

Lines 204 to 227 in 4144497

[tool.pytest.ini_options]
norecursedirs = ['*.egg', '.*', '_darcs', 'build', 'CVS', 'dist', 'node_modules', 'venv', '{arch}', 'narwhals/_*']
testpaths = ["tests"]
filterwarnings = [
  "error",
  'ignore:.*defaulting to pandas implementation',
  'ignore:.*implementation has mismatches with pandas',
  'ignore:.*You are using pyarrow version',
  # This warning was temporarily raised by pandas but then reverted.
  'ignore:.*Passing a BlockManager to DataFrame:DeprecationWarning',
  # This warning was temporarily raised by Polars but then reverted.
  'ignore:.*The default coalesce behavior of left join will change:DeprecationWarning',
  'ignore: unclosed <socket.socket',
  'ignore:.*The distutils package is deprecated and slated for removal in Python 3.12:DeprecationWarning:pyspark',
  'ignore:.*distutils Version classes are deprecated. Use packaging.version instead.*:DeprecationWarning:pyspark',
  'ignore:.*is_datetime64tz_dtype is deprecated and will be removed in a future version.*:DeprecationWarning:pyspark',
  # Warning raised by PyArrow nightly just by importing pandas
  'ignore:.*Python binding for RankQuantileOptions not exposed:RuntimeWarning:pyarrow',
  'ignore:.*pandas only supports SQLAlchemy:UserWarning:sqlframe',
  'ignore:.*numpy.core is deprecated and has been renamed to numpy._core.*:DeprecationWarning:sqlframe',
  "ignore:.*__array__ implementation doesn't accept a copy keyword, so passing copy=False failed:DeprecationWarning:modin",
  # raised internally by pandas
  "ignore:.*np.find_common_type is deprecated:DeprecationWarning:pandas",
]
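
To illustrate why the stale pattern went unnoticed: `str.replace` returns its input unchanged when the search string is absent, so nothing signals the mismatch. A small guard, sketched below with a hypothetical `replace_or_fail` helper and a shortened stand-in for the file contents, would surface it.

```python
def replace_or_fail(content: str, old: str, new: str) -> str:
    """Like str.replace, but raise if `old` is absent instead of doing nothing."""
    if old not in content:
        raise ValueError(f"pattern not found: {old!r}")
    return content.replace(old, new)


# The pattern the old noxfile searched for ...
OLD = 'filterwarnings = [\n "error",\n]'
# ... and a shortened stand-in for a list that has since gained entries.
current = 'filterwarnings = [\n "error",\n "ignore:.*You are using pyarrow version",\n]'

# Plain str.replace silently returns the input when nothing matches.
unchanged = current.replace(OLD, "replaced")

try:
    replace_or_fail(current, OLD, "replaced")
    outcome = "replaced"
except ValueError:
    outcome = "pattern not found"

print(unchanged == current, outcome)
```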

@MarcoGorelli
Member

thanks for your PR!

tbh i think local development comes with so many personal preferences that i'm half-inclined to not have a nox file at all and leave people free to set up their own local testing config how they wish. then we don't need to maintain it

@dangotbanned
Member

dangotbanned commented Apr 8, 2025

#2358 (comment)

Feeling the same mostly 🤔

then we don't need to maintain it

I think this PR could be more compelling if we were to define conflicting dependency groups in our pyproject.toml.

For example, if minimum/minimum_versions were defined in one place - then having a noxfile.py would have a lower maintenance burden.

@nox.session(python=PYTHON_VERSIONS["minimum"])
def minimum(session: Session) -> None:
    session.install(
        "pandas==0.25.3",
        "polars==0.20.3",
        "numpy==1.17.5",
        "pyarrow==11.0.0",
        "pyarrow-stubs<17",
        "scipy==1.5.0",
        "scikit-learn==1.1.0",
        "duckdb==1.0",
        "tzdata",
        "backports.zoneinfo",
    )
    session.install("-e", ".", "--group", "tests")

Even this would be simpler, since it could just target some combination of groups

- name: install-minimum-versions
  run: uv pip install pipdeptree tox virtualenv setuptools pandas==0.25.3 polars==0.20.3 numpy==1.17.5 pyarrow==11.0.0 "pyarrow-stubs<17" scipy==1.5.0 scikit-learn==1.1.0 duckdb==1.0 tzdata backports.zoneinfo --system
- name: install-reqs
  run: |
    uv pip install -e . --group tests --system
- name: show-deps
  run: uv pip freeze
- name: Assert dependencies
  run: |
    DEPS=$(uv pip freeze)
    echo "$DEPS" | grep 'pandas==0.25.3'
    echo "$DEPS" | grep 'polars==0.20.3'
    echo "$DEPS" | grep 'numpy==1.17.5'
    echo "$DEPS" | grep 'pyarrow==11.0.0'
    echo "$DEPS" | grep 'scipy==1.5.0'
    echo "$DEPS" | grep 'scikit-learn==1.1.0'
    echo "$DEPS" | grep 'duckdb==1.0'

@camriddell
Member Author

Wholeheartedly agree here. I wanted to get something up quickly since noxfile.py was fully broken in its current state.

I'd like to take that next step of merging the metadata options (dependency groups, coverage values, etc.) into the pyproject.toml and having both nox and GH CI use that as a single source of truth.

That said, if @MarcoGorelli is inclined to remove noxfile.py, then perhaps we should start there and I'll hold onto this for personal use.

@camriddell
Member Author

camriddell commented Apr 8, 2025

Thank you for doing this!

It looks good to me, I just left a couple of questions below.

Unfortunately, on my Mac, when I run nox, tests/series_only/hist_test.py fails:

FAILED tests/series_only/hist_test.py::test_hist_count_hypothesis[pandas[pyarrow]] - ExceptionGroup: Hypothesis found 2 distinct failures. (2 sub-exceptions)
FAILED tests/series_only/hist_test.py::test_hist_count_hypothesis[pandas[nullable]] - ExceptionGroup: Hypothesis found 2 distinct failures. (2 sub-exceptions)
FAILED tests/series_only/hist_test.py::test_hist_count_hypothesis[pyarrow] - AssertionError: Mismatch at index 0: None != 1.0

It doesn't happen if I simply run pytest in the virtual env. (all library versions look the same) Did you see something similar?

I had the nox tests pass locally on my end, so maybe there is some flakiness in the hypothesis/hist tests that hasn't been picked up by existing CI? I'll try to replicate on my machine, but could you also grab the seed that hypothesis uses when you call it from nox?

I'm going to dive back into the hist implementations anyways to factor in the new hist behavior in Polars so I'll take a look at this failure then as well.

Was there a specific test group that caused the failure for you (i.e. one of pytest_coverage, minimum/old/not_so_old/nightly)? Or was it present in all of them?


4 participants