Implementing Null Handling(is_finite, is_inf, is_nan)#59876
Implementing Null Handling(is_finite, is_inf, is_nan)#59876Rob12312368 wants to merge 7 commits intoray-project:masterfrom
Conversation
There was a problem hiding this comment.
Code Review
This pull request introduces the is_nan expression for null handling, along with a corresponding test case. The implementation is a good start, but I've identified a high-severity issue with the return type of the is_nan function that could lead to schema inconsistencies. I've also suggested an improvement to the test coverage to handle more edge cases and a fix for an incorrect docstring in the test file. Addressing these points will improve the correctness and robustness of the new functionality.
| expected_results, | ||
| test_id, | ||
| ): | ||
| """Test arithmetic helper expressions: negate, sign, power, abs.""" |
There was a problem hiding this comment.
The docstring for this test function appears to be copied from another test and is incorrect. This test is for null handling operations, not arithmetic expressions. Please update the docstring to accurately describe its purpose.
| """Test arithmetic helper expressions: negate, sign, power, abs.""" | |
| """Test null handling helper expressions.""" |
| pytest.param( | ||
| [{"x": float("Nan")}, {"x": -3}, {"x": 0}], | ||
| lambda: col("x").is_nan(), | ||
| [True, False, False], | ||
| "is_nan", | ||
| ), |
There was a problem hiding this comment.
The test case for is_nan is a good start, but it could be more comprehensive. I suggest expanding it to include positive floats, infinity, negative infinity, and None (null) values to ensure the behavior is correct across more edge cases. pc.is_nan should return False for infinities and propagate nulls.
pytest.param(
[
{"x": float("nan")},
{"x": -3.0},
{"x": 0.0},
{"x": 3.14},
{"x": float("inf")},
{"x": float("-inf")},
{"x": None},
],
lambda: col("x").is_nan(),
[True, False, False, False, False, False, None],
"is_nan",
),…or discussing fill_null Signed-off-by: Rob12312368 <rob12312368@gmail.com>
2669ef1 to
b46766e
Compare
|
Hi, the ci did not pass, but I think it is unrelated to my patch. Should I do anything to proceed? @goutamvenkat-anyscale |
|
I am aware fill_null is missing. Will open another pr draft to discuss the approach. |
Thanks. Just merged master into your branch, hopefully there were some fixes to those affected tests. The cursor and gemini comments are valid and should be addressed. |
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> Signed-off-by: Tsao-Ching, Kao <56261402+Rob12312368@users.noreply.github.com>
Signed-off-by: Rob12312368 <rob12312368@gmail.com>
d8268ec to
d9f84a2
Compare
|
Hello @goutamvenkat-anyscale , I changed the code based on the suggestion. The CI failure does not seem to be related, though different this time. |
| @@ -1014,6 +1014,63 @@ async def __call__(self, x): | |||
| assert rows_same(result_df, expected_after_fix) | |||
|
|
|||
|
|
|||
| @pytest.mark.skipif( | |||
There was a problem hiding this comment.
Can you please move these tests to python/ray/data/tests/expressions/test_arithmetic.py
goutamvenkat-anyscale
left a comment
There was a problem hiding this comment.
lgtm, please just move the tests
I just merged master into your branch. It should resolve |
Signed-off-by: Rob12312368 <rob12312368@gmail.com>
| def is_inf(self) -> "UDFExpr": | ||
| return _create_pyarrow_compute_udf(pc.is_inf, return_dtype=DataType.bool())( | ||
| self | ||
| ) |
There was a problem hiding this comment.
Missing docstrings on new public API methods
Low Severity
The new is_nan, is_finite, and is_inf methods are the only public methods on this class without docstrings. Every other method — including short ones like ceil, floor, and trunc — has at least a one-line docstring. This inconsistency makes these methods harder to discover and use, especially since they're part of a public API.
Signed-off-by: Rob12312368 <rob12312368@gmail.com>


Description
Related issues
Cursor Bugbot reviewed your changes and found no issues for commit 58b3306