Skip to content

feat: modify rank so that it works with over() for lazy backends #2533

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 16 commits into from
May 19, 2025

Conversation

raisadz
Copy link
Contributor

@raisadz raisadz commented May 11, 2025

What type of PR is this? (check all applicable)

  • πŸ’Ύ Refactor
  • ✨ Feature
  • πŸ› Bug Fix
  • πŸ”§ Optimization
  • πŸ“ Documentation
  • βœ… Test
  • 🐳 Other

Related issues

  • Related issue #<issue number>
  • Closes #<issue number>

Checklist

  • Code follows style guide (ruff)
  • Tests added
  • Documented the changes

If you have comments or can explain your changes, please do so below

@raisadz raisadz marked this pull request as ready for review May 13, 2025 17:48
@MarcoGorelli MarcoGorelli added the enhancement New feature or request label May 14, 2025
@MarcoGorelli
Copy link
Member

really nice, thanks!

i'm pleasantly surprised that for ibis it seems to just work without requiring adjustments on our end, nice

@FBruzzesi FBruzzesi added duckdb Issue is related to duckdb backend spark-like labels May 14, 2025
Comment on lines 791 to 824
def _rank(_input: Column) -> Column:
order_by = self._sort(_input, descending=descending, nulls_last=True)
window = self.partition_by().orderBy(*order_by)
count_window = self.partition_by(_input)
def _rank(_input: Column, window: WindowSpec, count_window: WindowSpec) -> Column:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the way you've done it in _duckdb is nice, where you share order_by between the two functions

Could we do that here too, so that the signatures are aligned?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @MarcoGorelli ! Yes makes sense, I aligned them now

Copy link
Member

@MarcoGorelli MarcoGorelli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks! this looks close, just a couple of nitpicks sorry 😳

*,
descending: bool,
partition_by_sql: str | None = None,
) -> duckdb.Expression:
if descending:
by_sql = f"{_input} desc nulls last"
else:
by_sql = f"{_input} asc nulls last"
order_by_sql = f"order by {by_sql}"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it might be clearer for this order_by_sql to go in the else block (line 784)?

@@ -63,6 +65,7 @@ class DuckDBExpr(LazyExpr["DuckDBLazyFrame", "duckdb.Expression"]):
def __init__(
self,
call: EvalSeries[DuckDBLazyFrame, duckdb.Expression],
previous_call: EvalSeries[DuckDBLazyFrame, duckdb.Expression] | None = None,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

does this need to be an argument to DuckDBExpr? I think it can be removed, as it's set in _with_unorderable_window_function

@@ -78,6 +80,7 @@ class SparkLikeExpr(LazyExpr["SparkLikeLazyFrame", "Column"]):
def __init__(
self,
call: EvalSeries[SparkLikeLazyFrame, Column],
previous_call: EvalSeries[SparkLikeLazyFrame, Column] | None = None,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same

Copy link
Member

@MarcoGorelli MarcoGorelli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just added a commit to further align logic between spark-like and duckdb

thanks, looks good!

leaving open a bit in case others have comments, but i'm happy to finally be able to do rank(...).over(...)!

Copy link
Member

@MarcoGorelli MarcoGorelli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks @raisadz - is_unique followed by over still needs doing, just flagging it in case it interests you to take a look at that too?

@MarcoGorelli MarcoGorelli merged commit f399bb2 into narwhals-dev:main May 19, 2025
32 of 33 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
duckdb Issue is related to duckdb backend enhancement New feature or request spark-like
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants