feat(RFC): A richer Expr IR #2572

Draft: wants to merge 251 commits into base: main

@dangotbanned (Member) commented May 18, 2025

Will close #2571

What type of PR is this? (check all applicable)

  • 💾 Refactor
  • ✨ Feature
  • 🐛 Bug Fix
  • 🔧 Optimization
  • 📝 Documentation
  • ✅ Test
  • 🐳 Other

Related issues

Checklist

  • Code follows style guide (ruff)
  • Tests added
  • Documented the changes

If you have comments or can explain your changes, please do so below

Important

See (#2571) for detail!!!!!!!

What's here so far is mostly mocking up the solution - not actually implementing it yet.
Very open to feedback

Tasks

Thinking it might go:
`Op` -> `Node` -> `Plan`

But who knows really
Just trying to get the names & hierarchies done
- Will probably want to set up the `object.__setattr__` part somewhere
- Not decided on how initialization should work
- Gonna need to revisit `Function` vs `FunctionExpr`
- `functions.py` is confusing me now
Starting to understand `Function` vs `FunctionExpr` a bit better now
- Probably going to loop back and explicitly define more input fields
- At least until we accept `Expr` everywhere, these should more closely reflect `narwhals` than `polars`
Once `result: ResultIRs` is made immutable (or mutability stays within function boundaries) - most of `expr_expansion` will be safe to cache
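
A rough sketch of how the `object.__setattr__` and caching items above could fit together. Everything here is hypothetical (`Immutable`, `Column`, `Alias` and `output_name` are illustrative stand-ins, not the classes in this PR): fields are written once via `object.__setattr__`, later assignment is blocked, and because nodes are hashable and compare by value, a function over them can sit behind `functools.lru_cache`.

```python
from __future__ import annotations

from functools import lru_cache
from typing import Any


class Immutable:
    """Hypothetical base class: fields are set once in `__init__`, then frozen."""

    __slots__ = ()

    def __init__(self, **kwds: Any) -> None:
        for name in self.__slots__:
            # Bypass our own `__setattr__` guard, but only during initialization.
            object.__setattr__(self, name, kwds[name])

    def __setattr__(self, name: str, value: Any) -> None:
        msg = f"{type(self).__name__!r} is immutable, cannot set {name!r}"
        raise AttributeError(msg)

    def _fields(self) -> tuple[Any, ...]:
        return tuple(getattr(self, name) for name in self.__slots__)

    def __eq__(self, other: object) -> bool:
        return type(self) is type(other) and self._fields() == other._fields()

    def __hash__(self) -> int:
        return hash((type(self), *self._fields()))


class Column(Immutable):
    __slots__ = ("name",)


class Alias(Immutable):
    __slots__ = ("expr", "name")


@lru_cache(maxsize=None)
def output_name(node: Immutable) -> str:
    # Safe to cache: equal nodes always produce the same answer,
    # and nothing can mutate a node after it was used as a cache key.
    return node.name
```

Two separately constructed but equal nodes share one cache entry, which is the property that would make most of `expr_expansion` cacheable once the result containers are immutable too.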
Comment on lines +434 to +446
def expand_columns(
    origin: ExprIR, /, columns: Columns, *, exclude: Excluded
) -> Seq[ExprIR]:
    if not _all_columns_match(origin, columns):
        msg = "expanding more than one `col` is not allowed"
        raise ComputeError(msg)
    result: deque[ExprIR] = deque()
    for name in columns.names:
        if name not in exclude:
            expanded = replace_with_column(origin, Columns, name)
            expanded = rewrite_special_aliases(expanded)
            result.append(expanded)
    return tuple(result)

@dangotbanned (Member, Author) Jun 12, 2025

TODO: Plan how to handle exceptions & caching

All of these can raise currently:

  • expand_columns
  • expand_indices
  • replace_wildcard
  • rewrite_special_aliases

IIRC, the issues are:

  • Raising an exception within a cached function
    • I think this makes the next (matching) call return None instead of raising again
    • No idea where I got that ^ from, seems to be working fine
  • Returning the exception
    • I think there's an issue of raising a previously raised exception
    • Possible solution is introducing a result type that always reconstructs the exception before raising
      • Maybe I've been reading too much Rust 😅

Successfully merging this pull request may close these issues.

[Enh]: A richer Expr internal representation