[Enh]: A richer `Expr` internal representation #2571

## Related
This cuts across many issues and previous discussions. These are the ones with the clearest links:
- https://github.com/narwhals-dev/narwhals/pull/2483#issuecomment-2866902903
- https://github.com/narwhals-dev/narwhals/pull/2483#issuecomment-2867331343
- https://github.com/narwhals-dev/narwhals/pull/2483#issuecomment-2867446959
- https://github.com/narwhals-dev/narwhals/pull/2483#issuecomment-2869070157
- (https://github.com/narwhals-dev/narwhals/pull/2538/commits/a7eeb0d23e67cb70e7cfa73cec2c7b69a15c8bef#r2083562677)
- https://github.com/narwhals-dev/narwhals/issues/2534#issuecomment-2875676729
- https://github.com/narwhals-dev/narwhals/pull/2555

### Issues
- [ ] #1848
- [ ] #2225
- [ ] #2291
- [ ] #2534
- [x] #2623
- [x] #2652

## Description
Following (https://github.com/narwhals-dev/narwhals/pull/2483#issuecomment-2866902903) I've been sneakily incubating this idea that I feel *now* has enough legs to share 😏.

The big picture is replacing [`ExprMetadata`](https://github.com/narwhals-dev/narwhals/blob/c7a6080430f12c43dc37be325ee6f467fd510796/narwhals/_expression_parsing.py#L229-L567) with a rich, node-based representation - that encodes **every step** of an `Expr`.
This might seem like a huge leap to take - but we're in a unique position where a [battle-tested solution](https://github.com/pola-rs/polars/blob/4d9900b3ca22bd3a6a684411d52b11d9a66435ad/crates/polars-plan/src/dsl/expr/mod.rs) already exists 😎 

So far I've been mapping out what this could look like on ([`oh-nodes`](https://github.com/narwhals-dev/narwhals/tree/oh-nodes)).
~~I'm going to open a PR~~ I've opened (#2572), so feel free to comment on that or on this thread.


## Notes
I wrote up a bit of a guiding mantra for what I'm hoping the end result will look like:
- Each `Expr` method should be representable by a single node
  - But the node does not need to be unique to the method
- A chain of `Expr` methods should form a plan of operations
- We must be able to enforce rules on what plans are permitted:
  - Must be flexible to both eager/lazy and individual backends
  - Must be flexible to a given context (select, with_columns, filter, group_by)
- Nodes are:
  - Immutable, but
    - Can be extended/re-written at both the Narwhals & Compliant levels
  - Introspectable, but
    - Store as little-as-needed for the common case
    - Provide properties/methods for computing the less frequent metadata
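
To make the mantra a little more concrete, here is a minimal sketch using frozen dataclasses. Everything in it (`Node`, `Column`, `Alias`, `Agg`, `plan_depth`) is a hypothetical stand-in chosen for illustration, not the classes on `oh-nodes`:

```python
from __future__ import annotations

from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Node:
    """Hypothetical base class for this sketch only."""


@dataclass(frozen=True)
class Column(Node):
    name: str


@dataclass(frozen=True)
class Alias(Node):
    expr: Node
    name: str


@dataclass(frozen=True)
class Agg(Node):
    # One node type can back several methods (`first`, `last`, ...),
    # so a node does not need to be unique to a method.
    expr: Node
    function: str


def plan_depth(node: Node) -> int:
    """Less frequently needed metadata is computed on demand, not stored."""
    child = getattr(node, "expr", None)
    return 1 if child is None else 1 + plan_depth(child)


# A chain of methods becomes a nested plan of nodes.
plan = Agg(Alias(Column("a"), "b"), "first")

# Nodes are immutable, so a "re-write" produces a new node instead.
renamed = replace(plan, function="last")
```

Here `plan` is left untouched by the re-write, and trying to assign to `plan.function` raises `dataclasses.FrozenInstanceError`.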

## Examples
As everything is represented by different classes - this one is a bit visual 😄 


### Column selections
```py
>>> from narwhals._plan import demo as nwd
>>> nwd.col("a"), nwd.col("b", "c"), nwd.nth(1), nwd.nth(3, 4, 5)
(Narwhals DummyExpr:
 col('a'),
 Narwhals DummyExpr:
 cols(['b', 'c']),
 Narwhals DummyExpr:
 nth(1),
 Narwhals DummyExpr:
 index_columns((3, 4, 5)))
```

### Literals
We can discern the kind of literal that's wrapped:
```py
import polars as pl

from narwhals._plan import demo as nwd
from narwhals._plan.dummy import DummySeries

series = DummySeries.from_native(pl.Series([1.1, 1.2]))
scalar = nwd.lit(5)
series = nwd.lit(series)
```
```py
>>> scalar, series
(Narwhals DummyExpr:
 lit(int: 5),
 Narwhals DummyExpr:
 lit(Series))
```

```py
>>> scalar._ir.is_scalar, series._ir.is_scalar
(True, False)
```
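
One way `is_scalar` could fall out of the node structure is by delegating to the wrapped literal. This is only a sketch: `ScalarLiteral` and `Literal` mirror names that appear in the `str` output further down, while `SeriesLiteral`, the standalone `lit` function, and the use of a plain `list` in place of a real Series are assumptions made for the example:

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class ScalarLiteral:
    value: Any
    is_scalar = True  # un-annotated, so a class constant rather than a field


@dataclass(frozen=True)
class SeriesLiteral:  # hypothetical name
    value: Any
    is_scalar = False


@dataclass(frozen=True)
class Literal:
    value: ScalarLiteral | SeriesLiteral

    @property
    def is_scalar(self) -> bool:
        # Computed on demand by asking the wrapped literal.
        return self.value.is_scalar


def lit(value: Any) -> Literal:
    # Assumption for this sketch: a plain `list` plays the role of a Series.
    if isinstance(value, list):
        return Literal(SeriesLiteral(tuple(value)))
    return Literal(ScalarLiteral(value))


scalar = lit(5)
series = lit([1.1, 1.2])
```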

### Funky
How about something more complex?
```py
>>> import narwhals as nw
>>> from narwhals._plan import demo as nwd
>>> nwd.col("a").alias("b").cast(nw.Int8()).n_unique() + (nwd.col("c").count() * nwd.lit(10))
Narwhals DummyExpr:
[(col('a').alias('b').cast(Int8).n_unique()) + ([(col('c').count()) * (lit(int: 10))])]
```
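
For anyone wondering how operators could end up as nodes, here is a rough sketch where `+` and `*` each build a binary node and the repr is assembled recursively. The class names and the string-valued `op` are simplifications for illustration (the real output below uses `Add()` and `Multiply()` objects):

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Expr:
    """Hypothetical base: every operator returns a new node."""

    def __add__(self, other: Expr) -> BinaryExpr:
        return BinaryExpr(self, "+", other)

    def __mul__(self, other: Expr) -> BinaryExpr:
        return BinaryExpr(self, "*", other)


@dataclass(frozen=True)
class Column(Expr):
    name: str

    def __repr__(self) -> str:
        return f"col({self.name!r})"


@dataclass(frozen=True)
class Literal(Expr):
    value: int

    def __repr__(self) -> str:
        return f"lit({type(self.value).__name__}: {self.value!r})"


@dataclass(frozen=True)
class BinaryExpr(Expr):
    left: Expr
    op: str
    right: Expr

    def __repr__(self) -> str:
        # Each operand renders itself, so nesting comes for free.
        return f"[({self.left!r}) {self.op} ({self.right!r})]"


expr = Column("a") + Column("c") * Literal(10)
print(expr)  # [(col('a')) + ([(col('c')) * (lit(int: 10))])]
```

Python's operator precedence decides the shape of the tree here: the multiplication is built first and becomes the right-hand child of the addition.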

### Order dependence
Here I'm trying to enforce the rules from (https://github.com/narwhals-dev/narwhals/pull/2528#discussion_r2083557149).

The idea would be that we allow the last two variants in lazy backends:
```py
import narwhals as nw
from narwhals._plan import demo as nwd

orderable_1 = nwd.col("a").alias("d").first()
orderable_2 = nwd.col("b").cast(nw.String()).last()
orderable_3 = nwd.col("c").sort_by(nwd.col("e")).first()
orderable_4 = nwd.col("d").last().over(order_by=nwd.col("e", "f", "g"))
```

The outputs include suggestions that use the actual `Expr`:

<details><summary><code>orderable_1</code></summary>


```py
>>> nwd.ensure_orderable_rules(orderable_1)
OrderDependentExprError: first() is order-dependent and requires an ordering operation for lazy backends.
Hint:
Instead of:
 col('a').alias('d').first()

If you want to aggregate to a single value, try:
 col('a').alias('d').sort_by(...).first()

Otherwise, try:
 col('a').alias('d').first().over(order_by=...)
```


</details>

<details><summary><code>orderable_2</code></summary>


```py
>>> nwd.ensure_orderable_rules(orderable_2)
OrderDependentExprError: last() is order-dependent and requires an ordering operation for lazy backends.
Hint:
Instead of:
 col('b').cast(String).last()

If you want to aggregate to a single value, try:
 col('b').cast(String).sort_by(...).last()

Otherwise, try:
 col('b').cast(String).last().over(order_by=...)
```


</details>


<details><summary><code>orderable_3</code></summary>


```py
>>> nwd.ensure_orderable_rules(orderable_3)[0]
Narwhals DummyExpr:
col('c').sort_by(by=(col('e'),), options=SortMultipleOptions(descending=[False], nulls_last=[False])).first()
```


</details>

<details><summary><code>orderable_4</code></summary>


```py
>>> nwd.ensure_orderable_rules(orderable_4)[0]
Narwhals DummyExpr:
col('d').last().over(order_by=[cols(['e', 'f', 'g'])])
```


</details>
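
A rule like this can be expressed as a short walk over the nodes. The sketch below is a simplified assumption of how such a check might work, not the logic behind `ensure_orderable_rules`; `check_orderable` and the four node classes are made up for the example, and only `first()` is handled:

```python
from __future__ import annotations

from dataclasses import dataclass


class OrderDependentExprError(Exception):
    """Raised when an order-dependent node has no ordering to rely on."""


@dataclass(frozen=True)
class Column:
    name: str


@dataclass(frozen=True)
class SortBy:
    expr: object
    by: tuple[object, ...]


@dataclass(frozen=True)
class First:
    expr: object


@dataclass(frozen=True)
class Over:
    expr: object
    order_by: tuple[object, ...] = ()


def check_orderable(node: object, *, ordered: bool = False) -> None:
    """Walk from the outermost node inwards, tracking whether an ordering applies."""
    if isinstance(node, Over):
        # A window with `order_by` supplies the ordering for everything inside it.
        ordered = ordered or bool(node.order_by)
    elif isinstance(node, First) and not (ordered or isinstance(node.expr, SortBy)):
        msg = "first() is order-dependent and requires an ordering operation"
        raise OrderDependentExprError(msg)
    child = getattr(node, "expr", None)
    if child is not None:
        check_orderable(child, ordered=ordered)


# Permitted: sorted input, or a window that provides `order_by`.
check_orderable(First(SortBy(Column("c"), by=(Column("e"),))))
check_orderable(Over(First(Column("d")), order_by=(Column("e"),)))
```

Calling `check_orderable(First(Column("a")))` raises `OrderDependentExprError`, since nothing above or directly below the `First` node establishes an order.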

### Is this just a fancy `__repr__`?
The repr is mostly a nice side effect of what is happening under the hood.
Take this example of a pretty complex expression:

```py
import narwhals as nw
from narwhals._plan import demo as nwd

lhs = nwd.col("a").alias("b").cast(nw.Int8()).n_unique()
rhs = nwd.col("c").count() * nwd.lit(10)
result = (lhs + rhs).last().over(order_by=nwd.col("d", "e", "f"), descending=True)
```

Another option for introspection is via [`ExprIR.__str__`](https://github.com/narwhals-dev/narwhals/blob/ebd05428b755f83212c9ca24d5ecbf3896bda88a/narwhals/_plan/common.py#L89-L93):

```py
>>> node = result._ir
>>> str(node)
```
This is the meat of the idea, as we've got something like what you'd see in [`ast`](https://docs.python.org/3/library/ast.html).

<details><summary>Raw output</summary>


> "WindowExpr(expr=Last(expr=BinaryExpr(left=NUnique(expr=Cast(dtype=Int8, expr=Alias(expr=Column(name=a), name=b))), op=Add(), right=BinaryExpr(left=Count(expr=Column(name=c)), op=Multiply(), right=Literal(value=ScalarLiteral(dtype=Unknown, value=10))))), partition_by=(), order_by=((cols(['d', 'e', 'f']),), SortOptions(descending=True, nulls_last=False)), options=Over())"


</details> 


<details><summary>Ruff'd output</summary>


```py
WindowExpr(
    expr=Last(
        expr=BinaryExpr(
            left=NUnique(expr=Cast(dtype=Int8, expr=Alias(expr=Column(name=a), name=b))),
            op=Add(),
            right=BinaryExpr(
                left=Count(expr=Column(name=c)),
                op=Multiply(),
                right=Literal(value=ScalarLiteral(dtype=Unknown, value=10)),
            ),
        )
    ),
    partition_by=(),
    order_by=((cols(["d", "e", "f"]),), SortOptions(descending=True, nulls_last=False)),
    options=Over(),
)
```


</details> 
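
Once the expression is a tree, questions about it become small traversals. As a hedged illustration (the node classes, `iter_nodes`, and `root_names` below are assumptions for this sketch, not the branch's API), collecting the root column names could look like:

```python
from __future__ import annotations

from dataclasses import dataclass, fields, is_dataclass
from typing import Any, Iterator


@dataclass(frozen=True)
class Column:
    name: str


@dataclass(frozen=True)
class Literal:
    value: Any


@dataclass(frozen=True)
class Count:
    expr: Any


@dataclass(frozen=True)
class BinaryExpr:
    left: Any
    op: str
    right: Any


def iter_nodes(node: Any) -> Iterator[Any]:
    """Yield `node` and every nested node, depth-first (similar in spirit to `ast.walk`)."""
    yield node
    for field in fields(node):
        value = getattr(node, field.name)
        if is_dataclass(value):  # skip plain values such as `op` or `name`
            yield from iter_nodes(value)


def root_names(node: Any) -> list[str]:
    """Names of all columns the expression reads from, in traversal order."""
    return [n.name for n in iter_nodes(node) if isinstance(n, Column)]


tree = BinaryExpr(Column("a"), "+", BinaryExpr(Count(Column("c")), "*", Literal(10)))
print(root_names(tree))  # ['a', 'c']
```

The same walker could back other checks (for example, counting aggregations) without each node needing to store that information up front.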

## References
<details><summary>Lots of links to polars source</summary>


- https://github.com/pola-rs/polars/blob/dafd0a2d0e32b52bcfa4273bffdd6071a0d5977a/crates/polars-python/src/lazyframe/visitor/expr_nodes.rs
- https://github.com/pola-rs/polars/blob/dafd0a2d0e32b52bcfa4273bffdd6071a0d5977a/crates/polars-plan/src/dsl/expr.rs
- https://github.com/pola-rs/polars/blob/dafd0a2d0e32b52bcfa4273bffdd6071a0d5977a/crates/polars-plan/src/dsl/function_expr/mod.rs
- https://github.com/pola-rs/polars/blob/dafd0a2d0e32b52bcfa4273bffdd6071a0d5977a/crates/polars-plan/src/dsl/options/mod.rs#L137-L172
- https://github.com/pola-rs/polars/blob/3fd7ecc5f9de95f62b70ea718e7e5dbf951b6d1c/crates/polars-plan/src/plans/options.rs#L35-L106
- https://github.com/pola-rs/polars/blob/3fd7ecc5f9de95f62b70ea718e7e5dbf951b6d1c/crates/polars-plan/src/plans/options.rs#L131-L236
- https://github.com/pola-rs/polars/blob/3fd7ecc5f9de95f62b70ea718e7e5dbf951b6d1c/crates/polars-plan/src/plans/options.rs#L240-L267
- https://github.com/pola-rs/polars/blob/6df23a09a81c640c21788607611e09d9f43b1abc/crates/polars-plan/src/plans/aexpr/mod.rs



</details>
