feat(join): support _.col.upper() and lambda left, right: ... as predicates #10703

Open
@NickCrews

Is your feature request related to a problem?

I have these two tables, which I want to join regardless of capitalization. I want to be able to do this:

import ibis
tl = ibis.memtable({"f": ["a", "b", "c"]})
tr = ibis.memtable({"f": ["A", "B", "Z"]})
tl.join(tr, ibis._.f.upper())
# expect {"f": ["a", "b"], "f_right": ["A", "B"]}
# Currently get `SignatureValidationError: JoinLink(...)`

Currently I have to actually pass tl.f.upper() == tr.f.upper().
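
One reading of the requested semantics, consistent with the expected output above, is "apply the same single-table expression to both sides and compare". A minimal sketch in plain Python, where `same_on_both` is a hypothetical helper name (not an ibis API) and `key` is any one-argument callable such as `lambda t: t.f.upper()`:

```python
from typing import Any, Callable


def same_on_both(key: Callable[[Any], Any]) -> Callable[[Any, Any], Any]:
    """Turn a one-table key into a two-table equality predicate.

    Hypothetical helper for illustration; not part of ibis.
    """
    def pred(left: Any, right: Any) -> Any:
        # Evaluate the same key against each side, then compare the results.
        return key(left) == key(right)
    return pred
```

With the tables above, `same_on_both(lambda t: t.f.upper())(tl, tr)` builds the same thing as the explicit `tl.f.upper() == tr.f.upper()`, so it can be passed to `tl.join(tr, ...)` as-is.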

Another problem I have is reusing my predicates. I want to be able to do all of these:

from ibis import _

def equalish(col: str):
    def pred(left: ibis.Table, right: ibis.Table):
        # Match when the values are equal, or when both are null.
        return (left[col] == right[col]) | (left[col].isnull() & right[col].isnull())
    return pred

preds = [
    "a",
    _.x.upper(),
    lambda t: t.y.abs(),
    lambda l, r: l.foo == r.bar,
    (_.baz, _.qux.upper()),
    equalish("x"),
    equalish("y"),
]
a.join(b, preds)
c.join(d, preds)

Currently, predicates have to reference the tables directly, so they can only be built right before each join, once both tables are known. That prevents me from defining them once and reusing them:

def equalish(left, right, col):
    return (left[col] == right[col]) | (left[col].isnull() & right[col].isnull())

a.join(b, [equalish(a, b, "x"), equalish(a, b, "y")])
c.join(d, [equalish(c, d, "x"), equalish(c, d, "y")])

Describe the solution you'd like

See below

Metadata

Labels: feature (Features or general enhancements)
Status: backlog