
feat(join): support _.col.upper() and lambda left, right: ... as predicates #10703

Open
@NickCrews

Description


Is your feature request related to a problem?

I have these two tables, which I want to join regardless of capitalization. I want to be able to do this:

import ibis
tl = ibis.memtable({"f": ["a", "b", "c"]})
tr = ibis.memtable({"f": ["A", "B", "Z"]})
tl.join(tr, ibis._.f.upper())
# expect {"f": ["a", "b"], "f_right": ["A", "B"]}
# Currently get `SignatureValidationError: JoinLink(...)`

Currently, I have to spell out the full predicate myself: tl.f.upper() == tr.f.upper().

Another problem I have is reusing my predicates. I want to be able to do all of these:

from ibis import _

def equalish(col: str):
    # Null-safe equality: rows match when the values are equal or both are NULL.
    def pred(left: ibis.Table, right: ibis.Table):
        return (left[col] == right[col]) | (left[col].isnull() & right[col].isnull())
    return pred

preds = [
    "a",
    _.x.upper(),
    lambda t: t.y.abs(),
    lambda l, r: l.foo == r.bar,
    (_.baz, _.qux.upper()),
    equalish("x"),
    equalish("y"),
]
a.join(b, preds)
c.join(d, preds)

Currently, I have to reference the tables directly in the predicates, so the binding has to happen right before each join. That prevents me from reusing the predicates across different table pairs:

def equalish(left, right, col):
    return (left[col] == right[col]) | (left[col].isnull() & right[col].isnull())

a.join(b, [equalish(a, b, "x"), equalish(a, b, "y")])
c.join(d, [equalish(c, d, "x"), equalish(c, d, "y")])
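
To make the requested behavior concrete, here is a rough sketch of how a join could dispatch on the shape of each predicate. This is not how ibis implements anything: `resolve_predicate` and `_apply_key` are hypothetical names, the dispatch on argument count is an assumption about the desired semantics, and plain dicts stand in for tables so the snippet runs without ibis. Deferred expressions such as `_.x.upper()` are left out, since handling them depends on ibis internals.

```python
import inspect


def _apply_key(key, table):
    """Hypothetical helper: a key is either a column name or a one-argument callable."""
    return table[key] if isinstance(key, str) else key(table)


def resolve_predicate(pred, left, right):
    """Hypothetical helper: bind one reusable predicate spec to a (left, right) pair."""
    if isinstance(pred, str):
        # "a" means: same column name on both sides.
        return left[pred] == right[pred]
    if isinstance(pred, tuple):
        # (left_key, right_key) means: compare a left key with a right key.
        left_key, right_key = pred
        return _apply_key(left_key, left) == _apply_key(right_key, right)
    if callable(pred):
        n_params = len(inspect.signature(pred).parameters)
        if n_params == 1:
            # lambda t: ... is applied to each side, then the results are compared.
            return pred(left) == pred(right)
        if n_params == 2:
            # lambda l, r: ... already returns the full condition.
            return pred(left, right)
    raise TypeError(f"unsupported join predicate: {pred!r}")


# Plain dicts stand in for tables here; with real table expressions the
# comparisons would build boolean expressions instead of Python bools.
left = {"f": "a", "x": 1}
right = {"f": "A", "x": 1}
preds = ["x", lambda t: t["f"].upper(), lambda l, r: l["f"].upper() == r["f"]]
results = [resolve_predicate(p, left, right) for p in preds]
```

Because nothing in `preds` refers to a specific table, the same list could be resolved against any other pair, which is the reuse this issue asks for. A factory like `equalish("x")` would go through the two-argument branch.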

Describe the solution you'd like

See below


Metadata

Assignees: No one assigned
Labels: feature (Features or general enhancements)
Type: No type
Projects: Status: backlog
Milestone: No milestone
Relationships: None yet
Development: No branches or pull requests