Not all optimal TPC-H query plans are really optimal #635

@MBkkt

Intro

Right now, according to 1314950, TPC-H queries q1, q3, q4, q6, q12, q14, and q16 are considered optimal, but some of them actually are not.

Queries

  1. q1, q6, q12, q14 -- these are OK
  2. q3 -- ideally, Velox should have an operator that combines the last hash join (lineitem x (orders x customer)) with the aggregation that follows it
  3. q4 -- we should use a right semi filter join instead of a right semi project join followed by a filter
  4. q16 -- we should use a left anti join instead of a left semi project join followed by a NOT filter

Conclusion

q3 doesn't look like a bug, just an improvement that could be made: first in Velox, then in Axiom.
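
To make the q3 idea concrete, here is a minimal plain-Python sketch of what fusing the last hash join with the aggregation could mean. It is not Velox or Axiom code; the function names, the `revenue` column, and the simplification that the grouping key equals the join key are all assumptions made for illustration.

```python
# Illustrative model only: hypothetical helpers, not a Velox operator.
# Assumes the grouping key is the join key and build-side keys are unique.

def join_then_aggregate(lineitems, order_keys):
    # Step 1: materialize the join output.
    joined = [
        (li["l_orderkey"], li["revenue"])
        for li in lineitems
        if li["l_orderkey"] in order_keys
    ]
    # Step 2: aggregate the materialized rows.
    totals = {}
    for key, revenue in joined:
        totals[key] = totals.get(key, 0) + revenue
    return totals


def fused_join_aggregate(lineitems, order_keys):
    # Probe and accumulate in one pass, with no intermediate join output.
    totals = {}
    for li in lineitems:
        key = li["l_orderkey"]
        if key in order_keys:
            totals[key] = totals.get(key, 0) + li["revenue"]
    return totals


lineitems = [
    {"l_orderkey": 1, "revenue": 10},
    {"l_orderkey": 1, "revenue": 5},
    {"l_orderkey": 2, "revenue": 7},
    {"l_orderkey": 4, "revenue": 3},  # no matching order, dropped by the join
]
order_keys = {1, 2}  # keys surviving (orders x customer)

separate = join_then_aggregate(lineitems, order_keys)
fused = fused_join_aggregate(lineitems, order_keys)
```

Both variants produce the same totals; the fused one simply never builds the intermediate list of joined rows, which is the saving such an operator would aim for.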

q4 and q16 look like the same issue: we should avoid semi project joins whenever possible.
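
The equivalence behind q4 and q16 can be sketched in plain Python. This is only a model of the join semantics, not Velox or Axiom code: the helper names are hypothetical, the left/right distinction (which side is emitted) is ignored, and join keys are assumed to be non-NULL, so null-aware NOT IN behavior is out of scope.

```python
# Illustrative model only: hypothetical helpers, assuming non-NULL join keys.

def semi_project_join(probe, build_keys, key):
    # Emits every probe row together with a boolean "match" flag.
    return [(row, row[key] in build_keys) for row in probe]


def semi_filter_join(probe, build_keys, key):
    # Emits only the probe rows that have a match.
    return [row for row in probe if row[key] in build_keys]


def anti_join(probe, build_keys, key):
    # Emits only the probe rows that have no match.
    return [row for row in probe if row[key] not in build_keys]


orders = [{"o_orderkey": 1}, {"o_orderkey": 2}, {"o_orderkey": 3}]
lineitem_keys = {1, 3}

# q4 shape: semi project join + filter on the flag vs. semi filter join.
projected = semi_project_join(orders, lineitem_keys, "o_orderkey")
via_project_filter = [row for row, match in projected if match]
via_semi_filter = semi_filter_join(orders, lineitem_keys, "o_orderkey")

# q16 shape: semi project join + NOT filter vs. anti join.
via_project_not = [row for row, match in projected if not match]
via_anti = anti_join(orders, lineitem_keys, "o_orderkey")
```

In both cases the project variant carries every probe row and an extra flag column through the join only for a later filter to drop them, while the filter and anti variants emit the final rows directly.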

Questions

Why doesn't Velox have a right anti join?

CC @mbasmanova @pashandor789
