Skip to content

Optimize join ordering in N-table joins where N>2 #36

@arthurhsu

Description

@arthurhsu

Join ordering is a pretty heavy concept in query optimization. The available approaches can get very involved and maybe outside the scope of Lovefield.

As an example of different levels of addressing the issue consider the case of a 3 vs 4 table join.
A 3 table join graph has always the same chain structure as A -> B-> C, where edges in this graph represent a join condition between (leftTable, rightTable).
On the other hand, In a 4 table join case the graph can have different structure (chain, cycle, star, clique etc), and therefore optimizing such graphs is a harder problem.

Need to find the extend to which Lovefield should address the join ordering optimization issue. There are simplifications that can be made, for example considering
left-deep only trees.

At the very least, for the 3table join case, there should be no unnecessary cross-product operations, which with our current approach is not guaranteed (depends on the
order of the tables as supplied in the query).

Arthur researched and found that gridfile or related index will be much better choice in this case than B-Tree.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions