Description
Problem
Currently, we execute joins in the order they were given by the user. If the user does not pay attention, this can cause a significant performance penalty due to an unnecessary explosion of intermediate results.
Solution
We should automatically optimize the join ordering. Ideally, we have cardinality estimates for this from Parquet files or a metadata store, but we should also try to optimize join ordering without meaningful statistics. Possible approaches here include optimization based on partition counts or equivalence sets (https://blobs.duckdb.org/papers/tom-ebergen-msc-thesis-join-order-optimization-with-almost-no-statistics.pdf).
Previous work
We have already experimented with this, but never gotten to a working solution that we could merge (e.g., #809).