Fix OuterJoinRemover so it does not optimize too aggressively#27
Fix OuterJoinRemover so it does not optimize too aggressively#27dallmair wants to merge 1 commit intoterrajobst:mainfrom
Conversation
Here's a query that exhibits different behavior:
SELECT o.OrderID
FROM Orders o
WHERE (
SELECT TOP 1 od2.ShipVia
FROM Orders od2
WHERE od2.OrderID > o.OrderID
) IS NOT NULL
Previously, this failed with a KeyNotFoundException when evaluated,
because the OuterJoinRemover erroneously changed the left semi join to
an inner join.
Reasoning: The OuterJoinRemover checked for any null-rejecting column
anywhere in the query, but it needs to check for only those columns
that are null-rejecting in the respective subtree, i.e. children of the
join node.
|
Thanks! The outer join definitely shouldn't be replaced by an inner join but there is a larger issue here: the code shouldn't crash with a key not found exception. I think the problem is the join orderer which seems to mess up cases where outer references are included. Outer references are only allowed to appear on the right hand side of a join (i.e. right can reference left, but not the other way around). However, the orderer swaps left and right in an illegal fashion, causing the outer reference to become unresolvable later. I'm not entirely sure what the fix for this should be yet. I could either disallow join ordering entirely for joins with outer references or try to handle them when reassembling the join graph. Skipping joins with outer references might be the easiest fix but my concern is that it might stop optimizing joins in certain queries entirely. Alternatively I might ignore the only-left-can-reference-right constraint during optimization entirely and introduce a late phase that enforces it by swapping left and right when necessary. |
0bcf6eb to
12ca856
Compare
Here's a query that exhibits different behavior:
Previously, this failed with a KeyNotFoundException when evaluated,
because the OuterJoinRemover erroneously changed the left semi join to
an inner join.
Reasoning: The OuterJoinRemover checked for any null-rejecting column
anywhere in the query, but it needs to check for only those columns
that are null-rejecting in the respective subtree, i.e. children of the
join node.