Support LEFT OUTER JOIN and RIGHT OUTER JOIN #4122
Open
RobertBrunel wants to merge 3 commits into
alecgrieser
reviewed
May 6, 2026
Collaborator
alecgrieser
left a comment
In general, this seems like a sound approach to me, and it appears to be correct in its aims. I have a few questions about the approach, as well as some documentation and testing suggestions. There are some comments in the `OuterJoinExpression` code that could probably be addressed with either additional comments (perhaps in the PR or perhaps just in GitHub) or potentially more substantive code changes.
RobertBrunel
added a commit
to RobertBrunel/fdb-record-layer
that referenced
this pull request
May 8, 2026
* General cleanup pass over `INNER_JOIN.rst` and `Joins.rst`.
* Rename `INNER_JOIN.rst` to `JOIN.rst`.
* Add documentation of LEFT OUTER JOIN to `JOIN.rst`.

Support for left joins is introduced by PR FoundationDB#4122.
80dd43a to 76e80f0
ab490f9 to 1659dfa
…ectness

`PullUpNullOnEmptyRule` splits a `SelectExpression` featuring a null-on-empty quantifier into two selects. However, it assigns the predicates only to the lower select. To be correct, it needs to apply them to the upper `SelectExpression` as well (“to act on any nulls produced by this quantifier”, as the Javadoc comment on the rule already says). Without this bugfix, WHERE predicates may get incorrectly pushed past the null-on-empty boundary.

Testing:

* Introduce `PullUpNullOnEmptyRuleTest` and add a regression test.

Fixes FoundationDB#4148.
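To make the symptom concrete, here is a minimal Python sketch of why a WHERE predicate on the null-supplying side must not be pushed below the null-on-empty boundary. It models only the query semantics, not `PullUpNullOnEmptyRule` itself; the `left_join` helper, the sample tables `a` and `b`, and the predicate `b.x = 1` are all hypothetical.

```python
def left_join(left, right, on):
    """Left outer join: unmatched left rows are paired with None."""
    out = []
    for l in left:
        matches = [r for r in right if on(l, r)]
        # null-on-empty: an empty match set still yields one row, with None on the right
        for r in (matches or [None]):
            out.append((l, r))
    return out


# Hypothetical sample data.
a = [{"id": 1}, {"id": 2}]
b = [{"id": 1, "x": 1}, {"id": 2, "x": 5}]


def on(l, r):
    """ON a.id = b.id"""
    return l["id"] == r["id"]


def where(r):
    """WHERE b.x = 1; a comparison against a NULL row is not true, so None is rejected."""
    return r is not None and r["x"] == 1


# Correct: the WHERE predicate is evaluated above the boundary, on null-extended rows.
correct = [(l, r) for (l, r) in left_join(a, b, on) if where(r)]

# Incorrect: the WHERE predicate is pushed into the null-supplying side only.
pushed = left_join(a, [r for r in b if where(r)], on)

print(correct)  # one row: a.id = 1 with its match
print(pushed)   # two rows: a.id = 2 survives, null-extended
```

In the pushed-down variant, the row for `a.id = 2` loses its only match and comes back null-extended instead of being filtered out, which is the kind of wrong result the regression test is meant to catch.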
This change introduces support for left and right outer joins.

A dedicated QGM box (`OuterJoinExpression`) represents one outer join. It is strictly binary (unlike the SELECT box) and carries the join type (LEFT/RIGHT/FULL), the ON-clause predicates, and references to the “preserved” and the “null-supplying” quantifiers.

During the rewriting phase, an exploration rule `RewriteOuterJoinRule` canonicalizes the outer join box into two nested select boxes:

* The preserved side is connected to the outer `SelectExpression` through a normal FOREACH quantifier.
* The null-supplying side is wrapped in an inner `SelectExpression` carrying the ON predicates and is connected through a FOREACH quantifier with `nullOnEmpty` set to true.

The rewrite happens during canonicalization so that all subsequent planning rules (predicate push-down, join ordering, implementation) handle the join as a normal nested SELECT. No other rules need to know about `OuterJoinExpression`. The `RewritingCostModel` penalizes any surviving `OuterJoinExpression`, ensuring the rewritten form always wins.

Key changes:

* `OuterJoinExpression` represents the outer join in the QGM.
* `QueryVisitor` parses the `OUTER JOIN` syntax and constructs the logical `OuterJoinExpression`.
* `RewriteOuterJoinRule` rewrites the `OuterJoinExpression` into nested `SelectExpression` boxes; it is registered in `RewritingRuleSet` so it fires during the canonicalization phase.
* `RewritingCostModel` consults a new `ExpressionCountProperty.outerJoinCount` ahead of `selectCount` so an un-rewritten `OuterJoinExpression` is always more expensive than the canonical two-SELECT form.
* Supporting changes in `CardinalitiesProperty`, `RecordTypesProperty`, `LogicalPlanFragment`, and `RelationalExpressionMatchers` to teach existing utilities about `OuterJoinExpression`, so cardinality, ordering, and record-type properties propagate correctly through the new node.
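The cost-model ordering can be pictured as a lexicographic comparison. The Python sketch below is an illustration only: it assumes lower counts are cheaper and that criteria are compared in order, and `ExpressionCounts` and `rewriting_cost_key` are hypothetical names, not the actual `RewritingCostModel` code (which may consult further criteria).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExpressionCounts:
    """Hypothetical stand-in for the counts the cost model would consult."""
    outer_join_count: int
    select_count: int


def rewriting_cost_key(counts: ExpressionCounts) -> tuple:
    # Tuples compare element by element, so outer_join_count dominates select_count.
    return (counts.outer_join_count, counts.select_count)


# One un-rewritten outer join box versus the canonical two-SELECT form.
unrewritten = ExpressionCounts(outer_join_count=1, select_count=0)
canonical = ExpressionCounts(outer_join_count=0, select_count=2)

best = min([unrewritten, canonical], key=rewriting_cost_key)
print(best)  # the canonical form, despite its higher select_count
```

If `select_count` were compared first, the un-rewritten form with zero selects would win, which is why the order of the two criteria matters.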
Testing:

* `join-tests-outer.yamsql` integration tests covering LEFT and RIGHT OUTER JOIN semantics, anti-join patterns, predicate placement (ON vs WHERE) on either side of the join, and predicate push-down into either source.

Resolves FoundationDB#4151.
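For readers unfamiliar with the nested-select form, the following Python sketch models how the canonical form described above evaluates: an outer loop over the preserved side, an inner select carrying the ON predicate, and a null-on-empty step in between. It is a semantic model under stated assumptions, not planner code; `null_on_empty`, `outer_join`, and the sample tables are hypothetical.

```python
def null_on_empty(rows):
    """Model of a FOREACH quantifier with nullOnEmpty: an empty input yields one None."""
    return rows if rows else [None]


def outer_join(join_type, left, right, on):
    """Evaluate a LEFT or RIGHT outer join as two nested selects.

    `on` always takes (left_row, right_row), regardless of the join type.
    """
    if join_type not in ("LEFT", "RIGHT"):
        raise ValueError("this sketch only models LEFT and RIGHT")
    is_left = join_type == "LEFT"
    preserved, null_supplying = (left, right) if is_left else (right, left)
    result = []
    for p in preserved:  # outer select: plain FOREACH over the preserved side
        # inner select: filters the null-supplying side with the ON predicate
        inner = [n for n in null_supplying if (on(p, n) if is_left else on(n, p))]
        for n in null_on_empty(inner):
            # keep the output columns in (left, right) order
            result.append((p, n) if is_left else (n, p))
    return result


# Hypothetical sample data.
customers = [{"id": 1, "name": "Ada"}, {"id": 2, "name": "Bob"}]
orders = [{"cust": 1, "total": 30}]

# customers LEFT OUTER JOIN orders ON customers.id = orders.cust
left = outer_join("LEFT", customers, orders, lambda c, o: c["id"] == o["cust"])

# orders RIGHT OUTER JOIN customers ON customers.id = orders.cust
right = outer_join("RIGHT", orders, customers, lambda o, c: c["id"] == o["cust"])

# Anti-join pattern: LEFT OUTER JOIN ... WHERE orders.cust IS NULL
anti = [c for (c, o) in left if o is None]

print(left)
print(right)
print(anti)
```

Both joins preserve every customer; the customer without orders appears once, paired with `None`, and the anti-join pattern keeps exactly those null-extended rows.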
📊 Metrics Diff Analysis Report

Summary

ℹ️ About this analysis

This automated analysis compares query planner metrics between the base branch and this PR. It categorizes changes into:

The last category in particular may indicate planner regressions that should be investigated.

New Queries

Count of new queries by file:
alecgrieser
approved these changes
May 12, 2026
RobertBrunel
added a commit
to RobertBrunel/fdb-record-layer
that referenced
this pull request
May 13, 2026
* General cleanup pass over `INNER_JOIN.rst` and `Joins.rst`.
* Rename `INNER_JOIN.rst` to `JOIN.rst`.
* Add documentation of OUTER JOIN to `JOIN.rst`.

Support for outer joins is introduced in PR FoundationDB#4122.
RobertBrunel
commented
May 15, 2026
normen662
requested changes
May 18, 2026