Commit 2921112

Rework CIP
- Clear separation between additive and replacing semantics
- Additive semantics for nesting with {}
- Replacing semantics for flat composition
- Use THEN for discard cardinality
- Use WITH|RETURN|YIELD NOTHING for discard fields
1 parent cc176e8 commit 2921112

File tree

1 file changed

+111
-89
lines changed

cip/CIP2016-06-22-nested-subqueries.adoc cip/CIP2016-06-22-nested-updating-and-chained-subqueries.adoc

@@ -1,4 +1,4 @@
1-
= CIP2016-06-22 - Nested subqueries
1+
= CIP2016-06-22 - Nested, updating, and chained subqueries
22
:numbered:
33
:toc:
44
:toc-placement: macro
@@ -9,7 +9,7 @@
99
[abstract]
1010
.Abstract
1111
--
12-
This CIP proposes the incorporation of nested subqueries into Cypher.
12+
This CIP proposes the incorporation of nested, updating, and chained subqueries into Cypher.
1313
--
1414

1515
toc::[]
@@ -21,162 +21,165 @@ Subqueries - i.e. queries within queries - are a powerful and expressive feature
2121

2222
* Increased query expressivity
2323
* Better query construction and readability
24-
* Easier query composition and reuse
24+
* Easier composition of simple query pipelines
2525
* Post-processing results from multiple queries as a single unit
2626
* Performing a sequence of multiple write commands for each record
2727

2828
== Background
2929

30-
This CIP may be viewed in light of the EXISTS CIP, the Scalar Subqueries and List Subqueries CIP, and the Map Projection CIP, all of which propose variants of subqueries.
31-
In contrast, this CIP focusses on subqueries operating at a clause level while the EXISTS CIP and Map Projection CIP propose subqueries operating at an expression level.
30+
This CIP may be viewed in light of CIPs for query combinators and set operations, `EXISTS`, scalar subqueries, and list subqueries.
3231

3332
== Proposal
3433

35-
Nested subqueries are self-contained Cypher queries that are usually run within the scope of an outer Cypher query.
34+
Subqueries are self-contained Cypher queries that are usually run within the scope of an outer, containing Cypher query.
3635

37-
This proposal suggests the introduction of new nested subquery constructs to Cypher.
36+
This proposal suggests the introduction of new subquery constructs to Cypher.
3837

39-
* Read-only nested simple subqueries of the form `{ ... RETURN ... }`
40-
* Read-only nested chained subqueries of the form `THEN { ... RETURN ... }`
41-
* Read-only nested optional subqueries of the form `OPTIONAL { ... RETURN ... }`
42-
* Read-only nested mandatory subqueries of the form `MANDATORY { ... RETURN ... }`
43-
* Read/Write nested simple updating subqueries of the form `DO { ... }` (inner query not ending with `RETURN`)
44-
* Read/Write nested conditionally-updating subqueries of the form `DO [WHEN cond THEN { ... }]+ [ELSE { ... }] END` (inner queries not ending with `RETURN`)
38+
* Read-only nested subqueries
39+
** Read-only nested regular subqueries of the form `MATCH { <reading-query> }`
40+
** Read-only nested optional subqueries of the form `OPTIONAL MATCH { <reading-query> }`
41+
** Read-only nested mandatory subqueries of the form `MANDATORY MATCH { <reading-query> }`
42+
* Read/Write updating subqueries
43+
** Read/Write simple updating subqueries of the form `DO { <updating-query> }` (inner query not ending with `RETURN`)
44+
** Read/Write conditionally-updating subqueries of the form `DO [WHEN <predicate> THEN { <updating-query> }]+ [ELSE { <updating-query> }] END` (inner queries not ending with `RETURN`)
45+
* Chained subqueries
46+
** Chained data-dependent subqueries of the form `<query> <with-clause> <query>`, obtained by extending the `WITH` projection clause. Additionally, this CIP proposes new shorthand syntax for starting a query with `WITH` to compose a query with external inputs.
47+
** Chained data-independent subqueries, obtained by introducing the new `THEN` clause, which discards all variables in scope as well as the cardinality of all input records. Additionally, this CIP proposes new shorthand syntax for discarding all variables in scope without discarding the cardinality of input records using `WITH|RETURN|YIELD NOTHING`.
4548

46-
A nested simple subquery consists of an inner query in curly braces.
49+
We additionally propose removing the `FOREACH` clause from the current language (it is rendered obsolete by the introduction of `DO`).
4750

48-
All other nested subquery constructs are introduced with a keyword in conjunction with an inner query in curly braces.
51+
Subquery constructs are always introduced with one or more keywords in conjunction with an inner query in curly braces.
4952

50-
Nested subqueries may be correlated - i.e. the inner query may use variables from the outer query - or uncorrelated.
53+
Subqueries may be correlated - i.e. the inner query may use variables from the outer query - or uncorrelated.
5154

52-
Nested subqueries can be contained within other nested subqueries at an arbitrary (but finite) depth.
55+
Subqueries can be contained within other subqueries at an arbitrary (but finite) depth.
5356

54-
Read/Write nested subqueries cannot be contained within other read-only nested subqueries.
57+
Read/Write subqueries cannot be contained within other read-only subqueries.
5558

56-
Finally, this CIP proposes new shorthand syntax for starting a query with `WHERE`, along with the ability to specify that no fields are to be returned through the introduction of `WITH -`, `RETURN -`, and `YIELD -`.
5759

60+
=== Read-only nested subqueries
5861

59-
**1. Read-only nested simple subqueries**
62+
Conceptually, a nested subquery is evaluated for each incoming input record and may produce an arbitrary number of output records.
6063

61-
We propose the addition of read-only nested simple subqueries as a new form of read-only Cypher query.
64+
==== Read-only nested regular subqueries
6265

63-
A nested read-only simple subquery is denoted using the following syntax: `{ <inner-query> }`.
66+
We propose the addition of read-only nested regular subqueries as a new form of read-only Cypher query.
6467

65-
The inner query can be any complete read-only Cypher query.
68+
A read-only nested regular subquery is denoted using the following syntax: `MATCH { <inner-query> }`.
6669

67-
A nested read-only simple subquery may only be used as a primary clause, i.e. as a
70+
The inner query can be any complete read-only Cypher query.
6871

69-
* top-level Cypher query,
70-
* inner query of another nested subquery,
71-
* inner query of another expression-level subquery (such as a pattern comprehension, or an `EXISTS` subquery),
72-
* argument query to `UNION` and similar clause-level binary operators
72+
==== Read-only nested optional subqueries
7373

74-
A nested read-only simple subquery may not be used as a secondary clause after a preceding primary clause.
75-
(However, a nested read-only chained subquery may be used in this case.)
74+
We propose extending the `OPTIONAL MATCH` clause to express read-only nested optional subqueries.
7675

76+
A read-only nested optional subquery is denoted by the following syntax: `OPTIONAL MATCH { <inner-query> }`.
7777

78-
**2. Read-only nested chained subqueries**
78+
==== Read-only nested mandatory subqueries
7979

80-
We propose the addition of read-only nested chained subqueries for using nested subqueries in a similar position as a secondary clause.
81-
This is called _subquery chaining_.
80+
We propose extending the `MANDATORY MATCH` clause to express read-only nested mandatory subqueries.
8281

83-
After a chain of clauses that together form a query, a new nested chained subquery may be introduced as a secondary clause using the `THEN` keyword followed by an inner query in curly braces, i.e. it is denoted using the following syntax: `... THEN { <inner-query> }`.
84-
`THEN` is a query combinator and more details may be found in the Query Combinator CIP.
82+
A read-only nested mandatory subquery is denoted by the following syntax: `MANDATORY MATCH { <inner-query> }`.
8583

84+
==== Semantics
8685

87-
**3. Read-only nested optional subqueries**
86+
The nested subquery will be provided with all variables visible in the outer query as subquery input.
8887

89-
We propose the addition of a new `OPTIONAL` clause for expressing read-only nested optional subqueries.
88+
All records returned by the final `RETURN` clause of the subquery will be augmented with the variable bindings of the initial input record from the outer query to form the output records of the subquery.
89+
No other variable bindings will be added to the output records.
90+
If an incoming variable is either discarded or shadowed within the subquery, an error will be raised if the subquery returns that variable to the outer query.
9091

91-
A read-only nested optional subquery is denoted by the following syntax: `OPTIONAL { <inner-query> }`.
92+
Finally, the result records of the different forms of nested subqueries are formed as follows:
9293

94+
* The result records of a read-only regular subquery are just the output records.
95+
* The result records of a read-only optional subquery are all the output records (if there is at least one output record), or a single record with the same fields as the output records where all newly introduced variable bindings are set to `NULL`.
96+
* The result records of a read-only mandatory subquery are just the output records. However, if the set of output records is empty, an error is raised in the same way as regular `MANDATORY MATCH`.
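
The three result-forming rules above can be modelled outside Cypher. The following Python sketch is illustrative only and is not part of the proposal: records are plain dicts, `inner` stands in for the subquery, and `nested_subquery`, `new_fields`, and `MandatoryMatchError` are hypothetical names. It assumes that the mandatory check applies per input record, and it does not model the error raised for discarded or shadowed variables.

```python
from typing import Callable, Dict, List, Sequence

Record = Dict[str, object]


class MandatoryMatchError(Exception):
    """Hypothetical error type standing in for a failed MANDATORY MATCH."""


def nested_subquery(
    inputs: Sequence[Record],
    inner: Callable[[Record], List[Record]],
    mode: str = "regular",
    new_fields: Sequence[str] = (),
) -> List[Record]:
    """Model MATCH { ... }, OPTIONAL MATCH { ... } and MANDATORY MATCH { ... }."""
    results: List[Record] = []
    for record in inputs:
        # Output records: each returned record augmented with the input bindings.
        outputs = [{**record, **returned} for returned in inner(record)]
        if not outputs and mode == "optional":
            # One record with every newly introduced variable set to None (NULL).
            outputs = [{**record, **{name: None for name in new_fields}}]
        elif not outputs and mode == "mandatory":
            # Assumption: the emptiness check is made per input record.
            raise MandatoryMatchError(f"no output records for input {record!r}")
        results.extend(outputs)
    return results


def attendees(record):
    # Stand-in for an inner query; only 'Petra' produces matches.
    return [{"name": "Ann"}, {"name": "Cy"}] if record["p"] == "Petra" else []


people = [{"p": "Petra"}, {"p": "Bob"}]
regular = nested_subquery(people, attendees)
optional = nested_subquery(people, attendees, mode="optional", new_fields=["name"])
```

In this model the regular form drops the input record for which the inner query finds nothing, the optional form keeps it with `name` bound to `None`, and the mandatory form raises.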
9397

94-
**4. Read-only nested mandatory subqueries**
98+
Nested subqueries interact with write clauses in the same way as `MATCH` does.
9599

96-
We propose the addition of a new `MANDATORY` clause for expressing read-only nested mandatory subqueries.
97100

98-
A read-only nested mandatory subquery is denoted by the following syntax: `MANDATORY { <inner-query> }`.
101+
=== Read/Write updating subqueries
99102

103+
Updating subqueries never change the cardinality; i.e. the inner update query is run for each incoming input record.
100104

101-
**4. Read/Write nested simple updating subqueries**
105+
==== Read/Write simple updating subqueries
102106

103-
We propose the addition of a new `DO` clause for expressing read/write nested simple updating subqueries that _do not return any data_.
107+
We propose the addition of a new `DO` clause for expressing read/write simple updating subqueries that _do not return any data_ from the inner query.
104108

105-
A read/write nested simple updating subquery is denoted by the following syntax: `DO { <inner-update-query> }`.
109+
A read/write simple updating subquery is denoted by the following syntax: `DO { <inner-update-query> }`.
106110

107111
Any updating Cypher query from which the trailing final `RETURN` clause has been omitted may be used as an inner update query.
108112

109-
We additionally propose removing the `FOREACH` clause from the current language as it is rendered obsolete by the introduction of `DO`.
110-
113+
A query may end with a `DO` subquery in the same way that a query can currently end with any update clause.
111114

112-
**5. Read/Write nested conditionally-updating subqueries**
115+
==== Read/Write conditionally-updating subqueries
113116

114-
We propose the addition of a second form of the `DO` clause for expressing read/write nested conditionally-updating subqueries that _do not return any data_.
117+
We propose the addition of a new conditional `DO` clause for expressing read/write conditionally-updating subqueries that _do not return any data_ from the inner query.
115118

116-
A read/write nested conditionally-updating subquery is denoted by the following syntax:
119+
A read/write conditionally-updating subquery is denoted by the following syntax:
117120

118121
```
119122
DO
120-
[WHEN <cond> THEN <inner-update-query>]+
123+
[WHEN <predicate> THEN <inner-update-query>]+
121124
[ELSE <inner-update-query>]
122125
END
123126
```
124127

125-
126128
Evaluation proceeds as follows:
127129

128-
* Semantically, the `WHEN` conditions are tested in the order given, and the inner updating query is executed for only the first condition that evaluates to `true`.
129-
* If no given `WHEN` condition evaluates to `true` and an `ELSE` branch is provided, the inner updating query of the `ELSE` branch is executed.
130-
* If no given `WHEN` condition evaluates to `true` and no `ELSE` branch is provided, no updates will be executed.
130+
* Semantically, the `WHEN` predicates are tested in the order given, and the inner updating query is executed for only the first predicate that evaluates to `true`.
131+
* If no given `WHEN` predicate evaluates to `true` and an `ELSE` branch is provided, the inner updating query of the `ELSE` branch is executed.
132+
* If no given `WHEN` predicate evaluates to `true` and no `ELSE` branch is provided, no updates will be executed.
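
The evaluation order described above amounts to first-match dispatch per input record. A minimal Python sketch of that reading follows; it is not the proposal's implementation, and `conditional_do`, `branches`, and `else_branch` are hypothetical names. Predicates are compared against `True` explicitly so that a `None` (standing in for `NULL`) never selects a branch.

```python
from typing import Callable, Dict, Iterable, List, Optional, Tuple

Record = Dict[str, object]
Predicate = Callable[[Record], Optional[bool]]
Update = Callable[[Record], None]


def conditional_do(
    records: Iterable[Record],
    branches: List[Tuple[Predicate, Update]],
    else_branch: Optional[Update] = None,
) -> List[Record]:
    """Run at most one update per record; return the records unchanged in number."""
    passed_on = list(records)
    for record in passed_on:
        for predicate, update in branches:
            if predicate(record) is True:
                update(record)
                break  # only the first predicate that is true fires
        else:
            # No WHEN predicate was true: fall back to ELSE if one was given.
            if else_branch is not None:
                else_branch(record)
    return passed_on


log = []
out = conditional_do(
    [{"x": 0}, {"x": 1}, {"x": 2}],
    branches=[
        (lambda r: r["x"] % 2 == 1, lambda r: log.append(("odd", r["x"]))),
        (lambda r: r["x"] > 0, lambda r: log.append(("positive", r["x"]))),
    ],
    else_branch=lambda r: log.append(("else", r["x"])),
)
```

The record with `x = 1` satisfies both predicates, but only the first branch runs for it. The number of records returned equals the number passed in.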
131133

134+
A query may end with a conditional `DO` subquery in the same way that a query can currently end with any update clause.
132135

133-
**6. Shorthand syntax**
134136

135-
We propose the addition of a new clause `WHERE <cond> <subclauses>` as a shorthand syntax for `WITH * WHERE <cond> THEN { <subclauses> }`.
136-
The idea is for this to be used exclusively as a primary clause; for example, as the first clause of a nested subquery.
137+
=== Chained subqueries
137138

138-
We propose the addition of a new projection clauses of the form `WITH -` and `RETURN -`, which will retain the input cardinality but project no result fields.
139-
This allows for *only* checking the cardinality in a read-only nested mandatory subquery.
139+
==== Chained data-dependent subqueries
140140

141-
We propose the addition of a new subclause to `CALL` of the form `YIELD -`, which will retain the output cardinality of a call but project no result fields.
142-
This allows for *only* checking the cardinality in an `EXISTS` subquery.
141+
We propose extending the `WITH` projection clause to sequentially compose arbitrary queries to form a chained data-dependent subquery without resorting to nesting and indentation (e.g. as a short-hand syntax for post-UNION processing).
143142

143+
Chained data-dependent subqueries have the following general form `<Q1> WITH ... <Q2>`.
144144

145-
=== Semantic clarification
145+
Both `<Q1>` and `<Q2>` are arbitrary, complete Cypher queries.
146146

147-
**1. Read-only nested subqueries**
147+
Conceptually, the query `<Q2>` is evaluated for each incoming input record from the query `<Q1>` and may produce an arbitrary number of result records.
148+
In other words, the query `<Q2>` will be provided with all variables returned by the query `<Q1>` as input variable bindings.
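
As a rough model of this data-dependent chaining, the Python sketch below runs a stand-in for `<Q2>` once per record projected from `<Q1>` and concatenates the results. The names `chain_with`, `project`, `q1`, and `q2` are hypothetical, and the sketch assumes the replacing semantics for flat composition mentioned in the commit message, i.e. the result consists only of what `<Q2>` returns.

```python
from typing import Callable, Dict, Iterable, List

Record = Dict[str, object]


def chain_with(
    q1: Callable[[], Iterable[Record]],
    project: Callable[[Record], Record],
    q2: Callable[[Record], List[Record]],
) -> List[Record]:
    """Model `<Q1> WITH ... <Q2>`: evaluate q2 for each projected record of q1."""
    results: List[Record] = []
    for record in q1():
        # Only the variables kept by the WITH projection are visible to q2.
        results.extend(q2(project(record)))
    return results


def q1():
    return [{"x": 1, "tmp": "a"}, {"x": 2, "tmp": "b"}]


def keep_x(record):
    return {"x": record["x"]}


def q2(record):
    # Produces record["x"] output rows per input row, to show varying cardinality.
    cities = ("Oslo", "London")[: record["x"]]
    return [{"x": record["x"], "c": city} for city in cities]


rows = chain_with(q1, keep_x, q2)
```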
148149

149-
Conceptually, a nested subquery is evaluated for each incoming record and may produce an arbitrary number of result records.
150+
Furthermore, this CIP proposes allowing a leading `WITH` to project variables from expressions that refer to unbound variables from the preceding scope (or query).
151+
This set of referenced, unbound variables of such a leading `WITH` is understood to implicitly declare the input variables required for the query to execute.
150152

151-
The rules regarding variable scoping are detailed as follows:
153+
Note:: This mechanism allows composing a Cypher query with inputs that have been constructed programmatically.
152154

153-
* All incoming variables remain in scope throughout the whole subquery.
154-
* When evaluating the subquery, any new variable bindings introduced by the final `RETURN` clause will augment the variable bindings of the initial record.
155-
* It is valid (though redundant) if incoming variables from the outer scope are passed on explicitly by any projection clause of the subquery (including the final `RETURN`).
156-
* Nested subqueries therefore cannot shadow variables present in the outer scope, and thus behave in the same way as `UNWIND` and `CALL` with regard to the introduction of new variable bindings.
157-
* Any other variable bindings that are introduced temporarily in the subquery will not be visible to the outer scope.
155+
==== Chained data-independent subqueries
158156

159-
Subqueries interact with write clauses in the same way as `MATCH` does.
157+
We propose introducing the `THEN` projection clause to sequentially compose two arbitrary subqueries to form a chained data-independent subquery without resorting to nesting and indentation.
160158

159+
Chained data-independent subqueries have the following general form `<Q1> THEN <Q2>`.
161160

162-
**2. Read/Write subqueries**
161+
Both `<Q1>` and `<Q2>` are arbitrary, complete Cypher queries.
162+
No variables and no input records are passed from `<Q1>` to `<Q2>`.
163+
Instead `<Q2>` is executed in a standalone fashion after the execution of `<Q1>` has finished.
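
The contrast with `WITH` chaining can be sketched in Python under the same modelling assumptions: the stand-in for `<Q1>` is run to completion for its side effects, and the stand-in for `<Q2>` then runs exactly once, whatever `<Q1>` produced. `chain_then` and the helper functions are hypothetical names used only for illustration.

```python
from typing import Callable, Dict, Iterable, List

Record = Dict[str, object]


def chain_then(
    q1: Callable[[], Iterable[Record]],
    q2: Callable[[], Iterable[Record]],
) -> List[Record]:
    """Model `<Q1> THEN <Q2>`: discard q1's variables and cardinality."""
    for _ in q1():
        pass  # consume q1 fully; none of its records reach q2
    return list(q2())


calls = []


def q1_empty():
    calls.append("q1")
    return []


def q1_three():
    calls.append("q1")
    return [{"n": i} for i in range(3)]


def q2():
    calls.append("q2")
    return [{"done": True}]


after_empty = chain_then(q1_empty, q2)
after_three = chain_then(q1_three, q2)
```

Both calls return the same single record, which is the guarantee described in the text: `<Q2>` executes once whether `<Q1>` yields zero records or several.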
163164

164-
Execution of a `DO` subquery does not change the cardinality; i.e. the inner update query is run for each incoming record.
165+
Furthermore, this CIP proposes allowing queries to start with a leading `THEN` for discarding all variables in scope as well as the cardinality of all input records provided by the surrounding execution environment.
165166

166-
Any input record is always passed on to the clause succeeding the `DO` subquery, irrespective of whether it was eligible for processing by any inner update query.
167+
Note:: This mechanism allows guaranteed execution of `<Q2>` irrespective of the number of records produced by `<Q1>`.
167168

168-
A `DO` clause that uses `WHEN` sub-clause is called a _conditional DO_.
169+
Note:: In general, `<Q1>` is expected to be an updating query and it is recommended that implementations generate a warning if this is not the case (to inform the user that `<Q1>` is essentially superfluous).
169170

170-
A query may end with a `DO` subquery in the same way that a query can currently end with any update clause.
171+
==== Discarding variables in scope
172+
173+
Finally, this CIP proposes new shorthand syntax for discarding all variables in scope without discarding the cardinality of input records using `WITH|RETURN|YIELD NOTHING`.
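
In the same illustrative Python model, discarding fields while keeping cardinality corresponds to mapping every record to an empty record. `project_nothing` is a hypothetical name, not proposed syntax.

```python
from typing import Dict, Iterable, List

Record = Dict[str, object]


def project_nothing(records: Iterable[Record]) -> List[Record]:
    """Model `WITH NOTHING`: keep one (empty) record per input record."""
    return [{} for _ in records]


kept = project_nothing([{"a": 1}, {"a": 2}, {"a": 3}])
```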
171174

172175
=== Examples
173176

174-
**1. Read-only nested simple and chained subqueries**
177+
==== Read-only nested regular subqueries
175178

176179
Post-UNION processing:
177180
[source, cypher]
178181
----
179-
{
182+
MATCH {
180183
// authored tweets
181184
MATCH (me:User {name: 'Alice'})-[:FOLLOWS]->(user:User),
182185
(user)<-[:AUTHORED]-(tweet:Tweet)
@@ -197,7 +200,7 @@ Uncorrelated nested subquery:
197200
[source, cypher]
198201
----
199202
MATCH (f:Farm {id: $farmId})
200-
THEN {
203+
MATCH {
201204
MATCH (u:User {id: $userId})-[:LIKES]->(b:Brand),
202205
(b)-[:PRODUCES]->(p:Lawnmower)
203206
RETURN b.name AS name, p.code AS code
@@ -214,7 +217,7 @@ Correlated nested subquery:
214217
[source, cypher]
215218
----
216219
MATCH (f:Farm {id: $farmId})-[:IS_IN]->(country:Country)
217-
THEN {
220+
MATCH {
218221
MATCH (u:User {id: $userId})-[:LIKES]->(b:Brand),
219222
(b)-[:PRODUCES]->(p:Lawnmower)
220223
RETURN b.name AS name, p.code AS code
@@ -233,7 +236,7 @@ Filtered and correlated nested subquery:
233236
----
234237
MATCH (f:Farm)-[:IS_IN]->(country:Country)
235238
WHERE country.name IN $countryNames
236-
THEN {
239+
MATCH {
237240
MATCH (u:User {id: $userId})-[:LIKES]->(b:Brand),
238241
(b)-[:PRODUCES]->(p:Lawnmower)
239242
RETURN b AS brand, p.code AS code
@@ -253,9 +256,9 @@ Doubly-nested subquery:
253256
[source, cypher]
254257
----
255258
MATCH (f:Farm {id: $farmId})
256-
THEN {
259+
MATCH {
257260
MATCH (c:Customer)-[:BUYS_FOOD_AT]->(f)
258-
THEN {
261+
MATCH {
259262
MATCH (c)-[:RETWEETS]->(t:Tweet)<-[:TWEETED_BY]-(f)
260263
RETURN c, count(*) AS count
261264
UNION
@@ -271,23 +274,23 @@ THEN {
271274
RETURN f.name AS name, type, sum(endorsement) AS endorsement
272275
----
273276

274-
**2. Read-only nested optional match and mandatory subqueries**
277+
==== Read-only nested optional and mandatory subqueries
275278

276279
This proposal also provides nested subquery forms of `OPTIONAL MATCH` and `MANDATORY MATCH`:
277280

278281
[source, cypher]
279282
----
280283
MANDATORY MATCH (p:Person {name: 'Petra'})
281284
MANDATORY MATCH (conf:Conference {name: $conf})
282-
MANDATORY {
283-
WHERE conf.impact > 5
285+
MANDATORY MATCH {
286+
WITH * WHERE conf.impact > 5
284287
MATCH (p)-[:ATTENDS]->(conf)
285288
RETURN conf
286289
UNION
287290
MATCH (p)-[:LIVES_IN]->(:City)<-[:IN]-(conf)
288291
RETURN conf
289292
}
290-
OPTIONAL {
293+
OPTIONAL MATCH {
291294
MATCH (p)-[:KNOWS]->(a:Attendee)-[:PUBLISHED_AT]->(conf)
292295
RETURN a.name AS name
293296
UNION
@@ -298,7 +301,7 @@ RETURN name
298301
----
299302

300303

301-
**3. Read/Write nested simple and conditionally-updating subqueries**
304+
==== Read/Write simple updating and conditionally-updating subqueries
302305

303306
We illustrate these by means of an 'old' version of the query, in which `FOREACH` is used, followed by the 'new' version, using `DO`.
304307

@@ -376,12 +379,31 @@ DO WHEN x % 2 = 1 THEN {
376379
END
377380
----
378381

382+
==== Chained subqueries
383+
384+
Combining nested and chained subqueries:
385+
[source, cypher]
386+
----
387+
MATCH (x)-[:IN]->(:Category {name: "A"})
388+
WITH x LIMIT 5
389+
MATCH (x)-[:FROM]-(c :City)
390+
RETURN x, c
391+
UNION
392+
MATCH (x)-[:IN]->(:Category {name: "A"})
393+
WITH x LIMIT 10
394+
MATCH (x)-[:FROM]-(c :City)
395+
// This finishes the right arm of the UNION
396+
RETURN x, c
397+
// This applies to the whole UNION
398+
WITH x.name AS name ORDER BY x.age
399+
RETURN name LIMIT 10
400+
----
379401

380402
=== Interaction with existing features
381403

382404
Apart from the suggested deprecation of the `FOREACH` clause, nested read-only, write-only and read-write subqueries do not interact directly with any existing features.
383405

384-
=== Alternatives
406+
== Alternatives
385407

386408
Alternative syntax has been considered during the production of this document:
387409
