cip/CIP2017-06-18-multiple-graphs.adoc
+102-51
@@ -15,12 +15,7 @@ This CIP proposes to extend Cypher with support for the construction, transforma
15
15
toc::[]
16
16
17
17
```
18
-
TODO:
19
-
20
-
* Parameter handling
21
-
* Graph name syntax
22
-
* Precise update semantics
23
-
* Entity identity
18
+
* Composition Semantics
24
19
```
25
20
26
21
== Motivation
@@ -68,12 +63,24 @@ An entity is considered to be deleted if it is no longer part of any graph.
68
63
69
64
=== Graph Addressing
70
65
71
-
Graphs do not expose an identity like nodes or relationships. They may however be made addressable through other means by a conforming implementation (e.g. through exposing the graph under a _Graph URL_).
66
+
Graphs do not expose an identity like nodes or relationships do.
67
+
68
+
Graphs may be made addressable through other means by a conforming implementation (e.g. through exposing the graph under a _graph URL_ for referencing and loading it).
69
+
The details regarding the format and choice of graph URLs are outside the scope of this proposal.
70
+
71
+
A graph is considered to have been deleted if it is no longer registered under a graph URL and no other reference to it is retained (e.g. from a running query).
72
+
73
+
=== Entity Identity
72
74
73
-
The details of such a mechanism are out of scope of this proposal.
75
+
In the single property graph model, nodes and relationships are commonly identified by a single integer id.
76
+
This model was originally not designed for sharing entities between many different graphs while ensuring that entity ids are unique.
74
77
75
-
However, a graph is considered to have been deleted if it is no longer registered under a Graph URL and no other reference to it is retained (e.g. from a running query).
78
+
In the multiple property graphs model, entities are additionally implicitly associated with a _graph space_ that makes it possible to distinguish between entities with the same original id from different sources (e.g. different databases or even snapshots of the same database).
76
79
80
+
In the multiple property graphs model, no graph may contain two entities from the same graph space that have the same original id.
81
+
82
+
Graph spaces may be made identifiable by a conforming implementation by assigning a _graph URI_ to them.
83
+
The details regarding the format and choice of graph URIs are outside the scope of this proposal.
77
84
78
85
== Background: Single Graph Execution Model
79
86
@@ -109,8 +116,10 @@ This CIP proposes to redefine the *execution context* to be
109
116
This CIP proposes to redefine the *query context* to be
110
117
111
118
* a set of named graphs from the *execution context*
112
-
* an optional information that indicates which of these named graphs is the current *source graph*
113
-
* an optional information that indicates which of these named graphs is the current *target graph*
119
+
* a special graph drawn from the execution context that is called the *default source graph*
120
+
* a special graph drawn from the execution context that is called the *default target graph*
121
+
* optional information that indicates which of these named graphs, if any, is the *returned source graph*
122
+
* optional information that indicates which of these named graphs, if any, is the *returned target graph*
114
123
* optional *tabular data*, i.e. a potentially ordered bag of records, each having the same fixed set of fields
115
124
116
125
These redefinitions constitute the multiple graphs execution model. A parameterized Cypher query under this model can _also_ be described as executing within (and operating on) a given execution context and an initial query context and finally returning the query context produced as output for the top-most `RETURN` clause.
@@ -140,7 +149,7 @@ A query `Q1` whose output signature is an acceptable (in terms of provided bindi
140
149
141
150
This homogeneous query composition is enabled by using a uniform query context that is passed between clauses.
142
151
143
-
Note: The currently drafted subquery CIP proposes a language addition (e.g. `THEN`) for expressing this kind of query composition directly.
152
+
Note: The currently drafted subquery CIP proposes a language addition (e.g. `THEN`) for expressing this kind of query composition directly. In terms of this CIP, `THEN` is simply syntactic sugar for `WITH * GRAPHS *`.
144
153
145
154
=== Query combinators
146
155
@@ -188,27 +197,43 @@ This CIP proposes the following kinds of graph specifiers:
188
197
189
198
* `NEW GRAPH [<new-graph-name>] [AT <graph-url>]`: Reference to a newly created, empty graph that is to be bound as `<new-graph-name>` and may potentially overwrite any pre-existing graph at the provided `<graph-url>`
190
199
* `GRAPH [<new-graph-name>] AT <graph-url>`: Reference to the graph at the given `<graph-url>` that is to be bound as `<new-graph-name>`
191
-
* `GRAPH <graph-name> [AS <new-graph-name>]`: Reference to an already bound named graph
192
-
* `SOURCE GRAPH [AS <new-graph-name>]`: Reference to the currently _provided source graph_, optionally to be bound as `<new-graph-name>`
193
-
* `TARGET GRAPH [AS <new-graph-name>]`: Reference to the currently _provided target graph_, optionally to be bound as `<new-graph-name>`
200
+
* `[GRAPH] <graph-name> [AS <new-graph-name>]`: Reference to an already bound named graph
201
+
* `COPY [GRAPH] <graph-name> [AS <new-graph-name>]`: Reference to a copy of an already bound named graph
202
+
* `SOURCE GRAPH [<new-graph-name>]`: Reference to the currently _provided source graph_, optionally to be bound as `<new-graph-name>`
203
+
* `TARGET GRAPH [<new-graph-name>]`: Reference to the currently _provided target graph_, optionally to be bound as `<new-graph-name>`
194
204
195
205
If a graph specifier is not referencing an already bound named graph and does not specify a `<new-graph-name>`, it is bound to a fresh system generated name.
196
206
The details of this are left to implementations.
197
207
198
208
It is an error to use a `<graph-specifier>` in a context where its introduced `<new-graph-name>` is already bound.
199
209
200
-
=== Changing back to the default graph
210
+
==== Graph names
211
+
212
+
Graph names use the same syntax as existing variable names.
213
+
214
+
It is an error to use the same name for both a regular variable and a graph.
201
215
202
-
Additionally, this CIP proposes new syntax for changing the source and the target graph of the current query back to the the default graph provided by the outer execution context:
216
+
==== Graph URLs
217
+
218
+
The exact shape and form of graph URLs lie outside the scope of this CIP.
219
+
220
+
This CIP however proposes that a `<graph-url>` must always be given as either a string literal or a query parameter.
221
+
222
+
This allows parameterization of queries by controlling which graphs from which graph URLs they should use.
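As a minimal sketch of the rule above, the following fragment gives a graph URL once as a string literal and once as a query parameter. The URL value, the parameter name `$resultUrl`, and the graph names `people` and `result` are hypothetical, and the fragment assumes that `FROM` and `INTO` accept a graph specifier in the way `INTO NEW GRAPH ...` is used in the Examples section.

[source, cypher]
----
// graph URL given as a string literal (the URL format itself is out of scope)
FROM GRAPH people AT "graph://example/people"
// graph URL given as a query parameter supplied by the caller
INTO NEW GRAPH result AT $resultUrl
----

Running the same query text with a different value for the hypothetical `$resultUrl` parameter would then write the result to a different graph URL.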
223
+
224
+
=== Changing back to no graph
225
+
226
+
Additionally, this CIP proposes new syntax for discarding the source and the target graph of the current query:
203
227
204
228
[source, cypher]
205
229
----
230
+
FROM -
231
+
INTO -
206
232
----
207
233
208
-
`DEFAULT GRAPH` is not a graph specifier; rather this syntax is a special form for discarding the current source and target graph such that the provided source and target graph are again chosen to be the default graph as specified for partial query contexts.
209
-
210
-
In consequence, both `FROM DEFAULT GRAPH` and `INTO DEFAULT GRAPH` without an explicitly given `<new-graph-name>` will not bind the default graph to a generated fresh name.
234
+
`-` is not a graph specifier; rather this syntax is a special form for discarding the current source and target graph such that the provided source and target graph are again chosen to be the default graph as specified for partial query contexts.
211
235
236
+
In consequence, both `FROM -` and `INTO -` will not bind the default graph to a generated fresh name.
212
237
This is different from `<graph-specifier>` semantics that will ensure that referenced graphs are always bound to a name.
213
238
214
239
=== Returning, aliasing, and selecting graphs
@@ -218,33 +243,35 @@ The newly proposed syntax is:
This CIP proposes the following kinds of `<graph-return-items>`:
226
251
227
-
* `<graph-item-list`: A comma separated list of `<graph-return-item>` (defined below) that are to be passed on
252
+
* `<graph-specifier-list>`: A comma separated list of `<graph-specifier>` that are to be passed on
228
253
* `*`: All named graphs are to be passed on
229
-
* `*, <graph-item-list>`: All named graphs are to be passed on together with any additional named graphs that are newly bound in `<graph-item-list>`
254
+
* `*, <graph-specifier-list>`: All named graphs are to be passed on together with any additional named graphs that are newly bound in `<graph-specifier-list>`
230
255
* `-`: No named graphs are to be passed on
231
256
232
-
The order of named graphs inherently given by `<graph-return-items` is semantically insignificant.
257
+
The order of named graphs inherently given by `<graph-return-items>` is semantically insignificant.
233
258
However it is recommended that conforming implementations preserve this order at least in programmatic output operations (e.g. a textual display of the list of returned graphs).
234
259
This in essence mirrors the semantics for tabular data returned by Cypher.
235
260
236
-
This CIP proposes the introduction of the following kinds of graph return items that may be included in a `<graph-item-list>`:
261
+
Both `WITH ... GRAPHS ...` and `RETURN ... GRAPHS ...` will pass on (or return respectively) exactly the set of described named graphs.
262
+
To simplify passing on available graphs it is proposed by this CIP that regular `WITH <return-items>` is taken to be syntactic sugar for `WITH <return-items> GRAPHS -` and that regular `RETURN <return-items>` is taken to be syntactic sugar for `RETURN <return-items> GRAPHS -`.
237
263
238
-
* `<graph-specifier>`: Any graph that is described by a `<graph-specifier>` may be passed on under the provided `<new-graph-name>` (unless the given graph is an un-aliased already existing graph, it which case it's passed on with it's existing name)
To even further simplify, it is additionally proposed that `WITH|RETURN <return-items> INPUT GRAPHS <graph-return-items>` is to be syntactic sugar for `WITH|RETURN <return-items> GRAPHS <graph-return-items>, SOURCE GRAPH, TARGET GRAPH`.
265
+
However if `<graph-return-items>` already passes on a reference for the `SOURCE GRAPH`, no additional reference for it is added and if `<graph-return-items>` already passes on a reference for the `TARGET GRAPH`, no additional reference for it is added.
240
266
241
-
Both `WITH` and `RETURN` will pass on (or return respectively) exactly the set of described named graphs.
242
267
If the current named source graph (or the current named target graph) is not passed on, it is discarded and, due to the rules regarding partial query contexts, the provided source graph (or target graph respectively) is again chosen to be the default graph of the outer execution context.
243
268
269
+
Note: `WITH <return-items> GRAPHS *` may be used to pass through the initial query context without having to alias source and target graphs explicitly.
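The proposed desugaring rules can be illustrated with the following pairs, where each pair is intended to be equivalent under this CIP. This is only a sketch: the variable `a` and the graph name `berlin` are hypothetical and are assumed to be bound by preceding clauses.

[source, cypher]
----
// a regular RETURN passes on no named graphs
RETURN a.name AS name
RETURN a.name AS name GRAPHS -

// INPUT GRAPHS adds the source and target graph to the listed graphs
WITH a INPUT GRAPHS berlin
WITH a GRAPHS berlin, SOURCE GRAPH, TARGET GRAPH
----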
270
+
244
271
=== Discarding available tabular data
245
272
246
-
It is additionally proposed that both `WITH GRAPHS <graph-return-items>` and `RETURN GRAPHS <graph-return-items>` are
247
-
special forms for discarding all tabular data such that the provided tabular input for the following clause (or query respectively) would again be the provided single record without any fields as specified by the rules for partial query contexts.
273
+
It is additionally proposed that both `WITH GRAPHS <graph-return-items>` and `RETURN GRAPHS <graph-return-items>` are syntactic sugar for `WITH - GRAPHS <graph-return-items>` (and `RETURN - GRAPHS <graph-return-items>` respectively).
274
+
These special forms may be used for discarding all tabular data such that the provided tabular input for the following clause (or query respectively) would again be the provided single record without any fields as specified by the rules for partial query contexts.
248
275
249
276
Note: This syntax may be used to indicate when the gradual construction of a named graph is finished since neither fields nor the cardinality of tabular data is preserved after this point.
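As a short sketch, the two clauses below are meant to be equivalent under the proposed rule. The graph names `berlin` and `santiago` are borrowed from the Examples section and are assumed to be bound already.

[source, cypher]
----
// both forms discard all tabular data and pass on only the listed graphs
WITH GRAPHS berlin, santiago
WITH - GRAPHS berlin, santiago
----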
A `<graph-construction-subquery>` is an updating subquery (i.e. a sequence of clauses, including update clauses) that may or may not end in `RETURN`.
267
294
All variables bound before the nested `FROM` and `INTO` subqueries are made visible to the `<graph-construction-subquery>`.
268
295
All variables bound at the end of the `<graph-construction-subquery>` are made visible to the remaining outer query.
269
296
270
-
These forms have the exact same effect as creating aliases for the current source and target graph, then changing the current source and target graph as specified before executing the given `<graph-construction-subquery>`, and finally restoring the original source and target graphs using the aliases followed by discarding those aliases from the current scope.
297
+
These forms have the exact same effect as creating fresh aliases for the current source and target graph, then changing the current source and target graph as specified before executing the given `<graph-construction-subquery>`, and finally restoring the original source and target graphs using the aliases followed by discarding those aliases from the current scope.
298
+
299
+
=== Updating graphs
300
+
301
+
This CIP proposes the following update semantics for Cypher with support for multiple graphs.
302
+
303
+
Entities are always created in and deleted from the currently provided target graph.
304
+
305
+
Semantically, all effects of an updating clause must be made visible before proceeding with the execution of the next clause.
306
+
In other words, a conforming implementation must ensure that a later clause always sees the complete set of updates of a preceding updating clause.
307
+
308
+
A single update clause may perform multiple conflicting updates on the same node or relationship.
309
+
In this situation, the outcome is undefined.
310
+
311
+
Conflicting updates are considered to be out of scope of this CIP.
312
+
313
+
For now it is proposed that a conforming implementation must choose either the original value, one of the values written, or `NULL` as the final outcome of a conflicting update.
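The following sketch illustrates both points under assumed data. The labels, the property names, and the graph name `friends` are hypothetical.

[source, cypher]
----
// the new node is created in the current target graph, here the new graph `friends`
INTO NEW GRAPH friends
CREATE (:Person {name: "Alice"})
----

[source, cypher]
----
// if a person `a` has several friends, this single SET clause writes
// several different values to a.lastFriend, i.e. a conflicting update
MATCH (a:Person)-[:FRIEND]->(b:Person)
SET a.lastFriend = b.name
----

In the second fragment, the proposed rule would permit `a.lastFriend` to end up as its original value, as the name of any one of the matched friends, or as `NULL`.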
271
314
272
315
=== Query signature declarations
273
316
274
-
Finally this CIP proposed using the `WITH` clause as the initial clause in a query for declaring all query input arguments:
317
+
Finally this CIP proposes using the `WITH` clause as the initial clause in a query for declaring all query inputs:
It is proposed that using `WITH` as the initial clause here is to be called a *query input declaration* while the use of `RETURN` as the last clause is to be called a *query output declaration* henceforth.
325
+
It is proposed that using `WITH` as the initial clause in a query is to be called a *query input declaration* while the use of `RETURN` as the last clause is to be called a *query output declaration*.
282
326
283
327
Query input declarations are subject to the following limitations:
284
328
285
-
* All return items are expected to be over an imagined set of input variables from the previous query
286
-
* All such referenced variables must be declared or aliased explicitly by another return item
287
-
* The use of `WITH *` and `WITH *, ...` causes all undeclared incoming variables to be renamed to fresh system generated variable names
288
-
* The use of `GRAPH *` and `GRAPH *, ...` causes all incoming graphs to be renamed to fresh system generated graph names
329
+
* All return item expressions are expected to reference an imagined set of input variables from the previous query
330
+
* All such referenced variables must be declared or aliased explicitly by another return item unless the query input declaration starts with `WITH *` or `WITH *,`
331
+
* If the input query context provides additional, undeclared variables or graphs, those inputs are to be silently discarded by query composition or execution
289
332
290
-
If the input query context provides additional variables or graphs, those inputs are to be silently discarded by query composition or execution.
333
+
A query that does not start with a query input declaration is assumed to start with `WITH - GRAPHS -`, i.e. to run in isolation and to initially read and write to the default graph.
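A query input declaration along these lines might look as follows. This is only a sketch: the variables `a` and `b` and the graph name `berlin` are hypothetical inputs assumed to be provided by a preceding query.

[source, cypher]
----
// declares two tabular inputs and one graph input; any other incoming
// variables or graphs would be silently discarded
WITH a, b GRAPHS berlin
MATCH (a)-[r:FRIEND]->(b)
RETURN count(r) AS num_friends GRAPHS berlin
----

Without the leading `WITH`, the same query would be treated as if it started with `WITH - GRAPHS -` and would receive none of these inputs.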
334
+
335
+
== Grammar
336
+
337
+
Proposed syntax changes
338
+
[source, ebnf]
339
+
----
340
+
// TODO
341
+
----
291
342
292
343
== Examples
293
344
@@ -327,7 +378,7 @@ INTO NEW GRAPH berlin
327
378
CREATE (a)-[:FRIEND]->(b) WHERE c.name = "Berlin"
328
379
INTO NEW GRAPH santiago
329
380
CREATE (a)-[:FRIEND]->(b) WHERE c.name = "Santiago"
330
-
FROM DEFAULT GRAPH
381
+
FROM -
331
382
RETURN c.name AS city, count(r) AS num_friends GRAPHS berlin, santiago