Commit 2498907

Minor changes and clarifications
1 parent ce09cf5 commit 2498907

1 file changed

+102
-51
lines changed

cip/CIP2017-06-18-multiple-graphs.adoc

Original file line number | Diff line number | Diff line change
@@ -15,12 +15,7 @@ This CIP proposes to extend Cypher with support for the construction, transforma
1515
toc::[]
1616

1717
```
18-
TODO:
19-
20-
* Parameter handling
21-
* Graph name syntax
22-
* Precise update semantics
23-
* Entity identity
18+
* Composition Semantics
2419
```
2520

2621
== Motivation
@@ -68,12 +63,24 @@ An entity is considered to be deleted if it is no longer part of any graph.
6863

6964
=== Graph Addressing
7065

71-
Graphs do not expose an identity like nodes or relationships. They may however be made addressable through other means by a conforming implementation (e.g. through exposing the graph under a _Graph URL_).
66+
Graphs do not expose an identity like nodes or relationships do.
67+
68+
Graphs may be made addressable through other means by a conforming implementation (e.g. through exposing the graph under a _graph URL_ for referencing and loading it).
69+
The details regarding the format and choice of graph URLs are outside the scope of this proposal.
70+
71+
A graph is considered to have been deleted if it is no longer registered under a graph URL and no other reference to it is retained (e.g. from a running query).
72+
73+
=== Entity Identity
7274

73-
The details of such a mechanism are out of scope of this proposal.
75+
In the single property graph model, nodes and relationships are commonly identified by a single integer id.
76+
This model was originally not designed for sharing entities between many different graphs while ensuring that entity ids are unique.
7477

75-
However, a graph is considered to have been deleted if it is no longer registered under a Graph URL and no other reference to it is retained (e.g. from a running query).
78+
In the multiple property graphs model, entities are additionally implicitly associated with a _graph space_ that makes it possible to distinguish between entities with the same original id from different sources (e.g. different databases or even snapshots of the same database).
7679

80+
In the multiple property graphs model, no graph may contain two entities from the same graph space that have the same original id.
81+
82+
Graph spaces may be made identifiable by a conforming implementation by assigning a _graph URI_ to them.
83+
The details regarding the format and choice of graph URIs are outside the scope of this proposal.
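The role of the graph space can be illustrated with a minimal Python sketch. The `EntityId` type and the `space://...` strings are hypothetical and exist only for illustration; the CIP leaves the concrete representation of graph spaces and graph URIs to implementations.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EntityId:
    """Hypothetical qualified identity: a graph space plus the original id."""
    graph_space: str   # stand-in for a graph URI; its format is out of scope
    original_id: int


# Two entities with the same original id but from different graph spaces.
a1 = EntityId("space://db-a", 1)
b1 = EntityId("space://db-b", 1)

# Modelling a graph as a set keyed by the qualified identity means that it
# can never hold two entities from the same graph space with the same id.
graph = {a1, b1, EntityId("space://db-a", 1)}
```

Here `graph` ends up with two members: the repeated `("space://db-a", 1)` identity collapses into one entity, while the entity from the other graph space stays distinct.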
7784

7885
== Background: Single Graph Execution Model
7986

@@ -109,8 +116,10 @@ This CIP proposes to redefine the *execution context* to be
109116
This CIP proposes to redefine the *query context* to be
110117

111118
* a set of named graphs from the *execution context*
112-
* an optional information that indicates which of these named graphs is the current *source graph*
113-
* an optional information that indicates which of these named graphs is the current *target graph*
119+
* a special graph drawn from the execution context that is called the *default source graph*
120+
* a special graph drawn from the execution context that is called the *default target graph*
121+
* optional information that indicates which of these named graphs, if any, is the *returned source graph*
122+
* optional information that indicates which of these named graphs, if any, is the *returned target graph*
114123
* optional *tabular data*, i.e. a potentially ordered bag of records, each having the same fixed set of fields
115124

116125
These redefinitions constitute the multiple graphs execution model. A parameterized Cypher query under this model can _also_ be described as executing within (and operating on) a given execution context and an initial query context and finally returning the query context produced as output for the top-most `RETURN` clause.
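As a rough aid to reading the redefinition above, the query context can be pictured as a small record type. This is only a sketch: the class name, the field names, and the `provided_source` helper are assumptions made here for illustration, and the fallback to the default source graph reflects one reading of the rules for partial query contexts rather than normative text.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional

Graph = Dict[str, Any]  # placeholder; a real implementation would use its own graph type


@dataclass
class QueryContext:
    """Hypothetical model of the query context components listed above."""
    named_graphs: Dict[str, Graph] = field(default_factory=dict)
    default_source: Graph = field(default_factory=dict)
    default_target: Graph = field(default_factory=dict)
    returned_source: Optional[str] = None   # name of a named graph, if any
    returned_target: Optional[str] = None   # name of a named graph, if any
    records: Optional[List[Dict[str, Any]]] = None  # optional tabular data

    def provided_source(self) -> Graph:
        # Assumption: with no named source graph, the default source graph is used.
        if self.returned_source is None:
            return self.default_source
        return self.named_graphs[self.returned_source]


ctx = QueryContext(
    named_graphs={"berlin": {"name": "berlin"}},
    default_source={"name": "default"},
)
```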
@@ -140,7 +149,7 @@ A query `Q1` whose output signature is an acceptable (in terms of provided bindi
140149

141150
This homogeneous query composition is enabled by using a uniform query context that is passed between clauses.
142151

143-
Note: The currently drafted subquery CIP proposes a language addition (e.g. `THEN`) for expressing this kind of query composition directly.
152+
Note: The currently drafted subquery CIP proposes a language addition (e.g. `THEN`) for expressing this kind of query composition directly. In terms of this CIP, `THEN` is simply syntactic sugar for `WITH * GRAPHS *`.
144153

145154
=== Query combinators
146155

@@ -188,27 +197,43 @@ This CIP proposes the following kinds of graph specifiers:
188197

189198
* `NEW GRAPH [<new-graph-name>] [AT <graph-url>]`: Reference to a newly created, empty graph that is to be bound as `<new-graph-name>` and may potentially overwrite any pre-existing graph at the provided `<graph-url>`
190199
* `GRAPH [<new-graph-name>] AT <graph-url>`: Reference to the graph at the given `<graph-url>` that is to be bound as `<new-graph-name>`
191-
* `GRAPH <graph-name> [AS <new-graph-name>]`: Reference to an already bound named graph
192-
* `SOURCE GRAPH [AS <new-graph-name>]`: Reference to the currently _provided source graph_, optionally to be bound as `<new-graph-name>`
193-
* `TARGET GRAPH [AS <new-graph-name>]`: Reference to the currently _provided target graph_, optionally to be bound as `<new-graph-name>`
200+
* `[GRAPH] <graph-name> [AS <new-graph-name>]`: Reference to an already bound named graph
201+
* `COPY [GRAPH] <graph-name> [AS <new-graph-name>]`: Reference to a copy of an already bound named graph
202+
* `SOURCE GRAPH [<new-graph-name>]`: Reference to the currently _provided source graph_, optionally to be bound as `<new-graph-name>`
203+
* `TARGET GRAPH [<new-graph-name>]`: Reference to the currently _provided target graph_, optionally to be bound as `<new-graph-name>`
194204

195205
If a graph specifier is not referencing an already bound named graph and does not specify a `<new-graph-name>`, it is bound to a fresh system generated name.
196206
The details of this are left to implementations.
197207

198208
It is an error to use a `<graph-specifier>` in a context where its introduced `<new-graph-name>` is already bound.
199209

200-
=== Changing back to the default graph
210+
==== Graph names
211+
212+
Graph names use the same syntax as existing variable names.
213+
214+
It is an error to use the same name for both a regular variable and a graph.
201215

202-
Additionally, this CIP proposes new syntax for changing the source and the target graph of the current query back to the the default graph provided by the outer execution context:
216+
==== Graph URLs
217+
218+
The exact shape and form of graph URLs lies outside the scope of this CIP.
219+
220+
This CIP however proposes that a `<graph-url>` must always be given as either a string literal or a query parameter.
221+
222+
This allows parameterization of queries by controlling which graphs from which graph URLs they should use.
223+
224+
=== Changing back to no graph
225+
226+
Additionally, this CIP proposes new syntax for discarding the source and the target graph of the current query:
203227

204228
[source, cypher]
205229
----
230+
FROM -
231+
INTO -
206232
----
207233

208-
`DEFAULT GRAPH` is not a graph specifier; rather this syntax is a special form for discarding the current source and target graph such that the provided source and target graph are again chosen to be the default graph as specified for partial query contexts.
209-
210-
In consequence, both `FROM DEFAULT GRAPH` and `INTO DEFAULT GRAPH` without an explicitly given `<new-graph-name>` will not bind the default graph to a generated fresh name.
234+
`-` is not a graph specifier; rather this syntax is a special form for discarding the current source and target graph such that the provided source and target graph are again chosen to be the default graph as specified for partial query contexts.
211235

236+
In consequence, both `FROM -` and `INTO -` will not bind the default graph to a generated fresh name.
212237
This is different from `<graph-specifier>` semantics that will ensure that referenced graphs are always bound to a name.
213238

214239
=== Returning, aliasing, and selecting graphs
@@ -218,33 +243,35 @@ The newly proposed syntax is:
218243

219244
[source, cypher]
220245
----
221-
WITH [ < return-items > ] [ GRAPHS < graph-return-items > ]
222-
RETURN [ < return-items > ] [ GRAPHS < graph-return-items > ]
246+
WITH [ < return-items > ] [ [ INPUT ] GRAPHS < graph-return-items > ]
247+
RETURN [ < return-items > ] [ [ INPUT ] GRAPHS < graph-return-items > ]
223248
----
224249

225250
This CIP proposes the following kinds of `<graph-return-items>`:
226251

227-
* `<graph-item-list`: A comma separated list of `<graph-return-item>` (defined below) that are to be passed on
252+
* `<graph-specifier-list>`: A comma separated list of `<graph-specifier>` that are to be passed on
228253
* `*`: All named graphs are to be passed on
229-
* `*, <graph-item-list>`: All named graphs are to be passed on together with any additional named graphs that are newly bound in `<graph-item-list>`
254+
* `*, <graph-specifier-list>`: All named graphs are to be passed on together with any additional named graphs that are newly bound in `<graph-specifier-list>`
230255
* `-`: No named graphs are to be passed on
231256

232-
The order of named graphs inherently given by `<graph-return-items` is semantically insignificant.
257+
The order of named graphs inherently given by `<graph-return-items>` is semantically insignificant.
233258
However it is recommended that conforming implementations preserve this order at least in programmatic output operations (e.g. a textual display of the list of returned graphs).
234259
This in essence mirrors the semantics for tabular data returned by Cypher.
235260

236-
This CIP proposes the introduction of the following kinds of graph return items that may be included in a `<graph-item-list>`:
261+
Both `WITH ... GRAPHS ...` and `RETURN ... GRAPHS ...` will pass on (or return respectively) exactly the set of described named graphs.
262+
To simplify passing on available graphs it is proposed by this CIP that regular `WITH <return-items>` is taken to be syntactic sugar for `WITH <return-items> GRAPHS -` and that regular `RETURN <return-items>` is taken to be syntactic sugar for `RETURN <return-items> GRAPHS -`.
237263

238-
* `<graph-specifier>`: Any graph that is described by a `<graph-specifier>` may be passed on under the provided `<new-graph-name>` (unless the given graph is an un-aliased already existing graph, it which case it's passed on with it's existing name)
239-
* `<graph-name> [AS <new-graph-name>], ...`: Syntactic sugar for `GRAPH <graph-name> [AS <new-graph-name>]`
264+
To even further simplify, it is additionally proposed that `WITH|RETURN <return-items> INPUT GRAPHS <graph-return-items>` is to be syntactic sugar for `WITH|RETURN <return-items> GRAPHS <graph-return-items>, SOURCE GRAPH, TARGET GRAPH`.
265+
However if `<graph-return-items>` already passes on a reference for the `SOURCE GRAPH`, no additional reference for it is added and if `<graph-return-items>` already passes on a reference for the `TARGET GRAPH`, no additional reference for it is added.
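The `INPUT GRAPHS` expansion described above can be sketched in Python. This is a simplification under stated assumptions: graph return items are modelled as plain strings, and a reference to the source or target graph is detected only by its `SOURCE GRAPH` or `TARGET GRAPH` prefix, whereas a real implementation would resolve references semantically. The function name `expand_input_graphs` is hypothetical.

```python
from typing import List


def expand_input_graphs(graph_return_items: List[str]) -> List[str]:
    """Desugar `INPUT GRAPHS <items>` into `GRAPHS <items>, SOURCE GRAPH, TARGET GRAPH`.

    A reference is only appended if the items do not already pass one on.
    """
    expanded = list(graph_return_items)
    for implicit in ("SOURCE GRAPH", "TARGET GRAPH"):
        if not any(item.startswith(implicit) for item in expanded):
            expanded.append(implicit)
    return expanded


# `src` is a made-up graph name used only for this illustration.
items = expand_input_graphs(["berlin", "SOURCE GRAPH src"])
```

In this example only `TARGET GRAPH` is appended, because the items already pass on a reference for the source graph.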
240266

241-
Both `WITH` and `RETURN` will pass on (or return respectively) exactly the set of described named graphs.
242267
If the current named source graph (or the current named target graph) is not passed on, it is discarded. Due to the rules regarding partial query contexts, the provided source graph (or target graph, respectively) is then again chosen to be the default graph of the outer execution context.
243268

269+
Note: `WITH <return-items> GRAPHS *` may be used to pass through the initial query context without having to alias source and target graphs explicitly.
270+
244271
=== Discarding available tabular data
245272

246-
It is additionally proposed that both `WITH GRAPHS <graph-return-items>` and `RETURN GRAPHS <graph-return-items>` are
247-
special forms for discarding all tabular data such that the provided tabular input for the following clause (or query respectively) would again be the provided single record without any fields as specified by the rules for partial query contexts.
273+
It is additionally proposed that both `WITH GRAPHS <graph-return-items>` and `RETURN GRAPHS <graph-return-items>` are syntactic sugar for `WITH - GRAPHS <graph-return-items>` (and `RETURN - GRAPHS <graph-return-items>` respectively).
274+
These special forms may be used for discarding all tabular data such that the provided tabular input for the following clause (or query respectively) would again be the provided single record without any fields as specified by the rules for partial query contexts.
248275

249276
Note: This syntax may be used to indicate when the gradual construction of a named graph is finished since neither fields nor the cardinality of tabular data is preserved after this point.
250277

@@ -259,35 +286,59 @@ The proposed syntax is:
259286

260287
[source, cypher]
261288
----
262-
FROM < graph-specifier > | DEFAULT GRAPH [AS < new-graph-name >] { < graph-construction-subquery > }
263-
INTO < graph-specifier > | DEFAULT GRAPH [AS < new-graph-name >] { < graph-construction-subquery > }
289+
FROM < graph-specifier > | '-' { < graph-construction-subquery > }
290+
INTO < graph-specifier > | '-' { < graph-construction-subquery > }
264291
----
265292

266293
A `<graph-construction-subquery>` is an updating subquery (i.e. a sequence of clauses, including update clauses) that may or may not end in `RETURN`.
267294
All variables bound before the nested `FROM` and `INTO` subqueries are made visible to the `<graph-construction-subquery>`.
268295
All variables bound at the end of the `<graph-construction-subquery>` are made visible to the remaining outer query.
269296

270-
These forms have the exact same effect as creating aliases for the current source and target graph, then changing the current source and target graph as specified before executing the given `<graph-construction-subquery>`, and finally restoring the original source and target graphs using the aliases followed by discarding those aliases from the current scope.
297+
These forms have the exact same effect as creating fresh aliases for the current source and target graph, then changing the current source and target graph as specified before executing the given `<graph-construction-subquery>`, and finally restoring the original source and target graphs using the aliases followed by discarding those aliases from the current scope.
298+
299+
=== Updating graphs
300+
301+
This CIP proposes the following update semantics for Cypher with support for multiple graphs.
302+
303+
Entities are always created in and deleted from the currently provided target graph.
304+
305+
Semantically, all effects of an updating clause must be made visible before proceeding with the execution of the next clause.
306+
In other words, a conforming implementation must ensure that a later clause always sees the complete set of updates of a preceding updating clause.
307+
308+
A single update clause may perform multiple conflicting updates on the same node or relationship.
309+
In this situation, the outcome is undefined.
310+
311+
Conflicting updates are considered to be out of scope of this CIP.
312+
313+
For now it is proposed that a conforming implementation must choose one of the following as the final outcome of a conflicting update: the original value, one of the values written, or `NULL`.
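The constraint on conflicting updates can be expressed as a small check. The following Python sketch is illustrative only: the function name is hypothetical and `None` stands in for Cypher's `NULL`.

```python
from typing import Any, Iterable


def is_permitted_outcome(original: Any, written: Iterable[Any], final: Any) -> bool:
    """Return True if `final` is an acceptable result of a conflicting update.

    Acceptable results are the original value, any of the written values,
    or NULL (modelled here as None).
    """
    return final is None or final == original or final in list(written)


# A property starts at 1 and one clause writes both 2 and 3 to it.
ok_written = is_permitted_outcome(1, [2, 3], 3)
ok_null = is_permitted_outcome(1, [2, 3], None)
not_ok = is_permitted_outcome(1, [2, 3], 4)
```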
271314

272315
=== Query signature declarations
273316

274-
Finally this CIP proposed using the `WITH` clause as the initial clause in a query for declaring all query input arguments:
317+
Finally, this CIP proposes using the `WITH` clause as the initial clause in a query for declaring all query inputs:
275318

276319
[source, cypher]
277320
----
278-
WITH [ < return-items > ] [ GRAPHS < graph-return-items > ]
321+
WITH < return-items > [ [ INPUT ] GRAPHS < graph-return-items > ]
322+
WITH [ < return-items > ] [ INPUT ] GRAPHS < graph-return-items >
279323
----
280324

281-
It is proposed that using `WITH` as the initial clause here is to be called a *query input declaration* while the use of `RETURN` as the last clause is to be called a *query output declaration* henceforth.
325+
It is proposed that using `WITH` as the initial clause in a query is to be called a *query input declaration* while the use of `RETURN` as the last clause is to be called a *query output declaration*.
282326

283327
Query input declarations are subject to the following limitations:
284328

285-
* All return items are expected to be over an imagined set of input variables from the previous query
286-
* All such referenced variables must be declared or aliased explicitly by another return item
287-
* The use of `WITH *` and `WITH *, ...` causes all undeclared incoming variables to be renamed to fresh system generated variable names
288-
* The use of `GRAPH *` and `GRAPH *, ...` causes all incoming graphs to be renamed to fresh system generated graph names
329+
* All return item expressions are expected to reference an imagined set of input variables from the previous query
330+
* All such referenced variables must be declared or aliased explicitly by another return item unless the query input declaration starts with `WITH *` or `WITH *,`
331+
* If the input query context provides additional, undeclared variables or graphs, those inputs are to be silently discarded by query composition or execution
289332

290-
If the input query context provides additional variables or graphs, those inputs are to be silently discarded by query composition or execution.
333+
A query that does not start with a query input declaration is assumed to start with `WITH - GRAPHS -`, i.e. to run in isolation and to initially read from and write to the default graph.
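The silent discarding of undeclared inputs can be sketched as a simple filter. This Python sketch assumes explicit declarations only (it does not model `WITH *`), and the function and parameter names are hypothetical.

```python
from typing import Any, Dict, Set, Tuple


def apply_input_declaration(
    declared_vars: Set[str],
    declared_graphs: Set[str],
    incoming_vars: Dict[str, Any],
    incoming_graphs: Dict[str, Any],
) -> Tuple[Dict[str, Any], Dict[str, Any]]:
    """Keep only declared variables and graphs; everything else is dropped."""
    kept_vars = {k: v for k, v in incoming_vars.items() if k in declared_vars}
    kept_graphs = {k: g for k, g in incoming_graphs.items() if k in declared_graphs}
    return kept_vars, kept_graphs


# The query declares variable `a` and graph `berlin`; `b` and `santiago` are discarded.
kept_vars, kept_graphs = apply_input_declaration(
    {"a"},
    {"berlin"},
    {"a": 1, "b": 2},
    {"berlin": "<graph>", "santiago": "<graph>"},
)
```

Passing two empty declaration sets corresponds to the implicit `WITH - GRAPHS -` case, where no incoming variables or named graphs are kept.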
334+
335+
== Grammar
336+
337+
Proposed syntax changes
338+
[source, ebnf]
339+
----
340+
// TODO
341+
----
291342

292343
== Examples
293344

@@ -327,7 +378,7 @@ INTO NEW GRAPH berlin
327378
CREATE (a)-[:FRIEND]->(b) WHERE c.name = "Berlin"
328379
INTO NEW GRAPH santiago
329380
CREATE (a)-[:FRIEND]->(b) WHERE c.name = "Santiago"
330-
FROM DEFAULT GRAPH
381+
FROM -
331382
RETURN c.name AS city, count(r) AS num_friends GRAPHS berlin, santiago
332383
----
333384

@@ -347,7 +398,7 @@ CREATE (a)-[:POSSIBLE_FRIEND]->(c)
347398
WITH GRAPHS *
348399
349400
// Switch context to named graph.
350-
FROM GRAPH recommendations
401+
FROM recommendations
351402
MATCH (a:Person)-[e:POSSIBLE_FRIEND]->(b:Person)
352403
// Return tabular and graph output
353404
RETURN a.name, b.name, count(e) AS cnt
@@ -374,12 +425,12 @@ SET a.country = cn.name
374425
// ... and finally discard all tabular data and cardinality
375426
WITH GRAPHS *
376427
377-
FROM GRAPH sn_updated
428+
FROM sn_updated
378429
MATCH (a:Person)-[e:KNOWS]->(b:Person)
379430
WITH a.country AS a_country, b.country AS b_country, count(a) AS a_cnt, count(b) AS b_cnt, count(e) AS e_cnt
380431
INTO NEW GRAPH rollup {
381432
MERGE (:Persons {country: a_country, cnt: a_cnt})-[:KNOW {cnt: e_cnt}]->(:Persons {country: b_country, cnt: b_cnt})
382-
}
433+
}
383434
// Return final graph output
384435
RETURN GRAPHS rollup
385436
----
@@ -394,29 +445,29 @@ MATCH (a:Person)-[e]->(b:Person),
394445
(a)-[:LIVES_IN]->()-[:IS_LOCATED_IN]->(c:Country {name: 'Sweden'}),
395446
(b)-[:LIVES_IN]->()-[:IS_LOCATED_IN]->(c)
396447
// Create a persistent graph at 'graph://social-network/swe'
397-
INTO GRAPH sweden_people AT './swe' {
448+
INTO NEW GRAPH sweden_people AT './swe' {
398449
// connecting persons that live in the same city in Sweden.
399450
CREATE (a)-[e]->(b)
400-
}
451+
}
401452
// Finally discard all tabular data and cardinality
402453
WITH GRAPHS *
403454
404455
MATCH (a:Person)-[e]->(b:Person),
405456
(a)-[:LIVES_IN]->()-[:IS_LOCATED_IN]->(c:Country {name: 'Germany'}),
406457
(b)-[:LIVES_IN]->()-[:IS_LOCATED_IN]->(c)
407458
// Create a persistent graph at 'graph://social-network/ger'
408-
INTO GRAPH german_people AT './ger' {
459+
INTO NEW GRAPH german_people AT './ger' {
409460
// connecting persons that live in the same city in Germany.
410461
CREATE (a)-[e]->(b)
411462
}
412463
// Finally discard all tabular data and cardinality
413464
WITH GRAPHS *
414465
415466
// Start query on the 'sweden_people' graph
416-
FROM GRAPH sweden_people
467+
FROM sweden_people
417468
MATCH p=(a)--(b)--(c)--(a) WHERE NOT (a)--(c)
418469
// Create a temporary graph 'swedish_triangles'
419-
INTO GRAPH swedish_triangles {
470+
INTO NEW GRAPH swedish_triangles {
420471
ADD p
421472
}
422473
// and return it together with a count of its content
