cip/1.accepted/CIP2017-06-18-multiple-graphs.adoc (+212 -1)
@@ -349,6 +349,9 @@ Proposed syntax changes

== Examples

The following examples are intended to show how multiple graphs may be used, and focus on syntax.
We show a fully worked-through example <<complete-example, here>>, describing and illustrating every step of the pipeline in detail.

=== A template for a multiple graph pipeline
[source, cypher]
----
@@ -446,7 +449,7 @@ INTO NEW GRAPH rollup {
RETURN GRAPHS rollup
----

=== A more complex pipeline: using and persisting multiple graphs

[source, cypher]
----
@@ -486,6 +489,214 @@ INTO NEW GRAPH swedish_triangles {
RETURN count(p) AS num_triangles GRAPHS swedish_triangles, sweden_people, german_people
----

[[complete-example]]
=== A complete example illustrating a data integration scenario

Assume we have two graphs, *ActorsFilmsCities* and *Events*, each of which is stored in a separate location.
This example shows how these two graphs can be integrated into a single graph.

The *ActorsFilmsCities* graph models actors and people fulfilling other roles in the film industry; the films in which they acted, which they directed, or for which they wrote the soundtrack; the cities in which they were born; and their relationships to family members and colleagues.

Each node is labelled and contains one or two properties (where `YOB` stands for 'year of birth'), and each relationship of type `ACTED_IN` has a `charactername` property indicating the name of the character the relevant `Actor` played in the `Film`.
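
As a rough illustration of this data model, the following sketch creates one actor, one film and one city using standard single-graph Cypher.
The `name` and `title` property keys, the placeholder values, and the assumption that an actor carries both the `Person` and `Actor` labels are illustrative only; they are not taken from the *ActorsFilmsCities* graph itself.

[source, cypher]
----
// Hypothetical sample data: only the labels, relationship types,
// `YOB` and `charactername` are named in the description above.
CREATE (a:Person:Actor {name: 'Example Actor', YOB: 1950}),
       (f:Film {title: 'Example Film'}),
       (c:City {name: 'Example City'}),
       (a)-[:ACTED_IN {charactername: 'Example Character'}]->(f),
       (a)-[:BORN_IN]->(c)
----
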

The other graph, *Events*, models information on events.
Each event is linked to an event type by an `IS_A` relationship, to a year by an `IN_YEAR` relationship, and to a city by an `IN_CITY` relationship.
For example, the _Battle of Britain_ event is classified as a _War Event_, occurred in the year _1940_, and took place in _London_.

In contrast to the *ActorsFilmsCities* graph, *Events* contains no labels on any node, no properties on any relationship, and only a single `value` property on each node.
*Events* can be considered to be a snapshot of data from an RDF graph, in the sense that every node has one and only one value; i.e. in contrast to a property graph, an RDF graph has properties on neither nodes nor relationships.
(For easier visibility, the cities and city-related relationships, the event types and event-type relationships, and the years and year-related relationships are each shown in their own colour.)

image::opencypher-Events-graph.jpg[Graph,800,800]
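
To make the shape of *Events* concrete, the _Battle of Britain_ example could be written out in standard single-graph Cypher roughly as follows.
This is a sketch only: the description does not say whether the year is stored as a number or as a string, so a number is assumed here.

[source, cypher]
----
// Unlabelled nodes, each with a single `value` property;
// relationships carry no properties.
CREATE (e  {value: 'Battle of Britain'}),
       (et {value: 'War Event'}),
       (y  {value: 1940}),   // assumption: numeric year
       (c  {value: 'London'}),
       (e)-[:IS_A]->(et),
       (e)-[:IN_YEAR]->(y),
       (e)-[:IN_CITY]->(c)
----
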

The aims of the data integration exercise are twofold:

* Create and persist to disk (for future use) a new graph, *PersonCityEvents*, containing an amalgamation of data from *ActorsFilmsCities* and *Events*.
*PersonCityEvents* must contain all the event information from *Events*, and only `Person` nodes connected to `City` nodes from *ActorsFilmsCities*.

* Create and return a temporary graph, *Temp-PersonCityCrimes*.
*Temp-PersonCityCrimes* must contain a subset of the data from *PersonCityEvents*, consisting only of the criminal events, their associated `City` nodes, and `Person` nodes associated with the `City` nodes.

==== Step 1:

The first action to take in our data integration exercise is to set the source graph to *ActorsFilmsCities*, for which we need to provide the physical address:

[source, cypher]
----
FROM GRAPH ActorsFilmsCities AT 'graph://actors_films_cities...'
----

Next, match all `Person` nodes that have a `BORN_IN` relationship to a `City`:

[source, cypher]
----
MATCH (p:Person)-[:BORN_IN]->(c:City)
----

Create the new graph *PersonCityEvents*, persist it to _some-location_, and set it as the target graph:

[source, cypher]
----
INTO NEW GRAPH PersonCityEvents AT 'some-location'
----

Write the subgraph induced by the `MATCH` clause above into *PersonCityEvents*:

[source, cypher]
----
CREATE XXXX TODO
----

Putting all these statements together, we get:

_Query sequence for Step 1_:
[source, cypher]
----
FROM GRAPH ActorsFilmsCities AT 'graph://actors_films_cities...'
MATCH (p:Person)-[:BORN_IN]->(c:City)
INTO NEW GRAPH PersonCityEvents AT 'some-location' {
The next stage in the pipeline is to add the events information from *Events* to *PersonCityEvents*.

First, the source graph is set to *Events*, for which we need to provide the physical address:

[source, cypher]
----
FROM GRAPH Events AT 'graph://events...'
----

At this point, the *Events* graph is in scope.

All the events information -- the event itself, its type, the year in which it occurred, and the city in which it took place -- is matched:

[source, cypher]
----
MATCH (c)<-[:IN_CITY]-(e)-[:IN_YEAR]->(y),
      (e)-[:IS_A]->(et)
----

The target graph is set to the *PersonCityEvents* graph (created earlier):

[source, cypher]
----
INTO GRAPH PersonCityEvents
----

Using the results from the `MATCH` clause, create a subgraph with clearer semantics by transforming the events information into a less verbose form that makes greater use of node-level properties.
Write the subgraph to *PersonCityEvents*.

[source, cypher]
----
CREATE XXXX TODO
----
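
The `CREATE` step above is still to be specified.
Purely to illustrate the kind of transformation intended, the sketch below uses standard single-graph Cypher (not the multiple graph syntax proposed in this CIP) to turn the value-only pattern for criminal events into a `CriminalEvent` node connected by `HAPPENED_IN` to a `City` node, which is the shape assumed by the `MATCH` clause in Step 3.
The event-type string `'Criminal Event'` and the property keys `name`, `title` and `year` are assumptions made for this sketch.

[source, cypher]
----
// Sketch only: runs against a single graph and does not select a target graph.
MATCH (c)<-[:IN_CITY]-(e)-[:IN_YEAR]->(y),
      (e)-[:IS_A]->(et)
WHERE et.value = 'Criminal Event'      // assumed spelling of the event-type value
MERGE (city:City {name: c.value})      // reuse an existing City node if there is one
CREATE (:CriminalEvent {title: e.value, year: y.value})-[:HAPPENED_IN]->(city)
----

The event type becomes a label and the year becomes a property, so each event needs one node and one relationship instead of four nodes and three relationships.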

Putting all these statements together, we get:

_Query sequence for Step 2_:
[source, cypher]
----
FROM GRAPH Events AT 'graph://events...'
MATCH (c)<-[:IN_CITY]-(e)-[:IN_YEAR]->(y),
      (e)-[:IS_A]->(et)
INTO GRAPH PersonCityEvents {
  CREATE XXX TODO
}
//Discard all tabular data and cardinality
WITH GRAPHS *
----

*PersonCityEvents* now contains the following data:

The last step in the data integration pipeline is the creation of a new, temporary graph, *Temp-PersonCityCrimes*, which is to be populated with the subgraph of all the criminal events and associated nodes from *PersonCityEvents*.

Set *PersonCityEvents* to be in scope:

[source, cypher]
----
FROM GRAPH PersonCityEvents
----

Next, obtain the subgraph of all criminal events -- i.e. nodes labelled with `CriminalEvent` -- together with their associated `City` nodes and the `Person` nodes associated with those `City` nodes:

[source, cypher]
----
MATCH (ce:CriminalEvent)-[:HAPPENED_IN]->(c:City)<-[:BORN_IN]-(p:Person)
----

Create the new, temporary graph *Temp-PersonCityCrimes*, and set it as the target graph:

[source, cypher]
----
INTO NEW GRAPH Temp-PersonCityCrimes
----

Write the subgraph acquired earlier to *Temp-PersonCityCrimes*.

[source, cypher]
----
CREATE XXXX TODO
----

Putting all these statements together, we get:

_Query sequence for Step 3_:
[source, cypher]
----
FROM GRAPH PersonCityEvents
MATCH (ce:CriminalEvent)-[:HAPPENED_IN]->(c:City)<-[:BORN_IN]-(p:Person)
INTO NEW GRAPH Temp-PersonCityCrimes {
  CREATE XXX TODO
}
----

As the final step of the entire data integration pipeline, return *Temp-PersonCityCrimes*, which comprises the following data: