Commit 7f258be

committed
More examples
1 parent cbb52a8 commit 7f258be

1 file changed

+70
-186
lines changed

cip/1.accepted/CIP2017-06-18-multiple-graphs.adoc

+70-186
@@ -720,7 +720,76 @@ This is the final step of the entire data integration pipeline, we return this g
720720

721721
image::opencypher-PersonCityCriminalEvents-graph.jpg[Graph,700,550]
722722

723-
// ._The full data integration query pipeline is given by_:
723+
724+
[[data-aggregation-example]]
725+
=== Using a pipeline to perform aggregations and return tabular data and graphs
726+
727+
This example shows how to aggregate detailed sales data within a graph -- in effect, performing a 'roll-up' -- in order to obtain a high-level summarized view of the data, as a graph.
728+
The summarized graph may be used to draw further high-level reports, but may also be used to undertake 'drill-down' actions by probing into the graph to extract more detailed information.
729+
730+
Assume we have the graph *SalesDetail*, representing the sale of products in stores across various regions:
731+
732+
image::opencypher-SalesDetail-graph.jpg[Graph,800,700]
733+
734+
This models the following entities:
735+
736+
* Regions may have many stores.
737+
* Stores:
738+
** A store is identified by a unique `code`.
739+
** A store is contained in exactly one region.
740+
** A store may have multiple orders.
741+
* Products:
742+
** A product is identified by a unique `code`.
743+
** A product has an `RRP` property (Recommended Retail Price).
744+
** A product may appear in one or more orders as a product _item_.
745+
* Sales orders:
746+
** An order is identified by a unique order number, given by `num`.
747+
** The `YYYYMM` property represents the year and month portion of the date of the order.
748+
** An order is associated with exactly one store and contains one or more product items, representing the fact that the product item was sold in the store and is a part of the order.
749+
** The relationship between an order and a product contains the following properties:
750+
*** `soldPrice`: the price at which the product item was actually sold (usually lower than the product's RRP).
751+
*** `numItemsSold`: the number of the actual product items sold in the order.
752+
753+
The following pipeline will create a summarized graph view of this data and return it.
754+
755+
[source, cypher]
756+
----
757+
[ 0] FROM SalesDetail
758+
[ 1] MATCH (p:Product)-[r:IN]->(:Order)<-[:HAS]-(s:Store)
759+
[ 2] WITH p, s, sum(r.soldPrice * r.numItemsSold) AS storeProductTotal
760+
[ 3] CONSTRUCT ON GRAPH CLONE p, s
761+
[ 4] CREATE (p)-[:SUMMARY {totalSales: storeProductTotal}]->(s)
762+
763+
[ 5] WITH p, sum(storeProductTotal) AS productTotal
764+
[ 6] CONSTRUCT ON GRAPH CLONE p
765+
[ 7] CREATE (p)-[:SUMMARY]->(:SUMMARY {totalSales: productTotal})
766+
767+
[ 8] WITH p
768+
[ 9] MATCH (p)-[r:SUMMARY]-(s:Store)-[:IN]-(reg:Region)
769+
[10] WITH s, reg, sum(r.totalSales) AS storeTotal
770+
[11] CONSTRUCT ON GRAPH CLONE s, reg
771+
[12] CREATE (s)-[:SUMMARY]->({totalSales: storeTotal})
772+
[13] WITH reg, sum(storeTotal) AS regionTotal
773+
[14] CREATE (reg)-[:SUMMARY]->({totalSales: regionTotal})
774+
775+
[15] WITH reg
776+
[16] MATCH (reg)<-[:IN]-(:Store)<-[summary:SUMMARY]-(p:Product)
777+
[17] WITH reg, p, sum(summary.totalSales) AS regionProductTotal
778+
[18] CONSTRUCT ON GRAPH CLONE reg, p
779+
[19] CREATE (reg)-[:SUMMARY {totalSales: regionProductTotal}]->(p)
780+
[20] RETURN GRAPH
781+
----
782+
783+
784+
We start by specifying that we are working on *SalesDetail* [0], and then match every product item in every order, together with the store in which the order was placed [1].
785+
The next step is to sum all sales, grouped by product and store [2]. We then start building the summary graph by cloning the product and store nodes [3] and adding a summary relationship directly between the product and the store [4], bypassing the order node.
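
The grouping performed in step [2] can be sketched outside Cypher. The following Python snippet is only an illustration, not part of the proposal: the `order_items` rows and the helper name `store_product_totals` are hypothetical, and each row stands for one product-to-order relationship together with the store that holds the order.

```python
from collections import defaultdict

# Hypothetical order items: (product code, store code, soldPrice, numItemsSold).
# Each tuple stands for one (:Product)-[:IN]->(:Order) relationship and the
# store that has the order.
order_items = [
    ("PEN-1", "AC-888", 2.0, 4),
    ("PEN-1", "AC-888", 2.0, 6),
    ("TOY-1", "AC-888", 15.0, 3),
    ("TOY-1", "LK-709", 20.0, 2),
]


def store_product_totals(items):
    """Sum soldPrice * numItemsSold, grouped by (product, store)."""
    totals = defaultdict(float)
    for product, store, sold_price, num_items_sold in items:
        totals[(product, store)] += sold_price * num_items_sold
    return dict(totals)


totals = store_product_totals(order_items)
for (product, store), total in sorted(totals.items()):
    print(product, store, total)
```

Each key of `totals` corresponds to one `SUMMARY` relationship between a product and a store, and the value corresponds to its `totalSales` property.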
786+
787+
Next, we aggregate all sales by product [5], and use this information to construct a graph [6] and attach a summary node holding the product total to the product node [7].
788+
789+
So far, we have been using the matches from the first MATCH [1], but now it is time to drop the incoming driving table [8] and start matching from scratch again [9]. We match the summary relationships that we added in [4] between stores and products, and use them to compute the total sales for each store [10].
790+
791+
// TODO: Finish explaining this example
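
As a rough check on the higher-level roll-ups in steps [5], [10] and [13], the same aggregations can be sketched in Python. This is an illustration under stated assumptions rather than the pipeline itself: the input rows are the store-product totals listed in the earlier tabular version of this example, and the names `roll_up` and `store_region` are hypothetical.

```python
from collections import defaultdict

# Store-product totals taken from the earlier tabular version of this example:
# (regionName, storeCode, productCode, storeProductTotal).
rows = [
    ("APAC", "AC-888", "PEN-1", 20.00),
    ("APAC", "AC-888", "TOY-1", 45.00),
    ("EMEA", "LK-709", "BOOK-2", 10.00),
    ("EMEA", "LK-709", "TOY-1", 40.00),
    ("EMEA", "LK-709", "BOOK-5", 15.00),
    ("EMEA", "WW-531", "BOOK-5", 18.00),
    ("EMEA", "WW-531", "BULB-2", 190.00),
    ("EMEA", "WW-531", "PC-1", 440.00),
]


def roll_up(table, key_index):
    """Sum the last column of each row, grouped by the column at key_index."""
    totals = defaultdict(float)
    for row in table:
        totals[row[key_index]] += row[-1]
    return dict(totals)


# Step [5]: total sales per product.
product_totals = roll_up(rows, 2)

# Step [10]: total sales per store.
store_totals = roll_up(rows, 1)

# Step [13]: total sales per region, summed from the store totals.
store_region = {store: region for region, store, _, _ in rows}
region_totals = defaultdict(float)
for store, total in store_totals.items():
    region_totals[store_region[store]] += total
region_totals = dict(region_totals)

print(product_totals)
print(store_totals)
print(region_totals)
```

The store totals agree with the `totalStoreSales` column of the earlier tabular report (65.00, 65.00 and 648.00).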
792+
724793

725794

726795
//
@@ -796,191 +865,6 @@ image::opencypher-PersonCityCriminalEvents-graph.jpg[Graph,700,550]
796865
//
797866

798867
//
799-
// [[data-aggregation-example]]
800-
// === Using a pipeline to perform aggregations and return tabular data and graphs
801-
//
802-
// This example shows how to aggregate detailed sales data within a graph -- in effect, performing a 'roll-up' -- in order to obtain a high-level summarized view of the data, stored and returned in another graph, as well as returning an even higher-level view as an executive report.
803-
// The summarized graph may be used to draw further high-level reports, but may also be used to undertake 'drill-down' actions by probing into the graph to extract more detailed information.
804-
//
805-
// Assume we have the graph *SalesDetail*, representing the sale of products in stores across various regions:
806-
//
807-
// image::opencypher-SalesDetail-graph.jpg[Graph,800,700]
808-
//
809-
// This models the following entities:
810-
//
811-
// * Regions may have many stores.
812-
// * Stores:
813-
// ** A store is identified by a unique `code`.
814-
// ** A store is contained in exactly one region.
815-
// ** A store may have multiple orders.
816-
// * Products:
817-
// ** A product is identified by a unique `code`.
818-
// ** A product has a `RRP` property (Recommended Retail Price).
819-
// ** A product may appear in one or more orders as a product _item_.
820-
// * Sales orders:
821-
// ** An order is identified by a unique order number, given by `num`.
822-
// ** The `YYYYMM` property represents the year and month portion of the date of the order.
823-
// ** An order is associated with exactly one store and contains one or more product items, representing the fact that the product item was sold in the store and is a part of the order.
824-
// ** The relationship of between an order and a product contains the following properties:
825-
// *** `soldPrice`: the price at which the product item was actually sold (usually lower than the product's RRP).
826-
// *** `numItemsSold`: the number of the actual product items sold in the order.
827-
//
828-
// The following pipeline will create a summarized view of this data, and store it in a new summary graph called *SalesSummary*.
829-
//
830-
// We begin by referencing the *SalesDetail* graph, and matching on all products in all orders for all stores in all regions.
831-
//
832-
// [source, cypher]
833-
// ----
834-
// FROM GRAPH SalesDetail AT ‘graph://...’
835-
// MATCH (p:Product)-[r:IN]->(o:Order)<-[HAS]-(s:Store)-[:IN]->(reg:Region)
836-
// ----
837-
//
838-
// We aggregate the (tabular) data across all orders in order to obtain the total sales amount grouped by the product, store and region, and alias this value as `storeProductTotal`.
839-
// As this tabular data is required to populate the summary graph later on, we pass it further down the pipeline:
840-
//
841-
// [source, cypher]
842-
// ----
843-
// WITH reg.name AS regionName,
844-
// s.code AS storeCode,
845-
// p.code AS productCode,
846-
// sum(r.soldPrice * r.numItemsSold) AS storeProductTotal
847-
// ----
848-
//
849-
// The tabular data consists of the following:
850-
//
851-
// [source, cypher]
852-
// ----
853-
// +------------+-----------+-------------+-------------------+
854-
// | regionName | storeCode | productCode | storeProductTotal |
855-
// +------------+-----------+-------------+-------------------+
856-
// | APAC | AC-888 | PEN-1 | 20.00 |
857-
// | APAC | AC-888 | TOY-1 | 45.00 |
858-
// | EMEA | LK-709 | BOOK-2 | 10.00 |
859-
// | EMEA | LK-709 | TOY-1 | 40.00 |
860-
// | EMEA | LK-709 | BOOK-5 | 15.00 |
861-
// | EMEA | WW-531 | BOOK-5 | 18.00 |
862-
// | EMEA | WW-531 | BULB-2 | 190.00 |
863-
// | EMEA | WW-531 | PC-1 | 440.00 |
864-
// +------------+-----------+-------------+-------------------+
865-
// 8 rows
866-
// ----
867-
//
868-
// Next, we read from the *SalesDetail* graph to get the store, product and region information:
869-
//
870-
// [source, cypher]
871-
// ----
872-
// MATCH (p:Product)-[:IN]->(o:Order)<-[:HAS]-(s:Store)-[:IN]->(r:Region)
873-
// ----
874-
//
875-
// We now create a new graph, *SalesSummary*, containing the summarized view of the sales information across regions, products and stores:
876-
//
877-
// [source, cypher]
878-
// ----
879-
// INTO NEW GRAPH SalesSummary
880-
// MERGE (s:Store {storeCode: s.code})
881-
// MERGE (r:Region {name: r.name})
882-
// MERGE (p:Product {productCode: p.code, RRP: p.RRP})
883-
// MERGE (s)-[:IN]->(r)
884-
// MERGE (p)-[:SOLD_IN]->(s)
885-
//
886-
// // Get the total amount sold for a store
887-
// WITH storeCode, sum(storeProductTotal) AS totalSales
888-
// // Get the total amount sold for a product
889-
// WITH productCode, sum(storeProductTotal) AS soldTotal
890-
//
891-
// // Update all store nodes with the new totalSales property
892-
// MATCH (s:Store)
893-
// SET s.totalSales = totalSales
894-
// WHERE s.code = storeCode
895-
//
896-
// // Update all product nodes with the new soldTotal property
897-
// MATCH (p:Product)
898-
// SET p.soldTotal = soldTotal
899-
// WHERE p.code = productCode
900-
//
901-
// // Update all (:Product)-[SOLD_IN]->(:Store) relationships with the new sold property
902-
// MATCH (p:Product)-[r:SOLD_IN]->(s:Store)
903-
// SET r.sold = storeProductTotal
904-
// WHERE p.code = productCode
905-
// AND s.code = storeCode
906-
// ----
907-
//
908-
// As a final step, the *SalesSummary* graph is returned, along with a high-level summarized tabular view of store sales data.
909-
//
910-
// [source, cypher]
911-
// ----
912-
// RETURN regionName,
913-
// storeCode,
914-
// sum(storeProductTotal) AS totalStoreSales
915-
// GRAPH SalesSummary
916-
// ----
917-
//
918-
// The *SalesSummary* graph is comprised of the following:
919-
//
920-
// image::opencypher-SalesSummary-graph.jpg[Graph,800,700]
921-
//
922-
// The high-level summarized tabular data consists of the following:
923-
//
924-
// [source, cypher]
925-
// ----
926-
// +------------+-----------+-----------------+
927-
// | regionName | storeCode | totalStoreSales |
928-
// +------------+-----------+-----------------+
929-
// | APAC | AC-888 | 65.00 |
930-
// | EMEA | LK-709 | 65.00 |
931-
// | EMEA | WW-531 | 648.00 |
932-
// +------------+-----------+-----------------+
933-
// 3 rows
934-
// ----
935-
//
936-
// We note that the *SalesSummary* graph can be used to generate further high-level sales summaries, such as the total sales of a particular product (shown <<data-aggregation-external-example, here>>), as well as more detailed views.
937-
//
938-
// ._The full aggregation query pipeline is given by_:
939-
// [source, cypher]
940-
// ----
941-
// FROM GRAPH SalesDetail AT ‘graph://...’
942-
// MATCH (p:Product)-[r:IN]->(o:Order)<-[HAS]-(s:Store)-[:IN]->(reg:Region)
943-
//
944-
// WITH reg.name AS regionName,
945-
// s.code AS storeCode,
946-
// p.code AS productCode,
947-
// sum(r.soldPrice * r.numItemsSold) AS storeProductTotal
948-
//
949-
// MATCH (p:Product)-[:IN]->(o:Order)<-[:HAS]-(s:Store)-[:IN]->(r:Region)
950-
//
951-
// INTO NEW GRAPH SalesSummary
952-
// MERGE (s:Store {code: s.code})
953-
// MERGE (r:Region {name: r.name})
954-
// MERGE (p:Product {code: p.code, RRP: p.RRP})
955-
// MERGE (s)-[:IN]->(r)
956-
// MERGE (p)-[:SOLD_IN]->(s)
957-
//
958-
// // Get the total amount sold for a store
959-
// WITH storeCode, sum(storeProductTotal) AS totalSales
960-
// //Get the total amount sold for a product
961-
// WITH productCode, sum(storeProductTotal) AS soldTotal
962-
//
963-
// // Update all store nodes with the new totalSales property
964-
// MATCH (s:Store)
965-
// SET s.totalSales = totalSales
966-
// WHERE s.code = storeCode
967-
//
968-
// // Update all product nodes with the new soldTotal property
969-
// MATCH (p:Product)
970-
// SET p.soldTotal = soldTotal
971-
// WHERE p.code = productCode
972-
//
973-
// // Update all (:Product)-[SOLD_IN]->(:Store) relationships with the new sold property
974-
// MATCH (p:Product)-[r:SOLD_IN]->(s:Store)
975-
// SET r.sold = storeProductTotal
976-
// WHERE p.code = productCode
977-
// AND s.code = storeCode
978-
//
979-
// RETURN regionName,
980-
// storeCode,
981-
// sum(storeProductTotal) AS totalStoreSales
982-
// GRAPH SalesSummary
983-
// ----
984868
//
985869
// [[data-aggregation-external-example]]
986870
// === Using a pipeline in an external execution context
