| 1 | += CIP1017-02-06 Regular Path Patterns |
| 2 | +:numbered: |
| 3 | +:toc: |
| 4 | +:toc-placement: macro |
| 5 | +:source-highlighter: codemirror |
| 6 | + |
| 7 | +*Authors:* Tobias Lindaaker < tobias[email protected]> |
| 8 | + |
| 9 | +toc::[] |
| 10 | + |
| 11 | +== Regular Path Patterns |
| 12 | + |
| 13 | +Above and beyond the types of patterns that can be expressed in Cypher using the normal path syntax, Cypher also supports what amounts to regular expressions over paths. |
| 14 | +This functionality is called Regular Path Patterns. |
| 15 | + |
| 16 | +A Regular Path Pattern is defined as: |
| 17 | + |
| 18 | +• A simple relationship type, or |
| 19 | +• A Regular Path Pattern followed by another Regular Path Pattern, or |
| 20 | +• An alternative between two Regular Path Patterns, or |
| 21 | +• A repetition of a Regular Path Pattern, or |
| 22 | +• A reference to a Defined Path Predicate. |
| 23 | + |
| 24 | +Regular Path Patterns are written similarly to how relationship patterns are written, but enclosed within two slash (`/`) characters instead of brackets (`[]`). |
| 25 | + |
| 26 | +Contrary to Relationship Patterns, Regular Path Patterns do _not_ allow binding a relationship to a variable. |
| 27 | +In order to bind the matching path to a variable, a Path Assignment should be used, by preceding the path with an identifier and an equals sign (`=`). |
| 28 | +This avoids a problem that existed in the past with repetition of relationships (a syntax that was deprecated with the introduction of Regular Path Patterns), where a relationship variable would bind to a list, making it hard to express predicates over the actual relationships. |
| 29 | +Predicates on parts of a Regular Path Pattern are instead expressed through the use of explicitly defined path predicates. |
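| | +
| | +As a minimal sketch of the Path Assignment described above, the query below binds the whole matched path to a variable instead of binding individual relationships. The variable name `p` and the choice of `:KNOWS+` as the path expression are illustrative assumptions, not taken from the proposal itself.
| | +
| | +[source, cypher]
| | +.Binding a matched path through a Path Assignment (illustrative sketch)
| | +----
| | +// p is bound to the entire path matched by the Regular Path Pattern;
| | +// no relationship variable appears between the slashes.
| | +MATCH p = (a)-/:KNOWS+/->(b)
| | +RETURN p
| | +----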
| 30 | + |
| 31 | +=== Syntax |
| 32 | + |
| 33 | +The syntax of Regular Path Patterns fits into the greater Cypher syntax through `PatternElementChain`.
| 34 | + |
| 35 | +---- |
| 36 | +PatternElementChain = (RelationshipPattern | RegularPathPattern), NodePattern ; |
| 37 | +
| 38 | +RegularPathPattern = (LeftArrowHead, Dash, '/', [RegularPathExpression], '/', Dash, RightArrowHead) |
| 39 | + | (LeftArrowHead, Dash, '/', [RegularPathExpression], '/', Dash) |
| 40 | + | (Dash, '/', [RegularPathExpression], '/', Dash, RightArrowHead) |
| 41 | + | (Dash, '/', [RegularPathExpression], '/', Dash) |
| 42 | + ; |
| 43 | +RegularPathExpression = {RegPathOr}- ; |
| 44 | +RegPathOr = RegPathSeq, {'|', RegPathSeq} ; |
| 45 | +RegPathSeq = {RegPathStar}- ; |
| 46 | +RegPathStar = RegPathDirected, [('*', [RangeLiteral]) | '+'] ;
| 47 | +RegPathDirected = ['<'], RegPathBase, ['>'] ; |
| 48 | +RegPathBase = RegPathRelationship |
| 49 | + | RegPathReference |
| 50 | + | '(', RegularPathExpression, ')'
| 51 | + ; |
| 52 | +RegPathRelationship = RelType ; |
| 53 | +RegPathReference = SymbolicName ; |
| 54 | +---- |
| 55 | + |
| 56 | +The `RegPathReference` is a reference to a Defined Path Predicate. |
| 57 | +These are defined using the following syntax: |
| 58 | + |
| 59 | +---- |
| 60 | +DefinedPathPredicate = 'PATH', PathPredicatePrototype, 'IS', Pattern, [Where] ;
| 61 | +PathPredicatePrototype = '(', Variable, ')', RegPathPrototype, '(', Variable, ')' ; |
| 62 | +RegPathPrototype = (LeftArrowHead, Dash, '/', DefinedPathName, '/', Dash) |
| 63 | + | (Dash, '/', DefinedPathName, '/', Dash, RightArrowHead) |
| 64 | + | (Dash, '/', DefinedPathName, '/', Dash) |
| 65 | + ; |
| 66 | +DefinedPathName = SymbolicName ; |
| 67 | +---- |
| 68 | + |
| 69 | +=== Examples |
| 70 | + |
| 71 | +The astute reader of the syntax will have noticed that it is possible to express a Regular Path Pattern with an empty path expression: |
| 72 | + |
| 73 | +[source, cypher] |
| 74 | +---- |
| 75 | +MATCH (a)-//-(b) |
| 76 | +---- |
| 77 | + |
| 78 | +This pattern simply states that `a` and `b` must be the same node, and is thus the same as: |
| 79 | + |
| 80 | +[source, cypher] |
| 81 | +---- |
| 82 | +MATCH (a), (b) WHERE a = b |
| 83 | +---- |
| 84 | + |
| 85 | +The same reader will also have noticed that it is possible to define a pattern containing just a relationship type: |
| 86 | + |
| 87 | +[source, cypher] |
| 88 | +---- |
| 89 | +MATCH (a)-/:KNOWS/->(b) |
| 90 | +---- |
| 91 | + |
| 92 | +That pattern is indeed equivalent to the very similar relationship pattern: |
| 93 | + |
| 94 | +[source, cypher] |
| 95 | +---- |
| 96 | +MATCH (a)-[:KNOWS]->(b) |
| 97 | +---- |
| 98 | + |
| 99 | +The main difference is that the variant with a relationship pattern is able to bind that relationship and express further predicates over it.
| 100 | + |
| 101 | +The Regular Path Patterns start becoming interesting when larger expressions are put together: |
| 102 | + |
| 103 | +[source, cypher] |
| 104 | +.Finding someone loved by someone hated by someone you know, transitively |
| 105 | +---- |
| 106 | +MATCH (you)-/(:KNOWS :HATES)+ :LOVES/->(someone) |
| 107 | +---- |
| 108 | + |
| 109 | +Note the `+` expressing one or more occurrences of the sequence `KNOWS` followed by `HATES`. |
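| | +
| | +The grammar also allows alternatives (`|`) and bounded repetition (`*` followed by a `RangeLiteral`), neither of which appears in the example above. The following is a hedged sketch of how the two might be combined; the relationship type `WORKS_WITH` is hypothetical, and the range form `1..3` assumes that `RangeLiteral` is written as in Cypher's existing variable-length relationship syntax.
| | +
| | +[source, cypher]
| | +.Combining an alternative with a bounded repetition (illustrative sketch)
| | +----
| | +// One to three steps, each either KNOWS or WORKS_WITH (hypothetical type),
| | +// followed by a single LOVES relationship.
| | +MATCH (you)-/(:KNOWS | :WORKS_WITH)*1..3 :LOVES/->(someone)
| | +----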
| 110 | + |
| 111 | +The direction of each relationship is governed by the overall direction of the Regular Path Pattern. |
| 112 | +It is however possible to explicitly define the direction for a particular part of the pattern. |
| 113 | +This is done by either prefixing that part with `<` for a right-to-left direction or suffixing it with `>` for a left-to-right direction. |
| 114 | +It is also possible to both prefix the part with `<` and suffix it with `>`, which gives that part the interpretation of being undirected.
| 115 | + |
| 116 | +[source, cypher] |
| 117 | +.Specifying the direction for different parts of the pattern |
| 118 | +---- |
| 119 | +MATCH (you)-/(:KNOWS <:HATES)+ :LOVES/->(someone) |
| 120 | +---- |
| 121 | + |
| 122 | +In the example above we say that the `HATES` relationships should have the opposite direction to the other relationships in the path. |
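| | +
| | +The undirected form can be sketched in the same way. The variant below is an illustration derived from the `RegPathDirected` grammar rule rather than an example from the proposal: marking `HATES` with both `<` and `>` lets those relationships be traversed in either direction, while `KNOWS` and `LOVES` still follow the overall left-to-right direction of the pattern.
| | +
| | +[source, cypher]
| | +.Marking one part of the pattern as undirected (illustrative sketch)
| | +----
| | +// <:HATES> matches a HATES relationship regardless of its direction.
| | +MATCH (you)-/(:KNOWS <:HATES>)+ :LOVES/->(someone)
| | +----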
| 123 | + |
| 124 | +Through the use of Defined Path Predicates we can express even more predicates over a path: |
| 125 | + |
| 126 | +[source, cypher] |
| 127 | +.Find a chain of unreciprocated lovers |
| 128 | +---- |
| 129 | +MATCH (you)-/unreciprocated_love*/->(someone) |
| 130 | +PATH (a)-/unreciprocated_love/->(b) IS |
| 131 | + (a)-[:LOVES]->(b) |
| 132 | + WHERE NOT EXISTS { (b)-[:LOVES]->(a) } |
| 133 | +---- |
| 134 | + |
| 135 | +Note that no colon is used when referencing the Defined Path Predicate; in Regular Path Patterns the colon is used only for referencing actual relationship types.
| 136 | + |
| 137 | +Sometimes it will be interesting to express a predicate on a node in a Regular Path Pattern. |
| 138 | +This can be achieved by using a Defined Path Predicate where the nodes on both ends are the same: |
| 139 | + |
| 140 | +[source, cypher] |
| 141 | +.Find friends of friends that are not haters |
| 142 | +---- |
| 143 | +MATCH (you)-/:KNOWS not_a_hater :KNOWS/-(friendly_friend_of_friend) |
| 144 | +PATH (x)-/not_a_hater/-(x) IS (x) |
| 145 | + WHERE NOT EXISTS { (x)-[:HATES]->() } |
| 146 | +---- |
| 147 | + |
| 148 | +In the case of a Defined Path Predicate where both nodes are the same, the direction of the predicate is irrelevant. |
| 149 | +In general, however, the direction of a Defined Path Predicate is important: it is used to map the pattern in the predicate onto the Regular Path Patterns that reference it.
| 150 | +The direction of a Defined Path Predicate may be omitted only when the defined predicate is symmetric, that is, when it holds from one end node to the other exactly when it holds in the opposite direction.
| 151 | +This is obviously the case when both nodes are the same, but it would also be the case when the internal pattern is symmetrical, such as in the following example: |
| 152 | + |
| 153 | +[source, cypher] |
| 154 | +.Find chains of co-authorship |
| 155 | +---- |
| 156 | +MATCH (you)-/co_author*/-(someone) |
| 157 | +PATH (a)-/co_author/-(b) IS |
| 158 | + (a)-[:AUTHORED]->(:Book)<-[:AUTHORED]-(b) |
| 159 | + WHERE a <> b |
| 160 | +---- |