Commit 560c9cb

update gremlin docs
1 parent 72217ca commit 560c9cb

3 files changed

+61
-52
lines changed

docs/administration-guide/gaffer-deployment/gremlin.md

+51-39
Original file line number | Diff line number | Diff line change
@@ -28,27 +28,21 @@ traversals are spawned. To do this we recommend utilising the provided
2828
which can be configured to use the Gaffer Tinkerpop implementation so that a
2929
endpoint is available for Gremlin queries.
3030

31-
## Connecting to Any Existing Gaffer Graph
31+
## Connecting to An Existing Accumulo Backed Graph
3232

33-
The simplest way to connect Gremlin to an existing Gaffer instance where you may
34-
not know the Store type or Schema would be via a [Proxy Store](../gaffer-stores/proxy-store.md).
35-
Connecting this way means Gremlin communicates via the Gaffer REST API
36-
(similar to [gafferpy](../../user-guide/apis/python-api.md)) meaning there may
37-
be a performance hit for larger queries.
38-
39-
!!! tip
40-
You can also of course connect directly to an existing instance's storage
41-
layer too (e.g. Accumulo store) but this would require a more complex
42-
configuration and knowledge of the Schema.
33+
The recommended way to provide a Gremlin interface to an existing Gaffer
34+
instance is to connect directly to the same [Accumulo store](../gaffer-stores/accumulo-store.md).
35+
Connecting this way means Gremlin communicates in a similar way to the Gaffer
36+
REST API and ensures the fastest performance when using Gremlin (there may still
37+
be a performance hit).
4338

4439
The general connection diagram looks something like the following:
4540

4641
```mermaid
4742
flowchart LR
4843
A(["User"])
4944
--> B("Gremlin Server")
50-
--> C(Gaffer Proxy Store)
51-
--> D(Existing Gaffer Instance)
45+
--> C("Accumulo Store")
5246
```
5347

5448
To establish this connection you can make use of the existing `gaffer-gremlin`
@@ -61,44 +55,58 @@ docker pull gchq/gaffer-gremlin:latest
6155
```
6256

6357
!!! note
64-
You will likely need to configure the default `gaffer-gremlin` image to your
58+
You will need to configure the default `gaffer-gremlin` image to your
6559
environment, please continue reading to learn more.
6660

67-
### Configuring the `gaffer-gremlin` Image
61+
### The `gaffer-gremlin` Image
6862

69-
To use the image you will need to provide two configuration files that are specific
70-
to your environment, they are:
63+
To use the image you will need to provide the normal Gaffer configuration files
64+
for your environment along with a new GafferPop-specific file (similar to the
65+
standard graph config JSON). They are:
7166

72-
- `store.properties` - Gaffer store configuration.
73-
- `gafferpop.properties` - Configuration for the Gaffer Tinkerpop library (Gafferpop).
67+
- `store.properties` - Gaffer store configuration, this should match the
68+
existing graph you are connecting to.
69+
- `elements.json` and `types.json` - The schema files for the graph you wish to
70+
connect to.
71+
- `gafferpop.properties` - Configuration for the Gaffer Tinkerpop library
72+
(Gafferpop).
7473

75-
Once these files are configured you can use bind mounts to make them available when running the image:
74+
Please read the subsections below on how to configure these files. Once these
75+
are configured you can use bind mounts to make them available when running the
76+
image:
7677

7778
```bash
7879
docker run \
7980
--name gaffer-gremlin \
8081
--publish 8182:8182 \
81-
--volume store.properties:conf/gaffer/store.properties \
82-
--volume gafferpop.properties:conf/gafferpop/gafferpop.properties \
82+
--volume store.properties:/opt/gremlin-server/conf/gaffer/store.properties \
83+
--volume schema:/opt/gremlin-server/conf/gaffer/schema \
84+
--volume gafferpop.properties:/opt/gremlin-server/conf/gafferpop/gafferpop.properties \
8385
tinkerpop/gremlin-server:latest gremlin-server.yaml
8486
```
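A bare name such as `store.properties` on the host side of `--volume` may be interpreted by Docker as a named volume rather than a bind mount, so absolute host paths are the safer choice. The following is a minimal sketch, not part of the Gaffer tooling, that assembles the same command with absolute host paths. The function name `build_docker_run` and the assumption that all three items sit in one local directory are hypothetical; the container paths and image name are copied from the command above.

```python
import shlex
from pathlib import Path


def build_docker_run(image, config_dir, server_config="gremlin-server.yaml"):
    """Build a `docker run` argument list that bind-mounts the Gaffer config files."""
    # Resolve to an absolute path so Docker treats each mount as a bind mount.
    base = Path(config_dir).resolve()
    # Container-side locations, copied from the documented command.
    target = "/opt/gremlin-server/conf"
    mounts = [
        (base / "store.properties", f"{target}/gaffer/store.properties"),
        (base / "schema", f"{target}/gaffer/schema"),
        (base / "gafferpop.properties", f"{target}/gafferpop/gafferpop.properties"),
    ]
    args = ["docker", "run", "--name", "gaffer-gremlin", "--publish", "8182:8182"]
    for host_path, container_path in mounts:
        args += ["--volume", f"{host_path}:{container_path}"]
    args += [image, server_config]
    return args


if __name__ == "__main__":
    # Assumes the config files live in the current directory (hypothetical layout).
    print(shlex.join(build_docker_run("tinkerpop/gremlin-server:latest", ".")))
```

Printing the command rather than running it lets you review the resolved paths before starting the container.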
8587

86-
#### Configuring the Proxy Store
88+
### Configuring the Store Properties
89+
90+
Starting with the Store properties, this file should be largely identical to
91+
the store properties used on the main Gaffer deployment. The main purpose
92+
of this file is to ensure the same Accumulo cluster is connected to.
8793

88-
Starting with the Proxy Store, this is identical to running a normal [Proxy Store](../gaffer-stores/proxy-store.md)
89-
and involves simply creating a Gaffer `store.properties` file to use. An example
90-
`store.properties` file is given below that will connect to a graph's REST API
91-
running at `https://localhost:8080/rest`:
94+
An example file is given below, please read the specific [Accumulo store](../gaffer-stores/accumulo-store.md)
95+
documentation for more detail:
9296

9397
```properties
94-
gaffer.store.class=uk.gov.gchq.gaffer.proxystore.ProxyStore
95-
# These should be configured to an existing graph deployment
96-
gaffer.host=localhost
97-
gaffer.port=8080
98-
gaffer.context-root=/rest
98+
gaffer.store.class=uk.gov.gchq.gaffer.accumulostore.AccumuloStore
99+
gaffer.store.properties.class=uk.gov.gchq.gaffer.accumulostore.AccumuloProperties
100+
accumulo.instance=accumulo
101+
accumulo.zookeepers=zookeeper
102+
accumulo.user=root
103+
accumulo.password=secret
104+
# General store config
105+
gaffer.cache.service.class=uk.gov.gchq.gaffer.cache.impl.HashMapCacheService
106+
gaffer.store.job.tracker.enabled=true
99107
```
100108

101-
#### Configuring the Gafferpop Library
109+
### Configuring the Gafferpop Library
102110

103111
The `gafferpop.properties` file is the configuration for the Gaffer
104112
implementation of Tinkerpop (a.k.a Gafferpop). Most of the set up here is for
@@ -109,11 +117,15 @@ would look like the following:
109117
```properties
110118
# The Tinkerpop graph class we should use
111119
gremlin.graph=uk.gov.gchq.gaffer.tinkerpop.GafferPopGraph
112-
gaffer.graphId=graphProxy
120+
gaffer.graphId=existingGraph
113121
gaffer.storeproperties=conf/gaffer/store.properties
114122
gaffer.userId=user01
115123
```
116124

125+
!!! note
126+
It is important the `graphId` here matches the ID of the main graph you
127+
wish to connect to as this controls which Accumulo table is connected to.
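As a quick sanity check on the note above, the sketch below parses a `gafferpop.properties` file and compares its `gaffer.graphId` with the ID of the graph you expect to connect to. It is an illustrative helper only: `parse_properties` and `graph_id_matches` are hypothetical names, and the parser handles only simple `key=value` lines rather than the full Java properties syntax.

```python
def parse_properties(text):
    """Parse simple `key=value` lines, skipping blank lines and `#` comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def graph_id_matches(gafferpop_text, expected_graph_id):
    """Return True if `gaffer.graphId` equals the ID of the existing graph."""
    return parse_properties(gafferpop_text).get("gaffer.graphId") == expected_graph_id


# The example configuration from this section.
EXAMPLE = """\
# The Tinkerpop graph class we should use
gremlin.graph=uk.gov.gchq.gaffer.tinkerpop.GafferPopGraph
gaffer.graphId=existingGraph
gaffer.storeproperties=conf/gaffer/store.properties
gaffer.userId=user01
"""

if __name__ == "__main__":
    print(graph_id_matches(EXAMPLE, "existingGraph"))
```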
128+
117129
Many of these properties in the example above should be self explanatory, a full breakdown of
118130
the available properties is as follows:
119131

@@ -123,11 +135,11 @@ of the available properties is as follows:
123135
| `gaffer.graphId` | The graph ID of the Tinkerpop graph |
124136
| `gaffer.storeproperties` | The path to the store properties file |
125137
| `gaffer.schemas` | The path to the directory containing the graph schema files |
126-
| `gaffer.userId` | The user ID for the Tinkerpop graph |
127-
| `gaffer.dataAuths` | The data auths for the user to specify what operations can be performed |
128-
| `gaffer.operation.options` | Additional operation options that will be passed to the Tinkerpop graph variables in the form `key:value`
138+
| `gaffer.userId` | The default user ID for the Tinkerpop graph (see the [authentication section](#user-authentication)) |
139+
| `gaffer.dataAuths` | The default data auths for the user to specify what operations can be performed |
140+
| `gaffer.operation.options` | Default `Operation` options in the form `key:value` (this can be overridden per query, see [here](../../user-guide/query/gremlin/gremlin.md#custom-features)) |
129141

130-
#### Configuring the Gremlin Server
142+
### Configuring the Gremlin Server
131143

132144
The underlying Gremlin server can also be configured if required. The `gaffer-gremlin`
133145
image comes with an existing YAML configuration based on the example from the
@@ -155,7 +167,7 @@ uk.gov.gchq.gaffer.tinkerpop.gremlinplugin.GafferPopGremlinPlugin: {}
155167
See the [Tinkerpop docs](https://tinkerpop.apache.org/docs/current/reference/#gremlin-server)
156168
for more information on Gremlin server configuration.
157169
158-
##### User Authentication
170+
#### User Authentication
159171
160172
Full user authentication is possible with the Gremlin server using the framework
161173
provided by standard Tinkerpop. The GafferPop implementation provides a

docs/user-guide/query/gremlin/gremlin-limits.md

+9-13
@@ -6,30 +6,26 @@ but some features may also be yet to be implemented.
66

77
Current TinkerPop features not present in the GafferPop implementation:
88

9-
- Property index for allowing unseeded queries (unseeded queries run a `GetAllElements`).
9+
- Unseeded queries run a `GetAllElements` with a configured limit applied;
10+
this limit can be configured per query or will default to 5000.
1011
- Gaffer graphs are readonly to Gremlin queries.
11-
support this.
1212
- TinkerPop Graph Computer is not supported.
1313
- TinkerPop Transactions are not supported.
1414
- TinkerPop Lambdas are not supported.
1515

1616
Current known limitations or bugs:
1717

18-
- Proper user authentication is only available if using a Gremlin server to
19-
connect to the graph.
18+
- Proper user authentication is only available if using a Gremlin server and
19+
the `GafferPopAuthoriser` class.
2020
- Performance compared to standard Gaffer `OperationChain`s will likely be
2121
slower as multiple Gaffer `Operations` may be utilised to perform one Gremlin
2222
step.
23-
- The entity group `id` is reserved for an empty group containing only the
24-
vertex ID, this is currently used as a workaround for other limitations.
25-
- When you get the in or out Vertex directly off an Edge it will not contain any
26-
actual properties or be in correct group/label - it just returns a vertex in
27-
the `id` group. This is due to Gaffer allowing multiple entities to be
28-
associated with the source and destination vertices of an Edge.
2923
- The ID of an Edge follows a specific format that is made up of its source and
3024
destination IDs like `[source, dest]`. To use this in a seeded query you must
3125
format it like `g.E("[source, dest]")` or a list like
3226
`g.E(["[source1, dest1]","[source2, dest2]"])`
33-
- Issues seen using `hasKey()` and `hasValue()` in same query.
34-
- May experience issues using the `range()` query function.
35-
- May experience issues using the `where()` query function.
27+
- The entity group `id` is reserved for an empty group containing only the
28+
vertex ID, this is currently used as a workaround for other limitations.
29+
- Chaining `hasLabel()` calls together like `hasLabel("label1").hasLabel("label2")`
30+
will act like an OR rather than the AND used in standard Gremlin. This means you
31+
may get results back when you realistically shouldn't.
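The edge ID format described in the list above can be sketched as a small helper that builds the seeded query text. This is an illustrative sketch only: `edge_id` and `seeded_edge_query` are hypothetical names, and the code assumes the source and destination IDs contain no double quotes.

```python
def edge_id(source, dest):
    """Format an edge ID as `[source, dest]`."""
    return f"[{source}, {dest}]"


def seeded_edge_query(pairs):
    """Build the Gremlin text for a seeded edge query from (source, dest) pairs."""
    ids = [edge_id(source, dest) for source, dest in pairs]
    if len(ids) == 1:
        # Single seed: g.E("[source, dest]")
        return f'g.E("{ids[0]}")'
    # Multiple seeds are passed as a list of quoted IDs.
    quoted = ",".join(f'"{i}"' for i in ids)
    return f"g.E([{quoted}])"


if __name__ == "__main__":
    print(seeded_edge_query([("source", "dest")]))
    print(seeded_edge_query([("source1", "dest1"), ("source2", "dest2")]))
```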

docs/user-guide/query/gremlin/gremlin.md

+1
@@ -270,3 +270,4 @@ for Gaffer specific options:
270270
| --- | --- | --- |
271271
| `operationOptions` | `g.with("operationOptions", "gaffer.federatedstore.operation.graphIds:graphA").V()` | Allows passing options to the underlying Gaffer Operations, this is the same as the `options` field on a standard JSON query. |
272272
| `getAllElementsLimit` | `g.with("getAllElementsLimit", 100).V()` | Limits the number of elements returned if performing an unseeded query e.g. a `GetAllElements` operation. |
273+
| `hasStepFilterStage` | `g.with("hasStepFilterStage", "PRE_AGGREGATION").V()` | Controls the phase at which the filtering from a Gremlin `has()` step is applied to the results. |
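The options in the table above can be combined in one traversal by chaining `with()` steps. The sketch below builds the query text from a dictionary of options. It is a minimal illustration rather than a GafferPop API: `with_option` and `traversal` are hypothetical names, and JSON quoting is used as an approximation of Gremlin string literals.

```python
import json


def with_option(key, value):
    """Render one `with(key, value)` step; strings are quoted, numbers are not."""
    return f"with({json.dumps(key)}, {json.dumps(value)})"


def traversal(options, step="V()"):
    """Chain a `with()` step per option onto `g`, then append the final step."""
    parts = ["g"] + [with_option(k, v) for k, v in options.items()] + [step]
    return ".".join(parts)


if __name__ == "__main__":
    print(traversal({"getAllElementsLimit": 100,
                     "hasStepFilterStage": "PRE_AGGREGATION"}))
```

The resulting string can then be submitted through whichever Gremlin client you already use.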
