@@ -9,12 +9,37 @@ The operator itself is built with the [Operator framework](https://github.com/op
9
9
10
10
It is inspired by [spotahome/redis-operator](https://github.com/spotahome/redis-operator).
11
11
12
+ ![Redis Cluster atop Kubernetes](/static/redis-sentinel-readme.png)
13
+
14
+ * Create a statefulset to manage Redis instances (masters and replicas). Each Redis instance has a default PreStop script that can trigger a failover if the master is down.
15
+ * Create a statefulset to manage Sentinel instances that control the Redis nodes. Each Sentinel instance has a default ReadinessProbe script that checks whether the Sentinel's status is ok. When a Sentinel pod is not ready, it is removed from Service load balancers.
16
+ * Create a Service and a headless Service for the Sentinel statefulset.
17
+ * Create a headless Service for the Redis statefulset.
18
+
19
+ Table of Contents
20
+ =================
21
+
22
+ * [redis-operator](#redis-operator)
23
+ * [Overview](#overview)
24
+ * [Prerequisites](#prerequisites)
25
+ * [Features](#features)
26
+ * [Quick Start](#quick-start)
27
+ * [Deploy redis operator](#deploy-redis-operator)
28
+ * [Deploy a sample redis cluster](#deploy-a-sample-redis-cluster)
29
+ * [Resize an Redis Cluster](#resize-an-redis-cluster)
30
+ * [Create redis cluster with password](#create-redis-cluster-with-password)
31
+ * [Dynamically changing redis config](#dynamically-changing-redis-config)
32
+ * [Persistence](#persistence)
33
+ * [Custom SecurityContext](#custom-securitycontext)
34
+ * [Cleanup](#cleanup)
35
+ * [Automatic failover details](#automatic-failover-details)
36
+
12
37
## Prerequisites
13
38
14
- * go version v1.12 +.
15
- * Access to a Kubernetes v1.11.3 + cluster.
39
+ * Go version v1.13+.
40
+ * Access to a Kubernetes v1.13.10+ cluster.
16
41
17
- ## Capabilities
42
+ ## Features
18
43
In addition to the sentinel's own capabilities, redis-operator can:
19
44
20
45
* Push events and update status to Kubernetes when resources change state
@@ -28,7 +53,7 @@ In addition to the sentinel's own capabilities, redis-operator can:
28
53
## Quick Start
29
54
30
55
### Deploy redis operator
31
- Build and push the redis-operator and e2e test image
56
+ Build and push the redis-operator image
32
57
```
33
58
$ make REGISTRY=your_public_registry build-image
34
59
$ make REGISTRY=your_public_registry push
@@ -92,33 +117,32 @@ Verify that the cluster instances and its components are running.
92
117
```
93
118
$ kubectl get rediscluster
94
119
NAME SIZE STATUS AGE
95
- test 3 Healthy 22h
120
+ test 3 Healthy 4m9s
96
121
97
122
$ kubectl get all -l app.kubernetes.io/managed-by=redis-operator
98
- NAME READY STATUS RESTARTS AGE
99
- pod/redis-cluster-test-0 1/1 Running 0 22h
100
- pod/redis-cluster-test-1 1/1 Running 0 22h
101
- pod/redis-cluster-test-2 1/1 Running 0 22h
102
- pod/redis-sentinel-test-7cbd85785b-6llfp 1/1 Running 0 22h
103
- pod/redis-sentinel-test-7cbd85785b-ggqw4 1/1 Running 0 22h
104
- pod/redis-sentinel-test-7cbd85785b-nxxfc 1/1 Running 0 22h
105
-
106
- NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
107
- service/redis-sentinel-test ClusterIP xxxxxxxxxx <none> 26379/TCP 22h
108
-
109
- NAME READY UP-TO-DATE AVAILABLE AGE
110
- deployment.apps/redis-sentinel-test 3/3 3 3 22h
111
-
112
- NAME DESIRED CURRENT READY AGE
113
- replicaset.apps/redis-sentinel-test-7cbd85785b 3 3 3 22h
114
-
115
- NAME READY AGE
116
- statefulset.apps/redis-cluster-test 3/3 22h
123
+ NAME READY STATUS RESTARTS AGE
124
+ pod/redis-cluster-test-0 1/1 Running 0 4m16s
125
+ pod/redis-cluster-test-1 1/1 Running 0 3m22s
126
+ pod/redis-cluster-test-2 1/1 Running 0 2m40s
127
+ pod/redis-sentinel-test-0 1/1 Running 0 4m16s
128
+ pod/redis-sentinel-test-1 1/1 Running 0 81s
129
+ pod/redis-sentinel-test-2 1/1 Running 0 18s
130
+
131
+ NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
132
+ service/redis-cluster-test ClusterIP None <none> 6379/TCP 4m16s
133
+ service/redis-sentinel-headless-test ClusterIP None <none> 26379/TCP 4m16s
134
+ service/redis-sentinel-test ClusterIP 10.22.22.34 <none> 26379/TCP 4m16s
135
+
136
+ NAME READY AGE
137
+ statefulset.apps/redis-cluster-test 3/3 4m16s
138
+ statefulset.apps/redis-sentinel-test 3/3 4m16s
117
139
```
118
140
119
141
* redis-cluster-<NAME>: Redis statefulset
120
- * redis-sentinel-<NAME>: Sentinel deployment
142
+ * redis-sentinel-<NAME>: Sentinel statefulset
121
143
* redis-sentinel-<NAME>: Sentinel service
144
+ * redis-sentinel-headless-<NAME>: Sentinel headless service
145
+ * redis-cluster-<NAME>: Redis headless service
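The naming scheme above can be sketched in Go as follows. This is only an illustration: the `Names` struct and the `ResourceNames` function are hypothetical and are not taken from the operator's source; only the name patterns come from the list above.

```go
package main

import "fmt"

// Names holds the object names derived from a RedisCluster name.
// Hypothetical type for illustration; not part of the operator's API.
type Names struct {
	RedisStatefulSet        string
	RedisHeadlessService    string
	SentinelStatefulSet     string
	SentinelService         string
	SentinelHeadlessService string
}

// ResourceNames applies the patterns listed above to a cluster name.
func ResourceNames(cluster string) Names {
	return Names{
		RedisStatefulSet:        "redis-cluster-" + cluster,
		RedisHeadlessService:    "redis-cluster-" + cluster,
		SentinelStatefulSet:     "redis-sentinel-" + cluster,
		SentinelService:         "redis-sentinel-" + cluster,
		SentinelHeadlessService: "redis-sentinel-headless-" + cluster,
	}
}

func main() {
	// "test" is the sample cluster name used in the kubectl output above.
	fmt.Printf("%+v\n", ResourceNames("test"))
}
```

For the sample cluster `test`, this yields the same names that appear in the `kubectl get all` output, such as `redis-sentinel-headless-test`.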
122
146
123
147
Describe the Redis cluster to view its events and status:
124
148
```
@@ -277,18 +301,23 @@ spec:
277
301
cpu: 50m
278
302
memory: 30Mi
279
303
size: 3
304
+
280
305
# When disablePersistence is set to false, the following configurations are set automatically:
306
+
307
+ # disablePersistence: false
308
+ # config["save"] = "900 1 300 10"
281
309
# config["appendonly"] = "yes"
282
- # config["auto-aof-rewrite-min-size"] = "1gb "
310
+ # config["auto-aof-rewrite-min-size"] = "536870912"
283
311
# config["repl-diskless-sync"] = "yes"
284
- # config["repl-backlog-size"] = "60mb"
285
- # config["repl-diskless-sync-delay"] = "5"
312
+ # config["repl-backlog-size"] = "62914560"
286
313
# config["aof-load-truncated"] = "yes"
287
314
# config["stop-writes-on-bgsave-error"] = "no"
315
+
288
316
# When disablePersistence is set to true, the following configurations are set automatically:
317
+
318
+ # disablePersistence: true
289
319
# config["save"] = ""
290
320
# config["appendonly"] = "no"
291
- disablePersistence: false
292
321
storage:
293
322
# By default, the persistent volume claims are deleted when the Redis Cluster is deleted.
294
323
# If this is not the expected usage, a keepAfterDeletion flag can be added under the storage section
@@ -344,4 +373,37 @@ $ kubectl delete -f deploy/namespace/role.yaml
344
373
$ kubectl delete -f deploy/namespace/role_binding.yaml
345
374
$ kubectl delete -f deploy/service_account.yaml
346
375
$ kubectl delete -f deploy/crds/redis_v1beta1_rediscluster_crd.yaml
347
- ```
376
+ ```
377
+
378
+ ## Automatic failover details
379
+
380
+ Redis-operator builds a **Highly Available Redis cluster with Sentinel**. Sentinel continuously checks the MASTER and SLAVE
381
+ instances in the Redis cluster to verify that they are working as expected. If Sentinel detects a failure in the
382
+ MASTER node of a given cluster, it starts a failover process: it picks a SLAVE
383
+ instance and promotes it to MASTER. The remaining SLAVE instances are then automatically reconfigured
384
+ to use the new MASTER instance.
385
+
386
+ The operator guarantees the following:
387
+ * Only one Redis instance acts as master in a cluster
388
+ * The number of Redis instances (masters and replicas) equals the number set in the RedisCluster specification
389
+ * The number of Sentinels equals the number set in the RedisCluster specification
390
+ * All Redis slaves have the same master
391
+ * All Sentinels point to the same Redis master
392
+ * Sentinel has no dead nodes
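As a rough illustration of these guarantees, the sketch below checks most of them against an in-memory snapshot. The `RedisNode` and `ClusterView` types and the `CheckInvariants` function are hypothetical and are not the operator's actual implementation; the "no dead nodes" guarantee is not modeled here.

```go
package main

import "fmt"

// RedisNode is a hypothetical snapshot of one Redis instance.
type RedisNode struct {
	Addr     string
	IsMaster bool
	MasterOf string // address of the master a replica follows; empty for a master
}

// ClusterView is a hypothetical snapshot of the whole cluster.
type ClusterView struct {
	WantRedis       int         // Redis size from the RedisCluster spec
	WantSentinels   int         // Sentinel size from the RedisCluster spec
	Redis           []RedisNode // observed Redis instances
	SentinelMasters []string    // master address each Sentinel monitors
}

// CheckInvariants returns the first violated guarantee, or nil if none is violated.
func CheckInvariants(v ClusterView) error {
	if len(v.Redis) != v.WantRedis {
		return fmt.Errorf("want %d Redis instances, found %d", v.WantRedis, len(v.Redis))
	}
	if len(v.SentinelMasters) != v.WantSentinels {
		return fmt.Errorf("want %d Sentinels, found %d", v.WantSentinels, len(v.SentinelMasters))
	}
	master, masters := "", 0
	for _, n := range v.Redis {
		if n.IsMaster {
			master = n.Addr
			masters++
		}
	}
	if masters != 1 {
		return fmt.Errorf("want exactly one master, found %d", masters)
	}
	for _, n := range v.Redis {
		if !n.IsMaster && n.MasterOf != master {
			return fmt.Errorf("replica %s follows %q, not %q", n.Addr, n.MasterOf, master)
		}
	}
	for i, m := range v.SentinelMasters {
		if m != master {
			return fmt.Errorf("sentinel %d monitors %q, not %q", i, m, master)
		}
	}
	return nil
}

func main() {
	view := ClusterView{
		WantRedis:     3,
		WantSentinels: 3,
		Redis: []RedisNode{
			{Addr: "10.0.0.1", IsMaster: true},
			{Addr: "10.0.0.2", MasterOf: "10.0.0.1"},
			{Addr: "10.0.0.3", MasterOf: "10.0.0.1"},
		},
		SentinelMasters: []string{"10.0.0.1", "10.0.0.1", "10.0.0.1"},
	}
	fmt.Println("violation:", CheckInvariants(view))
}
```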
393
+
394
+ However, Kubernetes pods are volatile: they can be deleted and recreated, a pod's IP changes when the pod is recreated,
395
+ and the old IP can be recycled and assigned to another pod.
396
+ Unfortunately, Sentinel cannot remove entries from its in-memory Sentinel list or Redis list when pod IPs change.
397
+ This happens because a Sentinel node has no way to deregister itself from the Sentinel cluster before it dies,
398
+ so the Sentinel node list grows without control.
399
+
400
+ To ensure that Sentinel works properly, the operator sends a **RESET** (`SENTINEL RESET *`) signal to the Sentinel nodes
401
+ one by one (if no failover is running at that moment).
402
+ After a `SENTINEL RESET mastername` command, Sentinels refresh the list of replicas within the next 10 seconds, adding only the
403
+ ones listed as correctly replicating in the current master's INFO output.
404
+ During this refresh time, the `SENTINEL slaves <master name>` command returns no results, so the operator sends the
405
+ RESET signal to one Sentinel at a time and waits until that Sentinel's status is ok (it monitors the correct master and has slaves).
406
+ Additionally, each Sentinel instance has a default ReadinessProbe script that checks whether the Sentinel's status is ok.
407
+ When a Sentinel pod is not ready, it is removed from Service load balancers.
408
+ The operator also creates a headless Service for the Sentinel statefulset. If you cannot get a result from the `SENTINEL slaves <master name>` command,
409
+ you can try polling the headless domain.