Commit 18ee1f3

author
Bharathi Ramana Joshi
committed
DataIntensive till leaderless
1 parent b6ff262 commit 18ee1f3

1 file changed, +45 -3 lines changed

dataIntensive.md

@@ -1,17 +1,59 @@
 # Replication
 - Many copies of data to scale horizontally and serve more reads
 - Ways of replication
-1. Leader-follower: writes happen first on leader, then propogated to
+1. Leader-follower: writes happen first on leader, then propagated to
 followers.
 2. Multi-leader
 3. Leaderless
 - Types of replication:
 1. Synchronous: leader sends write and blocks writes till follower ack
 2. Asynchronous: leader sends write and continues with further writes
-3. Semi-synchronous: one follower is sync, rest async. if leader fails, the
+3. Semi-synchronous: one follower is sync, rest async. If leader fails, the
 sync follower becomes leader.
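
The three replication types listed above can be sketched in a few lines of Python. This is only an illustrative toy, not code from the notes: the `Leader` and `Follower` classes, the `mode` strings, and the `flush` method (standing in for background log shipping) are all assumed names.

```python
from dataclasses import dataclass, field


@dataclass
class Follower:
    name: str
    log: list = field(default_factory=list)

    def apply(self, write):
        # Append the write and return True as the acknowledgement.
        self.log.append(write)
        return True


@dataclass
class Leader:
    followers: list
    mode: str = "async"  # hypothetical values: "sync", "async", "semi-sync"
    log: list = field(default_factory=list)
    pending: list = field(default_factory=list)  # (follower, write) not yet shipped

    def write(self, value):
        self.log.append(value)
        for i, follower in enumerate(self.followers):
            # sync: wait for every follower; semi-sync: wait for the first only.
            must_wait = self.mode == "sync" or (self.mode == "semi-sync" and i == 0)
            if must_wait:
                follower.apply(value)  # blocks until the follower acks
            else:
                self.pending.append((follower, value))  # ship later

    def flush(self):
        # Deliver queued asynchronous writes (stands in for background shipping).
        for follower, value in self.pending:
            follower.apply(value)
        self.pending.clear()
```

In semi-synchronous mode the first follower holds every acknowledged write, which is why it is the natural candidate to take over if the leader fails.
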
+## Leader-follower
 - Setting up new followers: snapshot leader, launch follower, request changes
 since snapshot
 - Node outage: catch up using on disk log
 - Leader failover:
-1.
+1. Failure detection: poll the leader. The poll timeout must tolerate load spikes, or a slow leader is declared dead.
+2. New leader selection: election or a controller node. The best candidate is the follower with the most up-to-date writes.
+3. Reconfigure the system to use the new leader: an old leader that comes back and still acts as leader causes split brain.
+- Use manual failover to circumvent the above subtleties.
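
Steps 1 and 2 of the failover list can be illustrated with a small sketch. The function names, the timeout value, and the use of a replicated log offset as the measure of "most up-to-date" are assumptions made for this example, not details from the notes.

```python
def leader_is_down(last_heartbeat, now, timeout_s):
    # The leader is presumed dead once it has been silent longer than the timeout.
    # A timeout that is too short misfires during load spikes.
    return now - last_heartbeat > timeout_s


def pick_new_leader(follower_offsets):
    """Pick the follower with the highest replicated log offset.

    follower_offsets maps a follower name to the last log offset it has applied.
    Ties are broken by name so that every node reaches the same decision.
    """
    return max(sorted(follower_offsets), key=lambda name: follower_offsets[name])
```

For instance, `pick_new_leader({"f1": 41, "f2": 57, "f3": 57})` selects `"f2"`: it shares the highest offset with `f3` and sorts first by name.
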
+- Replication log implementations: an append-only sequence of bytes recording every write to the db
+1. Statement-based: CRUD statements
+2. Write-ahead log (WAL) based: the storage engine's own log (e.g. SSTables/LSM-trees) tracks the
+sequence of bytes being edited in the db. Tightly coupled with the underlying storage engine.
+3. Logical (row) based: rows and columns affected.
+4. Trigger-based: execute custom application code (triggers & stored
+procedures) when changes to the db occur.
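
To make the logical (row-based) log concrete, and to tie it back to follower setup and catch-up earlier in this section, here is a minimal sketch. The entry shape `(op, key, row)` and the helper names are hypothetical; real systems use richer formats.

```python
def apply_entry(table, entry):
    # A logical log entry names the affected row, not the bytes on disk.
    op, key, row = entry
    if op == "delete":
        table.pop(key, None)
    else:  # "insert" and "update" both carry the full new row
        table[key] = row


def snapshot(table, log):
    # Copy the table together with the log offset the copy corresponds to.
    return dict(table), len(log)


def catch_up(follower_table, log, from_offset):
    # Replay every entry the follower has not seen yet; return its new offset.
    for entry in log[from_offset:]:
        apply_entry(follower_table, entry)
    return len(log)
```

A new follower starts from the snapshot and calls `catch_up` with the snapshot's offset; a follower returning from an outage does the same with the last offset it stored on disk.
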
+- Eventual consistency: if no new updates are made, all data replicas eventually converge
+to the same state.
+- Read-your-writes/Read-after-write consistency: a user's own writes should be immediately visible to that user's later reads.
+- Monotonic reads: a reader shouldn't see older data after having seen newer data.
+- Consistent prefix reads: a reader must see writes in the same order in which they were written.
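
One possible way to get read-your-writes and monotonic reads is to have the client remember the newest log position it has written or observed, and to skip replicas that lag behind it. The sketch below assumes this approach; `Replica`, `Session`, and the `position` counter are illustrative names, and other schemes (such as always reading one's own data from the leader) exist.

```python
class Replica:
    def __init__(self):
        self.position = 0  # number of writes applied so far
        self.data = {}

    def apply(self, key, value):
        self.data[key] = value
        self.position += 1


class Session:
    """Tracks the newest log position this client has written or observed."""

    def __init__(self):
        self.min_position = 0

    def note_write(self, position):
        # Read-your-writes: later reads must reflect at least this write.
        self.min_position = max(self.min_position, position)

    def read(self, replicas, key):
        for replica in replicas:
            if replica.position >= self.min_position:
                # Monotonic reads: remember how far forward we have seen.
                self.min_position = replica.position
                return replica.data.get(key)
        raise RuntimeError("no replica is fresh enough; retry or read from the leader")
```
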
+## Multi-leader
+- Multi-leader replication: concurrent writes on different leaders can conflict
+- Conflict avoidance: write application code such that write conflicts do not
+occur. E.g. all writes of a user go through the same node.
+- Converge conflicting writes:
+1. Last write wins (LWW)
+2. Prioritize nodes: the higher-priority node's write wins
+3. Merge the conflicting values
+4. Record the conflict and resolve it on a later read/write (possibly prompting the user)
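
The first three convergence strategies can be sketched as follows. The `Version` tuple, the numeric `node_priority` field, and keeping both values as a sorted list for the merge case are assumptions made for illustration only.

```python
from typing import NamedTuple


class Version(NamedTuple):
    timestamp: float
    node_priority: int  # hypothetical: larger number means higher priority
    value: str


def last_write_wins(a, b):
    # Newest timestamp wins; node priority breaks ties.
    # The losing write is silently discarded.
    return max(a, b, key=lambda v: (v.timestamp, v.node_priority))


def higher_priority_wins(a, b):
    # Ignore timestamps entirely and trust the higher-priority node.
    return max(a, b, key=lambda v: v.node_priority)


def merge_values(a, b):
    # Keep both values, in a deterministic order, for later resolution.
    return sorted({a.value, b.value})
```

Note that LWW relies on timestamps from different nodes being comparable, so clock skew can cause the "wrong" write to win.
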
+- Multi-leader replication topologies: how writes propagate between nodes.
+1. All-to-all
+2. Circular
+3. Star/tree
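
As a rough illustration of how the three topologies differ, the sketch below lists the (sender, receiver) hops a single write takes from its origin leader. The function names and the explicit `hub` parameter are hypothetical, and the model ignores failures and retries.

```python
def all_to_all(nodes, origin):
    # The origin sends the write directly to every other leader.
    return [(origin, n) for n in nodes if n != origin]


def circular(nodes, origin):
    # Each leader forwards to its successor; forwarding stops
    # before the write would return to the origin.
    i = nodes.index(origin)
    ring = [nodes[(i + k) % len(nodes)] for k in range(len(nodes))]
    return list(zip(ring, ring[1:]))


def star(nodes, origin, hub):
    # The write goes to the hub first, which fans it out to the rest.
    hops = [] if origin == hub else [(origin, hub)]
    hops += [(hub, n) for n in nodes if n not in (origin, hub)]
    return hops
```

With `nodes = ["A", "B", "C", "D"]`, `circular(nodes, "C")` yields the hops C to D, D to A, A to B, so one unavailable node on the ring delays every node after it.
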
+## Leaderless
+- Leaderless replication: reads and writes are sent in parallel to many nodes,
+and the response agreed on by a quorum is taken.
+- Read repair: if nodes return different values for a read, write the newest
+value back to the nodes that returned stale values.
+- Anti-entropy process: a background process checks for differences in data
+across nodes and copies missing data. No guarantees on data propagation delays.
+- Common issues:
+* Sloppy quorum: writes and reads may end up on different, non-overlapping nodes
+* Write conflicts
+* Concurrent reads and writes
+* Partially succeeded writes
+-
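
Quorum reads and writes with read repair can be sketched as below. This is a simplified model under stated assumptions: versions are plain integers supplied by the caller, `Node.up` simulates an outage, and the read contacts every reachable node rather than exactly `r` of them. With `n` nodes, choosing `w + r > n` makes every read quorum overlap every write quorum.

```python
class Node:
    def __init__(self):
        self.store = {}  # key -> (version, value)
        self.up = True


def quorum_write(nodes, key, version, value, w):
    # Send the write to every reachable node; succeed if at least w acked.
    acks = 0
    for node in nodes:
        if node.up:
            node.store[key] = (version, value)
            acks += 1
    # On failure the write still sits on the nodes that accepted it
    # (a partially succeeded write).
    return acks >= w


def quorum_read(nodes, key, r):
    replies = [(n, n.store.get(key, (0, None))) for n in nodes if n.up]
    if len(replies) < r:
        raise RuntimeError("read quorum not reached")
    newest = max((reply for _, reply in replies), key=lambda reply: reply[0])
    for node, reply in replies:
        if reply[0] < newest[0]:
            node.store[key] = newest  # read repair: fix the stale node
    return newest[1]
```

In this model a node that missed a write while it was down is only corrected when some later read touches that key, which is the gap the anti-entropy process is meant to cover.
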
