# Replication

- Keep many copies of the data to scale horizontally and serve more reads.
- Ways of replicating:
  1. Leader-follower: writes happen first on the leader, then are propagated to
     followers.
  2. Multi-leader
  3. Leaderless
- Types of replication:
  1. Synchronous: the leader sends the write and blocks further writes until the
     follower acks.
  2. Asynchronous: the leader sends the write and continues with further writes.
  3. Semi-synchronous: one follower is sync, the rest async. If the leader fails,
     the sync follower becomes the new leader.
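The semi-synchronous scheme above can be sketched in a few lines. This is an illustrative toy, not a real system's API: the `Leader` and `Follower` classes and their methods are invented for the example. The leader blocks on an ack from one designated sync follower and fire-and-forgets to the rest.

```python
class Follower:
    def __init__(self, name):
        self.name = name
        self.log = []

    def apply(self, entry):
        self.log.append(entry)
        return True  # ack the write

class Leader:
    def __init__(self, sync_follower, async_followers):
        self.log = []
        self.sync_follower = sync_follower
        self.async_followers = async_followers

    def write(self, entry):
        self.log.append(entry)
        # Synchronous path: block until the sync follower acks.
        if not self.sync_follower.apply(entry):
            raise RuntimeError("sync follower failed to ack")
        # Asynchronous path: send and continue; followers may lag.
        for f in self.async_followers:
            f.apply(entry)

sync_f = Follower("f1")
async_f = Follower("f2")
leader = Leader(sync_f, [async_f])
leader.write({"key": "a", "value": 1})
assert sync_f.log == leader.log  # sync follower is always up to date
```

Because the sync follower's log is guaranteed to match the leader's, it is the safe choice for promotion on leader failure.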

## Leader-follower

- Setting up new followers: snapshot the leader, launch the follower, request
  changes since the snapshot.
- Node outage: the node catches up using its on-disk log.
- Leader failover:
  1. Failure detection: polling. The poll frequency needs to account for load
     spikes (a slow response is not necessarily a dead leader).
  2. New leader selection: election, or a controller node. The best candidate is
     the follower with the most up-to-date writes.
  3. Reconfigure the system to use the new leader: the old leader coming back
     can cause split brain.
- Manual failover circumvents the subtleties above.
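The detection and selection steps above can be sketched as follows. This is a simplified illustration, not a real consensus protocol: the timeout value and the follower-state dict are invented for the example.

```python
import time

# Assumed timeout; must be generous enough to survive load spikes,
# or a merely slow leader gets declared dead.
HEARTBEAT_TIMEOUT = 5.0  # seconds

def leader_is_dead(last_heartbeat, now=None):
    now = time.monotonic() if now is None else now
    return (now - last_heartbeat) > HEARTBEAT_TIMEOUT

def elect_new_leader(followers):
    # followers: {node_name: last_applied_log_index}
    # Best candidate = follower with the most applied writes,
    # minimizing data loss on failover.
    return max(followers, key=followers.get)

followers = {"f1": 120, "f2": 157, "f3": 140}
assert leader_is_dead(last_heartbeat=0.0, now=10.0)
assert elect_new_leader(followers) == "f2"
```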
- Replication log implementations: an append-only sequence of bytes containing
  all writes to the db.
  1. Statement-based: ship the CRUD statements themselves.
  2. Write-Ahead Log (WAL) based: ship the exact sequence of bytes written to
     the db. Tightly coupled with the underlying storage engine (SSTables/LSM
     trees, B-trees).
  3. Logical (row-based): ship the rows and columns affected.
  4. Trigger-based: execute custom application code (triggers & stored
     procedures) when changes to the db occur.
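To make the statement-based vs. logical distinction concrete, here is a hypothetical single write in both formats (the schemas are invented for illustration). A statement containing a nondeterministic function like `NOW()` evaluates differently on each replica, which is why logical logs ship resolved row values instead.

```python
# Statement-based: the SQL text itself is replicated. NOW() is still
# unresolved, so each replica would compute its own timestamp.
statement_entry = "UPDATE users SET last_seen = NOW() WHERE id = 42;"

# Logical (row-based): the resolved row values are replicated, so every
# replica applies identical data.
logical_entry = {
    "table": "users",
    "op": "update",
    "key": {"id": 42},
    "new_values": {"last_seen": "2024-01-15T09:30:00Z"},
}

assert "NOW()" in statement_entry         # unresolved on the leader
assert "NOW()" not in str(logical_entry)  # resolved before shipping
```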
- Eventual consistency: if no further updates are made, all data replicas
  eventually converge to the same state.
- Read-your-writes / read-after-write consistency: a user's own writes should be
  immediately visible to that user's reads.
- Monotonic reads: a reader shouldn't see older data after having read newer data.
- Consistent prefix reads: reads must appear in the same order as the writes.
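One common way to get monotonic reads is to pin each user to one replica by hashing the user id, so a user never reads from a replica that is further behind than one they already read from. A minimal sketch, assuming a static replica list (the names below are invented):

```python
import hashlib

REPLICAS = ["replica-0", "replica-1", "replica-2"]  # assumed static membership

def replica_for_user(user_id: str) -> str:
    # Deterministic hash: the same user always reads from the same replica.
    digest = hashlib.sha256(user_id.encode()).hexdigest()
    return REPLICAS[int(digest, 16) % len(REPLICAS)]

assert replica_for_user("alice") == replica_for_user("alice")
assert replica_for_user("bob") in REPLICAS
```

The trade-off is that if the pinned replica fails, the user must be rerouted, and monotonicity can break across that switch.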
## Multi-leader

- Multi-leader replication introduces write conflicts.
- Conflict avoidance: write application code such that write conflicts do not
  occur, e.g. all writes of a given user go through the same node.
- Converging conflicting writes:
  1. Last write wins (LWW)
  2. Prioritize nodes; the higher-priority node's write wins
  3. Merge the conflicting values
  4. Record the conflict, fix on a later read/write (possibly prompting the user)
- Multi-leader replication topologies: how writes propagate between nodes.
  1. All-to-all
  2. Circular
  3. Star/tree
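Last write wins, the first strategy above, can be sketched in one function, assuming every write carries a timestamp. Note that LWW silently discards the losing write, which is exactly why the other strategies exist.

```python
def resolve_lww(versions):
    # versions: list of (timestamp, value) pairs for the same key,
    # written concurrently on different leaders.
    # Keep the value with the highest timestamp; all others are dropped.
    return max(versions, key=lambda v: v[0])[1]

conflicting = [(1700000005, "from leader A"), (1700000009, "from leader B")]
assert resolve_lww(conflicting) == "from leader B"
```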
## Leaderless

- Leaderless replication: reads and writes are sent in parallel to many nodes,
  and the quorum response is taken.
- Read repair: if different nodes return different values for a read, correct
  the nodes holding stale values.
- Anti-entropy process: a background process checks for differences in data
  across nodes and copies missing data. No guarantees on propagation delay.
- Common issues:
  * Sloppy quorum: writes and reads end up on different, non-overlapping nodes
  * Write conflicts
  * Concurrent reads and writes
  * Partially succeeded writes
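The quorum read and read-repair ideas above can be sketched together. This is a toy, assuming the usual quorum condition w + r > n (so read and write sets overlap) and modeling each replica as a dict of key → (version, value):

```python
def quorum_ok(n, w, r):
    # Overlap guarantee: at least one queried replica saw the latest write.
    return w + r > n

def quorum_read(replicas, key, r):
    # Query r replicas; take the value with the highest version number.
    responses = [(node, node.get(key, (0, None))) for node in replicas[:r]]
    _, (best_version, best_value) = max(responses, key=lambda x: x[1][0])
    # Read repair: push the newest value back to any stale replica we queried.
    for node, (version, _) in responses:
        if version < best_version:
            node[key] = (best_version, best_value)
    return best_value

n1 = {"x": (2, "new")}
n2 = {"x": (1, "old")}   # stale replica
n3 = {}
assert quorum_ok(n=3, w=2, r=2)
assert quorum_read([n1, n2, n3], "x", r=2) == "new"
assert n2["x"] == (2, "new")  # stale replica repaired as a side effect
```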