Database Replication

Definition

  • Database Replication is the process of copying and maintaining database objects, such as tables or entire databases, across multiple servers to improve availability, fault tolerance, and read scalability.
  • Replicas (or slave nodes) can serve read queries, while the primary (master node) handles write queries.

Why Replication is Needed

  • High Availability: Ensures system uptime even if one node fails.
  • Load Distribution: Offloads read queries from the master to replicas.
  • Disaster Recovery: Maintains copies of data across locations.
  • Data Locality: Place replicas closer to users to reduce latency.

Types of Replication

1. Master-Slave Replication

  • Mechanism:

    • One master node handles all writes.
    • One or more slave nodes replicate data from the master, usually asynchronously.
    • Slaves handle read queries to reduce load on the master.
  • Advantages:

    • Simple to set up.
    • Improves read scalability.
  • Disadvantages:

    • Master is a single point of failure for writes.
    • Replication lag can cause read-after-write inconsistency.
  • Use Cases:

    • Read-heavy websites, e.g., blog platforms, where reads far outnumber writes.
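
The read/write split described above can be sketched as a small router. This is a minimal illustration, not a production proxy: the class name `ReplicatedRouter`, the node names, and the list of write verbs are all assumptions made for the example, and the router only inspects the first keyword of each statement.

```python
import itertools


class ReplicatedRouter:
    """Hypothetical helper: sends writes to the master, spreads reads over slaves."""

    # Assumed, non-exhaustive list of statement verbs treated as writes.
    WRITE_VERBS = ("INSERT", "UPDATE", "DELETE", "CREATE", "ALTER", "DROP")

    def __init__(self, master, slaves):
        self.master = master
        # Round-robin over slaves; fall back to the master if there are none.
        self._read_cycle = itertools.cycle(slaves or [master])

    def route(self, sql):
        """Return the name of the node that should run this statement."""
        verb = sql.lstrip().split(None, 1)[0].upper()
        if verb in self.WRITE_VERBS:
            return self.master
        return next(self._read_cycle)


router = ReplicatedRouter("master", ["slave-1", "slave-2"])
print(router.route("INSERT INTO posts VALUES (1, 'hello')"))  # master
print(router.route("SELECT * FROM posts"))                    # slave-1
print(router.route("SELECT * FROM posts"))                    # slave-2
```

Because replication is usually asynchronous, a read routed to a slave right after a write may not see that write yet; applications that need read-after-write behavior often send those specific reads to the master.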

2. Master-Master Replication

  • Mechanism:

    • Multiple nodes act as masters and handle both read and write queries.
    • Each master synchronizes data with others to maintain consistency.
  • Advantages:

    • High availability for both reads and writes.
    • No single point of failure for writes.
  • Disadvantages:

    • Conflict resolution required for concurrent writes.
    • More complex setup and maintenance.
  • Use Cases:

    • Applications needing global writes, e.g., distributed user accounts, collaborative platforms.
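
One common way to resolve concurrent writes is last-write-wins (LWW). The sketch below assumes each write carries a timestamp and the ID of the node that accepted it; `Versioned`, `resolve_lww`, and `merge` are hypothetical names for this example. LWW is only one possible strategy, and it silently discards the losing write.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Versioned:
    value: str
    timestamp: float  # time of the write (assumed comparable across nodes)
    node_id: str      # used only to break timestamp ties deterministically


def resolve_lww(a, b):
    """Return the winning version: newest timestamp, then highest node_id."""
    return max(a, b, key=lambda v: (v.timestamp, v.node_id))


def merge(local, remote):
    """Merge two masters' key/value states so both sides converge."""
    merged = dict(local)
    for key, remote_version in remote.items():
        if key in merged:
            merged[key] = resolve_lww(merged[key], remote_version)
        else:
            merged[key] = remote_version
    return merged


master_a = {"user:1:email": Versioned("a@example.com", 100.0, "A")}
master_b = {"user:1:email": Versioned("b@example.com", 105.0, "B")}
print(merge(master_a, master_b)["user:1:email"].value)  # b@example.com
print(merge(master_b, master_a)["user:1:email"].value)  # b@example.com
```

Merging in either direction gives the same result, which is what lets both masters converge. The approach depends on reasonably synchronized clocks; with clock skew, an older write can win.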

3. Synchronous vs. Asynchronous Replication

| Type | Description | Pros | Cons |
|------|-------------|------|------|
| Synchronous | Write is committed only after all replicas confirm. | Strong consistency, safe writes. | Slower, higher latency. |
| Asynchronous | Master commits immediately; replicas update later. | Fast writes, better performance. | Risk of stale reads, potential data loss if master fails. |
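
The difference between the two modes can be seen in a toy in-memory model. This is a sketch under simplifying assumptions: `Replica`, `Master`, and `ship_backlog` are hypothetical names, there is no network, and "shipping" the backlog stands in for the background replication stream.

```python
class Replica:
    """A slave node that applies entries it receives from the master."""

    def __init__(self, name):
        self.name = name
        self.log = []

    def apply(self, entry):
        self.log.append(entry)


class Master:
    def __init__(self, replicas, synchronous):
        self.replicas = replicas
        self.synchronous = synchronous
        self.log = []
        self.backlog = []  # entries committed locally but not yet replicated

    def write(self, entry):
        self.log.append(entry)
        if self.synchronous:
            # Synchronous: every replica applies the entry before we return.
            for replica in self.replicas:
                replica.apply(entry)
        else:
            # Asynchronous: return immediately; replicas catch up later.
            self.backlog.append(entry)
        return "committed"

    def ship_backlog(self):
        """Simulate the background replication stream catching up."""
        for entry in self.backlog:
            for replica in self.replicas:
                replica.apply(entry)
        self.backlog.clear()


replica = Replica("replica-1")
master = Master([replica], synchronous=False)
master.write("SET x = 1")
print(replica.log)  # [] (a read from the replica here would be stale)
master.ship_backlog()
print(replica.log)  # ['SET x = 1']
```

If the asynchronous master failed before `ship_backlog()` ran, the entry would exist only in its own log, which is the data-loss risk listed in the table.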

4. Semi-Synchronous Replication

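  • Mechanism: Master waits for at least one replica to acknowledge receipt of the write before reporting the commit to the client; the remaining replicas catch up asynchronously.
  • Advantages: Balances latency and durability. Writes are faster than with fully synchronous replication, and an acknowledged write survives a master failure because at least one replica already has it.
  • Use Cases: Systems where acknowledged writes must not be lost and some read-after-write consistency is needed, without full synchronous overhead.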
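
The latency trade-off between the three modes can be made concrete with simple arithmetic. The sketch below assumes the master contacts all replicas in parallel, so a synchronous commit waits for the slowest replica and a semi-synchronous commit waits for the fastest one. The function name and the millisecond figures are illustrative assumptions, not measurements.

```python
def commit_latency_ms(local_ms, replica_rtts_ms, mode):
    """Estimate commit latency, assuming replicas are contacted in parallel."""
    if mode == "async":
        return local_ms                          # no replica is waited on
    if mode == "semi-sync":
        return local_ms + min(replica_rtts_ms)   # first acknowledgement
    if mode == "sync":
        return local_ms + max(replica_rtts_ms)   # every acknowledgement
    raise ValueError(f"unknown mode: {mode}")


# Hypothetical cluster: one nearby replica, two in remote regions.
rtts = [5, 40, 120]
for mode in ("async", "semi-sync", "sync"):
    print(mode, commit_latency_ms(2, rtts, mode))
# async 2
# semi-sync 7
# sync 122
```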

Replication Topologies

  1. Single Master, Multiple Slaves

    • Simple, common for read-heavy workloads.
    • Example: MySQL replication setup.
  2. Multi-Master Cluster

    • All nodes can handle reads/writes; conflicts resolved via timestamps or versioning.
    • Example: CouchDB, Galera Cluster.
  3. Ring Replication

    • Each node replicates to the next node(s) in a logical ring. Cassandra, for example, arranges nodes on a token ring and places a key's replicas on successive nodes around it.
  4. Star Replication

    • Central master replicates to multiple slaves; slaves don’t replicate to each other.
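
Ring placement can be sketched as follows: hash each node onto a ring, hash the key, and store copies on the first node at or after the key's position plus the nodes that follow it. This is a simplified illustration loosely inspired by token-ring systems, not the actual algorithm of any particular database. The `Ring` class, the use of MD5 as the hash, and the node names are assumptions for the example.

```python
import bisect
import hashlib


def _token(value):
    """Map a string to a position on the ring (MD5 chosen only for determinism)."""
    return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)


class Ring:
    def __init__(self, nodes):
        # Sort nodes by their token so ring order is well defined.
        self._ring = sorted((_token(n), n) for n in nodes)
        self._tokens = [t for t, _ in self._ring]

    def replicas_for(self, key, replication_factor=3):
        """Return the owner of `key` followed by the next nodes on the ring."""
        count = min(replication_factor, len(self._ring))
        # First node whose token is greater than the key's token, wrapping around.
        start = bisect.bisect_right(self._tokens, _token(key)) % len(self._ring)
        return [self._ring[(start + i) % len(self._ring)][1] for i in range(count)]


ring = Ring(["node-a", "node-b", "node-c", "node-d", "node-e"])
print(ring.replicas_for("user:42"))  # three consecutive nodes on the ring
```

The same key always maps to the same nodes, and losing one node only affects the keys whose replica sets included it.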

Challenges in Replication

  • Replication Lag: Slaves might not have the latest data immediately.
  • Conflict Resolution: Needed in multi-master setups.
  • Data Consistency: Choosing between eventual and strong consistency, and ensuring that replicas actually converge.
  • Network Partitioning: Can cause split-brain scenarios.
  • Monitoring & Failover: Detecting failed nodes and promoting replicas to masters.
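
Two of these challenges, split-brain and failover, are often handled together: promote a replica only when a majority of nodes agree the master is down, and pick the replica that has applied the most data. The sketch below assumes each replica reports a health flag and an `applied_position` (a monotonically increasing log position); all function and field names are hypothetical.

```python
def has_quorum(votes_master_down, cluster_size):
    """True only if a strict majority of nodes agree the master is unreachable."""
    return votes_master_down > cluster_size // 2


def choose_new_master(replicas):
    """Pick the healthy replica that has applied the most of the master's log."""
    healthy = [r for r in replicas if r["healthy"]]
    if not healthy:
        raise RuntimeError("no healthy replica available for promotion")
    return max(healthy, key=lambda r: r["applied_position"])["name"]


def failover(replicas, votes_master_down, cluster_size):
    """Return the replica to promote, or None if there is no quorum."""
    if not has_quorum(votes_master_down, cluster_size):
        return None  # a minority partition must not promote (avoids split-brain)
    return choose_new_master(replicas)


replicas = [
    {"name": "replica-1", "healthy": True, "applied_position": 1040},
    {"name": "replica-2", "healthy": True, "applied_position": 1052},
    {"name": "replica-3", "healthy": False, "applied_position": 1060},
]
print(failover(replicas, votes_master_down=3, cluster_size=4))  # replica-2
print(failover(replicas, votes_master_down=2, cluster_size=4))  # None
```

Choosing the most advanced healthy replica minimizes lost writes under asynchronous replication, but any writes that never left the failed master are still lost.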

Best Practices

  • Monitor replication lag and alert when it exceeds a defined threshold.
  • Use semi-synchronous replication for critical writes.
  • Implement automated failover for high availability.
  • Choose replication topology based on read/write ratio and geographical distribution.
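  • Regularly back up data, even with replication: replicas faithfully copy accidental deletes and corruption, so they are not a substitute for backups.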
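
Lag monitoring can be sketched as comparing the master's latest commit time with the time of the last change each replica has applied. The function names and the 5-second and 30-second thresholds below are assumptions for illustration; real thresholds depend on how stale a read the application can tolerate.

```python
def classify_lag(lag_seconds, warn_after=5.0, critical_after=30.0):
    """Map a lag value to an alert level (thresholds are illustrative)."""
    if lag_seconds >= critical_after:
        return "critical"
    if lag_seconds >= warn_after:
        return "warning"
    return "ok"


def lag_report(master_commit_time, replica_applied_times):
    """Return {replica_name: (lag_seconds, status)} for every replica."""
    report = {}
    for name, applied_time in replica_applied_times.items():
        # Clamp at zero in case of small clock differences between hosts.
        lag = max(0.0, master_commit_time - applied_time)
        report[name] = (lag, classify_lag(lag))
    return report


report = lag_report(
    master_commit_time=1000.0,
    replica_applied_times={"replica-1": 999.0, "replica-2": 988.0, "replica-3": 950.0},
)
for name, (lag, status) in report.items():
    print(name, lag, status)
# replica-1 1.0 ok
# replica-2 12.0 warning
# replica-3 50.0 critical
```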

Summary Table

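| Replication Type | Read Scalability | Write Scalability | Consistency | Complexity |
|------------------|------------------|-------------------|-------------|------------|
| Master-Slave | High | Low | Eventual | Low |
| Master-Master | High | High | Depends on conflict resolution | High |
| Synchronous | Moderate | Low | Strong | Moderate |
| Asynchronous | High | High | Eventual | Low |
| Semi-Synchronous | High | Moderate | Mostly strong | Moderate |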