./dotest.sh TestElectionFollowerComesBack- higher term = higher priority
- does not send any messages
- only replies to messages from leader or candidate
- starts election by incrementing term and voting for itself
- sends RequestVote RPC to all other servers
- waits for votes from majority of servers
- if it receives votes from majority, it becomes leader
- if it receives RequestVote RPC from a higher term, it becomes follower
- sends AppendEntries RPC to all followers (heartbeat or log replication)
- client sends command to leader
- leader appends command to its log
- leader sends AppendEntries RPC to followers with new log entry
- followers append entry to their log and send back success response
- leader waits for majority of followers to respond
- if majority respond, leader commits the entry and sends CommitEntry to commitChan
- leader sends next log entries to followers
- leader sends updated commit index to followers, who will then mark these entries as committed and will send them on the commit channel
- prevent a candidate from winning an election unless its log is at least as up-to-date as a majority of peers in the cluster
- RV args:
lastLogIndex,lastLogTerm, followers compare these fields to their own and decide whether the candidate is sufficiently up-to-date to be elected - this also prevents
runawayleaders from being elected, i.e. leaders that have no log entries or have stale log entries but higher term (e.g. they are separated due to network partition then rejoin)
- needs only to persist currentTerm, votedFor, and log entries
- Raft =
at least oncedelivery - commands should have unique IDs to prevent duplicates
- kvclient sends HTTP request to kvservice
- tries all services until it gets a response from leader service
- kvservice handles HTTP request: get, put, cas
- submits directly to raft log (if it's leader)
- if not leader, returns error
- kvservice runs updater goroutine to update datastore when raft commits
- kvclient waits for commit on the subscription channel
- if it's our command, all is good, else = lost leadership, return error to client
Note: get, put, cas are idempotent, so it's safe to retry
suppose we want to add append command, which is not idempotent
- currently, client retries command on many services, so it's not exactly once
- solve by adding serial no. to each command
- at most once: logs, telemetry
- added client retries, becomes at least once
- at least once
- added de-duplication (command ID), becomes exactly once
- exactly once
If you can get at-least-once delivery, you can build exactly-once on top of that
even w/o append, we still should ensure exactly-once delivery e.g. to prevent lost updates
