Setup

Run commands

make build: Builds the project
make node: Runs the node http server at port 3000. Builds the project as well
make job: (after make node) Runs job controller to load the epaxos library, connect to node and starts epaxos
make pre_commit: Runs all the CI/CD checks (formatting, linting, etc)

Graphs generation

Edit epaxos.toml to set client workload parameters like target_rps
Run make node_log LOG="logfile_{rps}rps.log" here fill rps with whatever integer rps value you set in toml.
- This will make the logs in logs/ dir
Repeat the above for different rps values
Run python3 analyze_results.py. Requires matplotlib.

CI/CD check commands

# Check formatting issues
cargo fmt --all -- --check

# Run clippy (linter) for post-compilation code suggestions and fixes
cargo clippy -- -D warnings

Auto-fix commands for the above checks

# Format entire codebase
cargo fmt --all

# Attempts to fix the issues suggested by clippy earlier (doesn't do all the fixes)
cargo clippy --fix --allow-dirty --allow-staged -- -A warnings

System design

Message types

Client messages (see common.rs for definitions)
- ClientRequest (wraps Command::Get or Command::Set)
- ClientResponse (wraps CommandResult::Get or CommandResult::Set)
Base server messages
- PreAccept
- PreAcceptOk
- Accept
- AcceptOk
- Commit
Failure messages
- Prepare
- PrepareOk

Whats left:

does execution require message type?

Server state variables

`cmds`

Struct fields for each array element:
- Command ($\gamma$) - The write operation
- Sequence number seq
  - To break cyclic dependencies
  - ???
- Dependency list deps
  - List of (replica,instance) pairs whose commands interferes with gamma
- Status - variants:
  - PreAccepted
  - Accepted
  - Committed

`instance_num`

Counter which stores latest instance number
Incremented each time I get a write request

`replica_list`

List of all active replicas

New possible additions

`quorum_ctr`

1d vector indexed by instance number. stores int counter for quorum checking
Used for PreAcceptOk, AcceptOk

`app_meta`

Actually this might be needed in cmds. In explicit prepare, other replicas reply to client on behalf of failed leader

Application layer metadata for each instance. 1d vector indexed by instance number
consists of:
- client_id - which client sent this msg
- msg_id - assigned by client

Intermediate data types

Instance
- fields: 2
  - replica
  - inst_num
- Used in Interf set and other places
Command
- variants:
  - Set(id, var, val) (Write operation)
  - Get(id, var) (Read operation)
  - any other commands we're gonna support
Response
- variants:
  - Set(id, var) (Write response)
  - Get(id, var, val) (Read response)
  - any other commands we're gonna support

Actor initialization variables

What to put in payload in toml file

replica name - my own name
list of all replicas names

Message behaviors

Template: For each message:

Message payload
What state variables are being modified
What is the output message
Where to send output message (any state required for this)
Invariants

Base server receiving messages

ReadRequest

similar to write request, but once we reach commit phase for this read command, we run the execution algorithm

WriteRequest

Payload

command ($\gamma$)
client_id
msg_id

State modifications

Local variables:
- Interf - set of instances whose commands interfere with the current command
- seq - update (??)
- deps - update (??)
- L - myself, the command leader replica for this command
instance_num ($i_L$) (?) - increment
cmds[L][$i_L$] - ($\gamma$, seq, deps, PreAccepted, 0)

What is the output

PreAccept payload: ($\gamma$, seq, deps, $Inst(L,i_L)$)

Send to

Atleast fast quorum minus self ($F \setminus L$)
For simplicity, broadcast to all servers

Invariants

??

PreAccept

Payload

command ($\gamma$)
sequence number seq
dependency list deps
Instance $(L, i_L)$

State modifications

Local variables:
- seq - update (??)
- deps - update (??)

cmds\[L\]\[$i_L$\] - ($\gamma$, seq, deps, PreAccepted, 0)

Output

PreAcceptOk payload: ($\gamma$, seq, deps, $Inst(L,i_L)$)
- here, can avoid sending $\gamma$. Leader already has it

Send to

Command leader L, where PreAccept came from

Invariants

??

PreAcceptOk

Payload

command ($\gamma$)
sequence number seq
dependency list deps
Instance $(L, i_L)$
command leader L
instance number at leader i_L

Phase 1

Local variables:
- ctr - number of PreAcceptOk msgs recieved
- L - myself, the command leader
- seq - check eq with cmds[L][$i_L$].seq
- deps - check eq with cmds[L][$i_L$].deps
Check if the msg hasn't already been accepted or committed else ignore
If eq:
- cmds[L][$i_L$] - ($\gamma$, seq, deps, PreAccepted, ctr+1)
If not eq (else): (conflicts seen)
- Update seq - max()
- Update deps - union()
- cmds[L][$i_L$] - ($\gamma$, seq, deps, Accepted, ctr+1)
If ctr == $\lfloor N / 2 \rfloor$
- if conflicts seen (cmds shows Accepted):
  - go to Phase 2
- if no conflicts seen so far (cmds shows PreAccepted):
  - wait for getting fast quorum
  - check if fast quorum == majority, if no, return fn and wait for more msgs. If yes, continue in code logic, don't return yet.
If ctr == fast quorum $F\setminus L$
- if no conflicts seen so far (cmds shows PreAccepted):
  - go to Commit phase
- if conflicts seen (cmds shows Accepted):
  - Not possible, quorum intersection invariant violated, panic

Commit phase

Fast path quorum $F\setminus L$ reached cmds[L][$i_L$] - ($\gamma$, seq, deps, Committed, ctr+1)
Output:
- Commit payload: ($\gamma$, seq, deps, $Inst(L,i_L)$)
  - Where to send:
    - Broadcast to every replica
- WriteResponse payload:

Phase 2 - Paxos-Accept

Output:
- Accept payload: ($\gamma$, seq, deps, $Inst(L,i_L)$)
Where to send: Broadcast to every replica

Invariants

Any messages we get after we've moved on to a phase (commit or accept) should not have any new conflicts.
- Proof: Through quorum intersection, since every previous committed conflicting command involves atleast a quorum, all those quorums for each command should intersect with the current quorum. So through a quorum we must've seen all possible conflicting commands
Leader should receive PreAcceptOk only after it processed and sent PreAccept for the same command to others
- I am the command leader for this instance, can't get the PreAcceptOk of some other command
In fast path: Leader cannot see a new PreAcceptOk for this command with conflicts after already seeing majority and proceeding to next phase.
In slow path: Leader cannot see a new PreAcceptOk for this command with newer conflicts after already seeing majority and proceeding to next phase. We can move forward to phase 2 after majority is formed.
- Logic: Thats the guarantee of majority quorum intersection
- Need to walk through a scenario to confirm this

Assumptions

The leader proceeds to the next phase (Accept or Commit) after receiving PreAcceptOk from majority

Accept

Payload

command ($\gamma$)
sequence number seq
dependency list deps
Instance $(L, i_L)$

State modifications

cmds\[L\]\[$i_L$\] - ($\gamma$, seq, deps, Accepted, 0)

Output

AcceptOk payload: ($\gamma$, $Inst(L,i_L)$)
- here, can avoid sending $\gamma$. Leader already has it similar to preAcceptOk

Send to

L, where it came from

AcceptOk

Payload

command ($\gamma$)
Instance $(L, i_L)$

We run commit phase

State modifications

cmds[L][$i_L$] - ($\gamma$, seq, deps, Committed, 0)

Output

If ctr == $\lfloor N / 2 \rfloor$
- Commit payload: ($\gamma$, seq, deps, $Inst(L,i_L)$)

Send to

Broadcast to every replica

Commit

Payload

command ($\gamma$)
sequence number seq
dependency list deps
command leader L
instance number at leader i_L

State modifications:

cmds[L][$i_L$] - ($\gamma$, seq, deps, Committed, 0, L)

Additional code logic

If there's a read request whose leader myself R is waiting for this commit, continue the execution algorithm and if need to wait for another commit, then move on. Otherwise, if execution is finished, give read response to client

Output

possibly read response

Invariants:

??

Paper variables

Q - replica
Q.i - A single instance, whose command leader is replica Q, and it's corresponding instance number i

Paper ambiguities

When waiting for PreAcceptOk for majority,
- Should checking for conflicts be done for fast path quorum? Ans: Yes. if conflict, go slow path Or should we check for all responses received? Ans: No, only for quorum
- Rephrasing: should we move on immmediately after we get fast path quorum or wait for all responses? Ans: Move on
  - What do we do for later responses if we move on? Ans: Ignore. quorum intersection guarantees we won't see any new useful information from new replies
In the algorithm they mention sending gamma even in the "Ok" messages (PreAcceptOk, AcceptOk)
- But leader already has gamma, so is it necessary?
- Maybe for failure recovery?
For quorum sizes, In the algo, usage of "waiting for $\lfloor N / 2 \rfloor$ Ok responses" probably means excluding the self from that count. so in total, $\lfloor N / 2 \rfloor$ + 1 replicas are aware of the same info. so quorum intersection is guaranteed.
- ASSUMPTION: We are actively assuming that the self is excluded in such a count. that a replica doesn't send PreAccept/PreAcceptOk messages to itself.

Paper deviations

We store quorum counter and command leader in cmds state.
When receiving PreAcceptOk,
- If we see any conflict, before quorum itself, we update state to Accepted prematurely before going to Phase 2 section in paper algorithm. Because presence of conflicts ensures we will go to phase 2 anyway.

Possible optimizations

in all the *Ok messages, no need to send $\gamma$.

To discuss

what are the commands we support.
walk through and define how conflicts work. I expect instance numbers to conflict between two command leaders
In what situations do we have deps conflict instance where cmds shows blank for that instance? I know this conflicts with me but I don't have it's value yet to execute. It's possible anyways, the paper talks about it.
- When some of the commit message broadcasts fail. So there may be a set of replicas who do not have a particular committed command. Now suppose one such replica received a request. As the command leader doing preaccept, in one of the preacceptok responses I'll see that there is a conflict (and hence do slow path). But the preacceptok response will only have the conflicting instance, not the command stored in it. And I can't get it from my local store of cmds either because as established earlier, I missed that commit message of the earlier command broadcast, so I do not have it stored with me.
  - If commit broadcast guarantees that everyone receives it, we will never have missing values: true or false?
    - true, because for each committed instance, everyone will receive that value eventually.
    - false if you're considering message delays. Then yes in that time frame before commit msg reaches everyone, we can have missing values

About conflicts and executions

Need to define clearly what is a conflict.
- Is it when different commands on same instance num for diff replica? No
- Is it two commands interacting with the same state variable? regardless of which instance,replica? Yes
- When looking for previous conflicting instances, do we consider only committed instances only or incomplete(pre-accept/accept) requests as well?
  - In execution algorithm it mentions "waiting for R.i to be committed". This implies that dependencies are formed on instances before they are even committed.
When are conflicts considered to be resolved? After resolved (in the execute phase perhaps), are they updated in the state to no longer cause conflicts for newer commands?
- Theoretically for a limited set of keys, after enough requests, every single new request afterwards will have atleast one conflicting older instance and hence be forced to go to slow path. This is clearly not ideal. So must understand how conflicts are "cleared".
  - CLARIFICATION: There's a misunderstanding that existence of conflicts/dependencies are what forces slow path. Which is wrong. Finding previous conflicts/dependencies within command leader storage is fine actually. Only when we find new conflicts that leader hasn't seen before, do we go for slow path.
Instance numbers give a clear ordering. Does this mean conflicts are only relevant if they have same instance number? (diff replicas)

More invariants to test/consider

For any set of interfering instances, are their sequence numbers unique and continuously increasing?

Realizations and clarifications

Instance number is local to each replica. By itself it has no global cross-server meaning. So you can't use it as some kind of ordering number
- For example, every replica starts with inst num 0. R1 gets command, its set as inst num 1, sends to everyone, fast path, committed, all good. Now if R2 gets a new command, its inst num is still 1. Just that commands are assigned via (Replica, inst_num) pairs, not with inst num alone. So it's totally normal and expected in cmds table to have different commands in the same inst num column.
- Situation to support why this is good:
  - If you forced instance num to be unique for every command received by the system as a whole, if there are two concurrent writes happening with same inst num (possible with two diff replicas getting req at the same time), then one is bound to fail/rejected, and that write request is considered to fail. So in all scenarios of concurrent requests, we'll get a lot of write failures which is bad.
For $Interf_{L,\gamma}$, definition is: Set of instances $Q.j$ such that commands in cmds_L[Q][j] inferferes with $\gamma$.
- Note that the definition has nothing to do with the current command leader L's instance num $i_L$. So we literally do not care about any positioning in the cmds state. instance number being same has nothing to do with interference, which we already know as stated before.
- How to find interfering commansd
What conflicts mean, case by case:
- For concurrent writes, if the commands arent semantically linked (totally different state variables) or just literally commutative, we don't have to care. Both are non-conflicting writes and we can commit both in fast path without issues. Ordering we don't have to care.

Name		Name	Last commit message	Last commit date
Latest commit History 33 Commits
.github/workflows		.github/workflows
lcheck		lcheck
src		src
.gitignore		.gitignore
Cargo.lock		Cargo.lock
Cargo.toml		Cargo.toml
Makefile		Makefile
README.md		README.md
analyze_results.py		analyze_results.py
epaxos.toml		epaxos.toml
requirements.txt		requirements.txt

Brian-1402/EPaxos_reactor

Folders and files

Latest commit

History

Repository files navigation

Setup

Run commands

Graphs generation

CI/CD check commands

Auto-fix commands for the above checks

System design

Message types

Server state variables

cmds

instance_num

replica_list

quorum_ctr

app_meta

Intermediate data types

Actor initialization variables

Message behaviors

Base server receiving messages

ReadRequest

WriteRequest

Payload

State modifications

What is the output

Send to

Invariants

PreAccept

Payload

State modifications

Output

Send to

Invariants

PreAcceptOk

Payload

Phase 1

Commit phase

Phase 2 - Paxos-Accept

Invariants

Assumptions

Accept

Payload

State modifications

Output

Send to

AcceptOk

Payload

State modifications

Output

Send to

Commit

Payload

State modifications:

Additional code logic

Output

Invariants:

Paper variables

Paper ambiguities

Paper deviations

Possible optimizations

To discuss

About conflicts and executions

More invariants to test/consider

Realizations and clarifications

About

Resources

Uh oh!

Stars

Watchers

Forks

Contributors 4

Uh oh!

Languages

`cmds`

`instance_num`

`replica_list`

`quorum_ctr`

`app_meta`