You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Copy file name to clipboardExpand all lines: sites/wonderland/docs/testing/advanced-testing/formal-verification.md
+19-19Lines changed: 19 additions & 19 deletions
Display the source diff
Display the rich diff
Original file line number
Diff line number
Diff line change
@@ -1,6 +1,6 @@
1
1
# Formal verification
2
2
3
-
While all other tests aim to find bugs using specific values (such as 123, address(69), etc.), even when guided by coverage like Medusa, they can only demonstrate the presence of a bug, not its absence. Formal verification, on the other hand, allows the opposite. It provides mathematical certainty by proving the correctness of the system being tested - essentially confirming the absence of bugs, *given the properties being assessed*.
3
+
While all other tests aim to find bugs using specific values (such as 123, address(69), etc.), even when guided by coverage like Forge or Medusa, they can only demonstrate the presence of a bug, not its absence. Formal verification, on the other hand, allows the opposite. It provides mathematical certainty by proving the correctness of the system being tested - essentially confirming the absence of bugs, *given the properties being assessed*.
4
4
5
5
## Symbolic Execution
6
6
@@ -18,7 +18,7 @@ function abs(int256 x) public returns(int256) {
18
18
```
19
19
20
20
We can rewrite it as the following logic formula:
21
-
`x < 0 implies result=-x (1) AND x >= 0 implies result=x (2)`
21
+
`x < 0 implies result=-x (1) AND x >= 0 implies result=x (2)`
22
22
23
23
And now add the following property we’d like to prove, which is the intended behaviour of an abs function, ie return positive values only: `assert(result >= 0); (3)`
24
24
@@ -47,18 +47,18 @@ A simplified way the SAT solver would solve this is (DLLP(T), as z3 would use fo
47
47
- if there is no other unit clause, chose another literal: try setting b1 to false -> it doesn't change others clauses
48
48
- try b3 true -> b4 needs to be true then (!b3 || b4 is therefore true) -> this is solved with (b1 F, b2 T/F, b3 T, b4 T)
49
49
50
-
After having solved this boolean formula, the solution is “fed” back to the theory solver, which basically starts from the solved statement, and check if "it makes sense". In the example above this would mean converting back to arithmetic and look for impossible results.
50
+
After having solved this boolean formula, the solution is “fed” back to the theory solver, which basically starts from the solved statement, and check if "it makes sense". In the example above this would mean converting back to arithmetic and look for impossible results.
51
51
52
52
> b1 F means x >= 0
53
53
b2 can be whatever, result is -x or is not
54
54
b3 T means x >= 0
55
55
b4 T means result == x
56
56
b5 T means result >= 0
57
-
>
57
+
>
58
58
59
59
this system can be simplified as `(x >= 0) and (result = x) and (result >= 0)`, which gather no contradiction/the system is satisfiable and we’ve proved it (we can produce examples, like x = 1 and result = 1 for instance), the solver will return SAT.
60
60
61
-
One of the most used SMT solver/prover is [Z3](https://github.com/Z3Prover/z3/wiki#background), built by Microsoft. It is the default solver for solc, Halmos or some parts of Kontrol.
61
+
One of the most used SMT solver/prover is [Z3](https://github.com/Z3Prover/z3/wiki#background), built by Microsoft. It is the default solver for solc, Halmos or some parts of Kontrol.
62
62
63
63
## Concolic execution
64
64
@@ -77,13 +77,13 @@ In practice, each run starts with a concrete value (initially random, within the
77
77
78
78
```solidity
79
79
function tryMe(uint256 x) public {
80
-
uint256 y = x + 2; // (1)
81
-
uint256 z;
82
-
83
-
if (y > 5) z = 15;
84
-
else z = 1;
80
+
uint256 y = x + 2; // (1)
81
+
uint256 z;
82
+
83
+
if (y > 5) z = 15;
84
+
else z = 1;
85
85
86
-
if(y != 0) emit Heyaaaa();
86
+
if(y != 0) emit Heyaaaa();
87
87
}
88
88
```
89
89
@@ -99,7 +99,7 @@ The next step is where we actually leverage symbolic execution (until now, we on
99
99
100
100
In our example, if we are on the depth-first side of the force, we’ll negate the latest predicate, and then “going up” (ie first negate `(αx + 2 != 0)`, explore, and only after we don’t have new paths condition, negate `(αx + 2 > 5)`).
101
101
102
-
This comes with one of the key optimization of concolic execution: as all predicates *preceding* the one we negate are still valid in that path (eg our path condition is `P1 and P2 and P3 and P4` , if we mutate P3, we cannot keep P4 - we don’t know anything about that predicate anymore - *but* P1 and P2 are still part of that path condition and are satisfiable), our solver can reuse these (in 2 ways: either reusing this prefix as is, the space to explore being reduced; either via ”incremental solving” where the solver keep track of previous assertions).
102
+
This comes with one of the key optimization of concolic execution: as all predicates *preceding* the one we negate are still valid in that path (eg our path condition is `P1 and P2 and P3 and P4` , if we mutate P3, we cannot keep P4 - we don’t know anything about that predicate anymore - *but* P1 and P2 are still part of that path condition and are satisfiable), our solver can reuse these (in 2 ways: either reusing this prefix as is, the space to explore being reduced; either via ”incremental solving” where the solver keep track of previous assertions).
103
103
104
104
In other words, for any given simple path condition with n predicates (simple as in “straight line”, without loop or subbranches), the number of time we call the solver is in O(n) while the global “cost of solving” (ie how much we need to solve new predicates/we cannot reuse previous ones) in O(n^2) (ie 1 + 2 + …+ n = n(n+1)/2 which is n^2 complexity). In comparison, a symbolic execution based solving would solve every branches, from scratch (meaning O(2^n) calls to the solver, each having the same n predicates to solve (we don’t reuse), or a O(n x 2^n) global cost).
105
105
@@ -117,7 +117,7 @@ K is a framework for formally defining a programming languages or hardware. Kont
117
117
118
118
To do so, Kontrol rewrite what’s called the *configuration* (the state - ie memory, storage, etc - organised in nested cells) using the language semantic defined in K. It then rewrite the reachability claims themselves, to check if any configuration satisfying the pre-conditions can be rewritten into one satisfying the post-condition ("under these assumptions, all executions must eventually satisfy this condition").
119
119
120
-
As Kontrol isn't building path constraints to solve them (as a symbolic execution engine would), but rather focus on working directly on the semantic itself, it allows handling things which are usually hard for solvers (unbounded loops for instance). For specific, non-trivial, properties, Kontrol supports *auxiliay lemmas*, allowing providing intermediate results, to "guide" the proof.
120
+
As Kontrol isn't building path constraints to solve them (as a symbolic execution engine would), but rather focus on working directly on the semantic itself, it allows handling things which are usually hard for solvers (unbounded loops for instance). For specific, non-trivial, properties, Kontrol supports *auxiliay lemmas*, allowing providing intermediate results, to "guide" the proof.
121
121
122
122
## Tools
123
123
@@ -129,9 +129,9 @@ ItyFuzz is currently the most used concolic execution engines (working with a se
129
129
130
130
## Kontrol
131
131
132
-
Kontrol relies on an implementation of the EVM bytecode semantic in K (the KEVM).
132
+
Kontrol relies on an implementation of the EVM bytecode semantic in K (the KEVM).
133
133
134
-
See the Kontrol docs for a basic test example https://docs.runtimeverification.com/kontrol/guides/kontrol-example/property-verification-using-kontrol
134
+
See the Kontrol docs for a basic test example [https://docs.runtimeverification.com/kontrol/guides/kontrol-example/property-verification-using-kontrol](https://docs.runtimeverification.com/kontrol/guides/kontrol-example/property-verification-using-kontrol)
135
135
136
136
### Notes
137
137
@@ -148,11 +148,11 @@ This drastically increase its’ speed for proving properties, and makes it even
- More in-depth presentation by Juan Conejero at EthCC this year: [https://ethcc.io/archive/Getting-started-with-Kontrol-a-formal-verification-tool](https://ethcc.io/archive/Getting-started-with-Kontrol-a-formal-verification-tool)
153
-
- Here is an earlier presentation at EthCluj, of both Simbolik and Kontrol: https://ethcc.io/archive/Getting-started-with-Kontrol-a-formal-verification-tool
153
+
- Here is an earlier presentation at EthCluj, of both Simbolik and Kontrol: [https://ethcc.io/archive/Getting-started-with-Kontrol-a-formal-verification-tool](https://ethcc.io/archive/Getting-started-with-Kontrol-a-formal-verification-tool)
154
154
-[A more advanced example](https://runtimeverification.com/blog/using-kontrol-to-tackle-complexities-caused-by-dynamically-sized-constructs) of how to use loop invariants in Kontrol for handling unbounded inputs, something unique to Kontrol's capabilities.
155
155
-[A Wonderland podcast](https://drive.google.com/file/d/1KusMkjKsDRe0FV3TFkNr_6jOWxJ0QF6Z/view?usp=sharing) with Palina and Andrei from RuntimeVerification presenting Kontrol - see the kcfg part
156
-
-https://www.youtube.com/watch?v=9PLnQStkiUo for a more theoretical talk about K itself, and how Kontrol is built on top of it
156
+
-[https://www.youtube.com/watch?v=9PLnQStkiUo](https://www.youtube.com/watch?v=9PLnQStkiUo) for a more theoretical talk about K itself, and how Kontrol is built on top of it
157
157
-[OP CI Kontrol tests](https://github.com/ethereum-optimism/optimism/tree/develop/packages/contracts-bedrock/test/kontrol)
158
-
-[WETH9 testing](https://github.com/horsefacts/weth-invariant-testing/blob/main/test/WETH9.symbolic.t.sol) with Halmos
158
+
-[WETH9 testing](https://github.com/horsefacts/weth-invariant-testing/blob/main/test/WETH9.symbolic.t.sol) with Halmos
We take an opiniated approach of having targeted fuzzing campaign and formal verification. To maximize the efficiency (ie benefit/cost), we put a bigger emphasis on *what* we test than on *how* we do it - finding and formalizing the most important invariants being a key objective to do so.
3
+
We take an opiniated approach of having targeted fuzzing campaign and formal verification. To maximize the efficiency (ie benefit/cost), we put a bigger emphasis on *what* we test rathen than on *how* we do it - finding and formalizing the most important invariants being a key objective to do so.
4
4
5
5
We define *good* invariant as:
6
+
6
7
- Informative: cover a behaviour not or only partially covered by existing tests (unit or integration) - they bring new information (for instance, an invariant testing if `require(sender == owner)` reverts when the sender is not the owner brings no new information compared to an unit test).
7
-
- Realistic: the pre-condition must be something which is expected or at least achievable without breaking, for instance, EVM invariants with cheatcodes.
8
+
- Realistic: the pre-condition must be something which is expected or at least achievable without breaking, for instance, EVM invariants using cheatcodes.
8
9
9
-
To find such, we start by establishing a protocol accounting (assets, liabilities and operations) which we then complete with others, non-accounting ones.
10
+
To find such, we suggest starting by establishing a protocol accounting (assets, liabilities and operations), list the related invariants, then complete with any other (non-accounting) ones.
10
11
11
12
## Establish protocol accounting
12
13
13
14
We’ll use the following example throughout this doc:
15
+
14
16
- WonderPump is a novel ICO protocol
15
17
- It has a single contract, where Grifters can provide a type of tokens and users can send ETH to get it, following a pricing function.
16
18
- The protocol is getting a fee on each sale, in ETH, which can be redeemed via a dedicated function.
17
19
18
-
The accounting invariants should cover every actor’s balance and how it evolves through time (ie WonderPump token balance should be the sum of deposit minus token sold). A small, simplified, balance sheet helps doing this, as well as a summary of the accounting operations (this will already cover most of the ghost variables used):
20
+
The accounting invariants should cover every actor’s balance and how it evolves through time (ie WonderPump token balance should be the sum of deposit minus token sold). A small, simplified, balance sheet helps doing this, as well as a summary of the accounting operations (this will already tell us most of the ghost variables we need):
19
21
20
-
- On the left side, list all the protocol liabilities (the token which are still in balance but “owned” to someone → a protocol fee) and “equity” (basically, how were the asset funded) → the main characteristic is these token are in balance but are not “accessible” anymore
22
+
- On the left side, list all the protocol liabilities (the token which are still in balance but “owned” to someone → a protocol fee) and “equity” (basically, how were the asset funded) → the main characteristic is these token are in protocol's balance but are not “accessible” anymore
21
23
- On the right side, list all the protocol assets (the token the protocol can “use” to do something) → these are the “tools” of the protocol (eg the token deposited by Grifter)
22
-
- As in traditional accounting, at each moment in time, both side should be equals (ideally - as there are sometimes some shortcut taken, this is a weak condition) or at least assets should be greater than liabilities (strong condition)
24
+
- As in traditional accounting, at each moment in time, both side should be equals (ideally - as there are sometimes some shortcut taken, this is a weak condition) or at least assets should be greater than liabilities (strong condition, as it is the protocol's solvability)
23
25
- The “accounting operation” are then describing the relation between all the balances (eg “a token deposit should increase the contract balance and the amount available to sell”)
24
26
- Here is WonderSwap accounting - if you have an accounting background, you’d notice it is only schematic:
25
27
@@ -33,23 +35,22 @@ And the related operations:
33
35
34
36
- A deposit from grifter should increase token deposit and token to sell (ACC-2)
35
37
- A sale should: (ACC-3)
36
-
- Lower token deposited and token to sell
37
-
- Increase eth for grifter, for the protocol fee and the eth in balance
38
+
- Lower token deposited and token to sell
39
+
- Increase eth for grifter, for the protocol fee and the eth in balance
38
40
- A fee withdrawn should lower eth protocol fee and eth in balance (ACC-4)
39
41
- Grifter withdraw should lower eth for grifter and eth in balance (ACC-5)
40
42
41
43
From there, the invariant are making sure these operations are properly influencing the balance and that, at all point in time, assets == liabilities
42
44
43
45
| ACC-1 | Assets == liabilities (token deposit == token to sell, and eth for grifter + eth protocol fee == eth in balance) |
44
-
| --- | --- |
45
-
| ACC-2 | see above |
46
+
| ACC-2 | A deposit from grifter should increase token deposit and token to sell |
46
47
|| (…) |
47
48
48
-
One can easily check how most of the protocol behaviour and general “risky” invariants are already covered (eg no free mint, no fund stuck, etc)
49
+
One can easily check how most of the protocol behaviour and critical invariants are already covered (eg no free mint, no fund stuck, no dust, etc)
49
50
50
51
## List non-duplicates
51
52
52
-
After this initial phase, the rest of the invariants will try to uncover any (un)expected behaviour, while making sure to not test something already implemented in the unit tests (eg “only Grifter can withdraw” has no interest if this is a onlyOwner modifier or similar for instance, as fuzzing will not cover any new logic). The emphasis should therefore be on “broader picture”, in other terms, stateful invariants, which requires a succession of transactions, each influenced by the state left by the previous one.
53
+
After this initial phase, the rest of the invariants will try to uncover any (un)expected behaviour, while making sure to not test something already implemented in the unit tests (eg “only Grifter can withdraw” has no interest if this is a onlyOwner modifier or similar for instance, as fuzzing will not cover any new possible logic). The emphasis should therefore be on “broader picture”, in other terms, stateful invariants, which requires a succession of transactions, each influenced by the state left by the previous one.
53
54
54
55
One noticeable exception are invariants around arithmetic (see infra).
55
56
@@ -58,14 +59,14 @@ One noticeable exception are invariants around arithmetic (see infra).
58
59
- Non-reversion invariants: after writing handler and properties around a call, it is often useful to add post-condition on the revert case (ie make sure every revert is explained)
59
60
- Protocol shutdown: With the accounting invariants in place, one powerful way to test them is to have a shutdown mechanism, which will unwind the whole protocol (pausing, emptying balances, etc), making sure no funds are stuck and all parties are made whole.
60
61
61
-
### Arithmetic:
62
+
### Arithmetic
62
63
63
-
Having tightly coupled invariant is, quite obviously, useless. One way to test complex arithmetic is to challenge its mathematical properties instead. These invariants should be reserved when the implementation is not straightforward (assembly or huge complexity).
64
+
Having tightly coupled invariant is, quite obviously, useless (ie pasting the implementation in a test). One way to test complex arithmetic is to challenge its mathematical properties instead. These invariants should be reserved when the implementation is not straightforward (assembly or huge complexity).
64
65
65
66
example: x*y
66
67
67
68
- this should be commutative: x*y == y*x
68
69
- associative: (x*y*)*z == x*(y*z)
69
70
- 1 is the neutral element: if y = 1, x == k
70
71
- 0 is the absorbing element: if y = 0, x == k == 0
0 commit comments