Bounded-scope generators #3014
konnov
started this conversation in
Ideas + Research
Replies: 0 comments
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Uh oh!
There was an error while loading. Please reload this page.
Uh oh!
There was an error while loading. Please reload this page.
-
Apalache currently supports value generators to restrict the scope of very large data structures.
Why we need value generators
When we want to find bugs in a protocol with a very large state space and plenty of behavior, e.g., Byzantine consensus, we would like to start model checking in a "reasonable" intermediate state (more on that later). Otherwise, both TLC and Apalache get stuck after a exploring the state space/execution space up to 10-20 steps.
The standard answer to this problem is to introduce an inductive invariant, prove its inductiveness and use it to start exploration in an arbitrary state that is described by the inductive invariant. Actually, with an inductive invariant, we need just need to explore one step. For instance, this is how I did it for Ben-Or's consensus recently, see IndInit (for the fixed constants):
Basically, there are two components in
IndInit:TypeOKrestricts the scope of the state variables to the reasonable values.IndInvcaptures the essential relations between the variables, specific to the protocol logic. This predicate is usually quite hard to come up with.This approach technically works both in TLC and Apalache. However, it's easy to see that
TypeOKcontains potentially very large sets on the right-hand side of the membership relation, e.g.:When we increase the cardinality of the sets
ALL,ROUND, andVALUES, the setSUBSET [ src: ALL, r: ROUNDS, v: VALUES ]grows exponentially faster than its base sets. Sure, it is the worst-case scenario. However, in many algorithms, processes need a very small subset of the messages to make the next step. Intuitively, we do not need an arbitrary subset of all possible messages, but we need an arbitrary subset of all possible messages up to a reasonable bound. Moreover, it's much easier to debug an inductive invariant on small subsets first and then check it on larger sets.The community modules have the operator kSubset to work around this problem. TLC has a custom implementation of
kSubset. In the above example, we could replaceSUBSETwithkSubsetas follows:What is$k$ ? Well, it can be a small constant that helps us to obtain counterexamples fast. If the model checker does not give us a counterexample, we should increase $k$ , or convince ourselves that the current value of $k$ is sufficient to check our properties. Sometimes, after debugging the inductive invariant, we can even replace
kSubsetwithSUBSETagain, when the invariant is tight enough for the model checker to work.While$O(n^2)$ for a set $S$ that potentially has $n$ elements. TLC does not have any issues with that, as it enumerates states, and it can precisely give us the set cardinality in every state.
kSubsetworks in TLC, I do not know how to efficiently implement it in Apalache. Technically, we could enumerate all potential k-subsets of the base set and use them as the elements ofkSubset. However, the definition ofkSubsetuses the set cardinality, and Apalache cannot statically compute the cardinality of an arbitrary set. The current encoding ofCardinality(S)in Apalache producesTo have something similar to$k$ . For example, if the target type of $k$ elements, and all of its elements have the width up to $k$ . For example, if the set elements are sequences, their length must be bounded with $k$ .
kSubsetin Apalache, we have introduced the operatorApalache!Gen(k). This operator works in a very straightforward way. It produces completely unrestricted data structures of width up toGenisSet(T), thenGen(k)produces a symbolic set that has up toHaving
Apalache!Gen, we can rewrite the above k-subset predicate as follows:Note that this version is not semantically equivalent to the one with$k$ . If we want a completely equivalent version, which we usually don't, we can add the constraint
kSubset, asGencan produce set of cardinalities from 0 toCardinality(A1) = k.Why
Genis not the best solutionApalache!Genwas introduced as a quick hack to experiment with the idea. While it technically works, there are several issues with it:Genis syntactically confusing. Semantically,Genacts as the existential quantifier\E A1 \in S: P, but syntactically, it looks very much like a deterministic computation.Genrelies on type inference to come up with the right shape of the data structure.Genuses the same boundAn alternative solution
Alternatively, we could use a special pattern of
\E x: Pto restrict the scope. We could rewrite the above example as:This version looks much more understandable. We just produce a completely unrestricted constant
A1and further restrict it with a predicate.Implementation challenges
There are a few things that are not yet clear to me:
A1in the above example? We would probably need a type annotation.\E A1? We would need some form of static analysis to go over the predicate and compute the bounds. This problem is somewhat similar to computing the bounds ini..j[FEATURE] Interval analysis to improve a..b #446.Beta Was this translation helpful? Give feedback.
All reactions