Skip to content

Questions about WALA Dataflow Solver

vazexqi edited this page Feb 20, 2013 · 12 revisions

Does it assume that the transfer functions are distributive?

A distributive function f has this property f(A U B) == f(A) U f(B).

It seems that way because the ITransferFunctionProvider uses only UnaryOperator. Does that mean that it can only handle distributive functions?

No. Though the name seems a bit strange, it can definitely handle non-distributive functions. I was confused by the name because in its superclass, i.e., AbstractOperator there is a function called public abstract byte evaluate(T lhs, T[] rhs); -- it handles T[] rhs. I had (wrongly) assumed that the presence of that function meant that UnaryOperator means that the framework can only handle distributive functions. In fact, what they need to do is just apply the meet operator (A U B) before passing the value to the UnaryOperator.

With regards to the meet operator, what is the function isUnaryNoOp() for? It is used in DataflowSolver in the buildEquations(...) method to do some optimization. Under what circumstance would a meet operator be a no-op (meaning that it doesn't even copy the values over)?

Here's the relevant code in buildEquations(...)

    // add meet operations
    int meetThreshold = (meet.isUnaryNoOp() ? 2 : 1);
    for (Iterator<? extends T> it = G.iterator(); it.hasNext();) {
      T node = it.next();
      int nPred = G.getPredNodeCount(node);
      if (nPred >= meetThreshold) {
        // todo: optimize further using unary operators when possible?
        V[] rhs = makeStmtRHS(nPred);
        int i = 0;
        for (Iterator<?> it2 = G.getPredNodes(node); it2.hasNext();) {
          rhs[i++] = (functions.hasEdgeTransferFunctions()) ? getEdge(it2.next(), node) : getOut(it2.next());
        }
        newStatement(getIn(node), meet, rhs, toWorkList, eager);
      }
    }

After reading the code the isUnaryNoOp() function is just used to determine when we should insert meet operators. The default case should be isUnaryNoOp() is true. After all, if we are not "meeting" two or more items, there is not need to insert a meet operator.

The only net outcome of this (that is different from just inserting meet operators everywhere when isUnaryNoOp() is false) is that the last statement in your dataflow graph might not have any data associated with the out.

With regards to the dataflow solver, is there a way to then to optimize for precision when we do have a distributive function? It has been established (consult any dataflow analysis book) that if we a distributive function, then the Meet-Fixed-Point (MFP) solution is the same as the Meet-Over-Paths (MOP) solution. Otherwise, MFP <= MOP

The meet operator is inserted into the dataflow graph everytime there is more than 2 predecessor nodes (or depending on whether you set isUnaryNoOp() to false; then it adds one for all nodes!). Since meet operators should not have a special transfer functions, you should "grab" the transfer function of the successor node:

  (Sketch of how to accomplish this)

  A  B
  |  |
 _|__|_
| meet | If you don't have a distributive function, you do A U B here and then pass the result to F,         
 ------  i.e., F(A U B). 
   |
 ------ 
|  F   | If you have a distributive function, you should grab F and perform F(A) U F(B). You need a way to
 ------  notify the dataflow solver not to evaluate F though if it immediately after a meet node.
   |
  ...

Why do some AbstractVariables, e.g. BitVectorVariable use a sameValue(...) method instead of just equals(...)?

It seems that they wanted to retain equals so that it can perform reference-based equality (when the need arises). If they had just overriden `equals(...) for content-based equality, they would have lost this functionality.

Clone this wiki locally