Skip to content

Pipeline

Alexey Andreev edited this page Feb 2, 2014 · 11 revisions

Pipeline

Parser

Parser gets .class files and produces intermediate representation, so called model. TeaVM uses SSA form to represent code. It uses standard algorithm to build SSA, based on dominators and dominator frontiers. The algorithm description is available in the paper by R. Cytron et al. The TeaVM implementation does not try to reduce the number of phi functions, relying on further stages of pipeline.

Dependency checker

TeaVM has a very smart dependency checker to cut off a large amount of unnecessary methods. Instead of checking class-to-class dependency, TeaVM tries to figure dependencies on method level. It is a little tricky thing to deal with virtual methods, as for each call site you have to know which implementations are available. So TeaVM uses idea, described in the paper by Vijay Sundaresan et al. For each achievable method we build a data flow graph. For every NEW, LDC, ANEWARRAY we put class name to the instruction value receiver (i.e. variable which gets result of instruction execution). Then we propagate all the class names along DFG's edges. When we meet INVOKEVIRTUAL or INVOKEINTERFACE instructions, we search for the corresponding method implementation and build its DFG. We connect DFGs of methods according to variables passed by caller. So we end up with a global data flow graph where each node represents a variable in some method and for each variable we know a set of classes which achieve this variable. Of course, we always deal with false positive.

Clone this wiki locally