Skip to content

Compiler_Implementation

Chris Ryder edited this page Dec 4, 2018 · 3 revisions

GNAT2GOTO Implementation

Overview

GNAT2GOTO is implemented as a backend to the GNAT Ada Compiler. Most of the code lives in the driver directory, where the entry point for gnat2goto is the back_end.adb - this file, together driver.adb contain the setup code needed for the GNAT parts of GNAT2GOTO.

The data definitions for the Ada representation GOTO programs can be found in the ireps directory, where they are being autogenerated from specifications in ireps/specs with the ireps_generator.py script.

The translation of Ada AST nodes (supplied by the GNAT frontend) to GOTO Ireps happens in driver/tree_walk.adb. In here, the AST is recursively traversed (the entry point being Do_Compilation_Unit) and equivalent GOTO programs are generated. See the CProver documentation for a short overview of the structure of GOTO programs. In short, the output of the Tree_Walk routine is a Symbol_Table containing global variables, functions, types etc representing the same semantics as the original Ada program. The contents of this symbol table are then output as a .json_symtab file which can be read by the cbmc program.

Data structures

Ireps

Ireps are the fundamental data structure making up a GOTO program. They represent a generic tree structure where each node has a string value, as well as optionally positional and named children (which are also ireps).

Because of how Ada memory management works here they are organised in a slightly different way compared their C++ counterpart; Rather than representing the tree structure directly through pointers, each Irep is an index into a table containing a fixed size structure representing the properties of that Irep. Unlike the C++ version, Ireps can not have arbitrary children but rather the fields of the (private) Irep_Node structure take on different meaning depending on what kind of irep it is - for example, an Irep representing a plus operation might be represented by an Irep node of kind plus, where 3 of the fields are indices into the Irep table, one for the type of the expression and 1 for each child (we represent plus operations as binary operations, unlike in C++ where they can be n-ary). String fields can be represented as indices into a string table. In actual usage we can use setters, getters and constructors which are generated from the irep spec, so for example for plus we could use the Make_Op_Plus function.

Link to GNAT documentation (otherwise refer to source documentation in obj)

Clone this wiki locally