Skip to content

Commit 36412d2

Browse files
committed
Add enumerative loop invariant synthesizer
Implement the functionality described below. Motivation --- This loop invariant synthesizer use the idea counter-example-guided synthesis (CEGIS) to synthesize loop invariants for programs with only checks instrumented by `goto-instrument` with flag `--pointer-check`. This PR contain the driver of the synthesizer and the verifier we use to check invariant candidates. Verifier --- The verifier take as input a goto program with pointer checks and a map from loop id to loop invariant candidates. It first annotate and apply the loop invariant candidates into the goto model; and then simulate the CBMC api to verify the instrumented goto program. If there are some violations---loop invariants are not inductive, or some pointer checks fail---, it record valuation from trace generated by the back end to construct a formatted counterexample `cext`. Counterexample --- A counterexample `cext` record valuations of variables in the trace, and other information about the violation. The valuation we record including 1. set of live variables upon the entry of the loop. 2. the havoced value of all primitive-typed variables; 3. the havoced offset and the object size of all pointer-typed variables; 4. history values of 2 and 3. The valuations will be used as true positive (history values) and true negative (havcoed valuation) to filter out bad invariant clause with the idea of the Daikon invariant detector in a following PR. However, in this PR we only construct the valuation but not actually use them. Synthesizer --- Loop invariants we synthesize are of the form `` (in_clause || !guard) && (!guard -> pos_clause)`` where `in_clause` and `out_clause` are predicates we store in two different map, and `guard` is the loop guard. The idea is that we separately synthesize the condition `in_clause` that should hold before the execution of the loop body, and condition `pos_clause` that should hold as post-condition of the loop. When the violation happen in the loop, we update `in_clause`. When the violation happen after the loop, we update `pos_clause`. When the invariant candidate it not inductive, we enumerate strengthening clause to make it inductive. To be more efficient, we choose different synthesis strategy for different type of violation * For out-of-boundary violation, we choose to use the violated predicate as the new clause, which is the WLP of the violation if the violation is only dependent on the havocing instruction. **TODO**: to make it more complete, we need to implement a WLP algorithm * For null-pointer violation, we choose `__CPROVER_same_object(ptr, __CPROVER_loop_entry(ptr))` as the new clause. That is, the havoced pointer should points to the same object at the start of every iteration of the loop. It is a heuristic choice. This can be extended with the idea of alias analysis if needed. * For invariant-not-preserved violation, we enumerate strengthening clauses and check that if the invariant will be inductive after strengthening (disjunction with the new clause). The synthesizer works as follow 1. initialize the two invariant maps, 2. verify the current candidates built from the two maps, _a. return the candidate if there is no violation _b. synthesize a new clause to resolve the **first** violation and add it to the correct map, 3. repeat 2. The flag `synthesize-loop-invariants` will also apply synthesized loop contracts.
1 parent 3e490c9 commit 36412d2

28 files changed

+1412
-34
lines changed

regression/contracts/loop_invariant_synthesis_01/test.desc renamed to regression/contracts/loop_contract_synthesis_01/test.desc

+1-1
Original file line numberDiff line numberDiff line change
@@ -1,6 +1,6 @@
11
CORE
22
main.c
3-
--synthesize-loop-invariants --apply-loop-contracts
3+
--synthesize-loop-invariants
44
^EXIT=0$
55
^SIGNAL=0$
66
^\[main\.\d+\] line 10 Check loop invariant before entry: SUCCESS$
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,17 @@
1+
#define SIZE 80
2+
3+
void main()
4+
{
5+
unsigned len;
6+
__CPROVER_assume(len <= SIZE);
7+
__CPROVER_assume(len >= 8);
8+
char *array = malloc(len);
9+
unsigned s = 0;
10+
11+
for(unsigned i = 0; i < SIZE; ++i)
12+
{
13+
if(i == len - 1)
14+
break;
15+
s += array[i];
16+
}
17+
}
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,11 @@
1+
CORE
2+
main.c
3+
--pointer-check --synthesize-loop-invariants
4+
^EXIT=0$
5+
^SIGNAL=0$
6+
^\[main.pointer\_dereference.\d+\] .* SUCCESS$
7+
^VERIFICATION SUCCESSFUL$
8+
--
9+
--
10+
This test shows that loop invariant with form of range predicates can be correctly
11+
synthesized for programs with only pointer checks but no other assertions.
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,16 @@
1+
#define SIZE 80
2+
3+
void main()
4+
{
5+
unsigned long len;
6+
__CPROVER_assume(len <= SIZE);
7+
__CPROVER_assume(len >= 8);
8+
char *array = malloc(len);
9+
const char *end = array + len;
10+
unsigned s = 0;
11+
12+
while(array != end)
13+
{
14+
s += *array++;
15+
}
16+
}
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,12 @@
1+
CORE
2+
main.c
3+
--pointer-check --synthesize-loop-invariants
4+
^EXIT=0$
5+
^SIGNAL=0$
6+
^\[main.pointer\_dereference.\d+\] .* SUCCESS$
7+
^VERIFICATION SUCCESSFUL$
8+
--
9+
--
10+
This test shows that loop invariant with form of range predicates and same
11+
object predicates can be correctly synthesized for programs with only pointer
12+
checks but no other assertions.
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,17 @@
1+
#define SIZE 80
2+
3+
void main()
4+
{
5+
unsigned long len;
6+
__CPROVER_assume(len <= SIZE);
7+
__CPROVER_assume(len >= 8);
8+
char *array = malloc(len);
9+
unsigned long s = 0;
10+
11+
unsigned long j = 0;
12+
for(unsigned long i = 0; i < len; i++)
13+
{
14+
s += array[j];
15+
j++;
16+
}
17+
}
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,11 @@
1+
CORE
2+
main.c
3+
--pointer-check --synthesize-loop-invariants
4+
^EXIT=0$
5+
^SIGNAL=0$
6+
^\[main.pointer\_dereference.\d+\] .* SUCCESS$
7+
^VERIFICATION SUCCESSFUL$
8+
--
9+
--
10+
This test shows that the loop-invariant synthesizer can enumerate
11+
strengthening clauses for invariant-not-preserved violation.

src/Makefile

+2-1
Original file line numberDiff line numberDiff line change
@@ -78,7 +78,8 @@ goto-harness.dir: util.dir goto-programs.dir langapi.dir linking.dir \
7878
goto-instrument.dir
7979

8080
goto-instrument.dir: languages goto-programs.dir pointer-analysis.dir \
81-
goto-symex.dir linking.dir analyses.dir solvers.dir
81+
goto-symex.dir linking.dir analyses.dir solvers.dir \
82+
goto-checker.dir
8283

8384
goto-checker.dir: solvers.dir goto-symex.dir goto-programs.dir
8485

src/analyses/dependence_graph.cpp

+47
Original file line numberDiff line numberDiff line change
@@ -383,6 +383,53 @@ void dependence_grapht::add_dep(
383383
nodes[n_to].in[n_from].add(kind);
384384
}
385385

386+
bool dependence_grapht::is_flow_dependent(
387+
const goto_programt::const_targett &from,
388+
const goto_programt::const_targett &to)
389+
{
390+
std::set<node_indext> visited = std::set<node_indext>();
391+
const dep_graph_domaint from_domain = static_cast<const dep_graph_domaint &>(
392+
*storage->abstract_state_before(from, *domain_factory));
393+
const dep_graph_domaint to_domain = static_cast<const dep_graph_domaint &>(
394+
*storage->abstract_state_before(to, *domain_factory));
395+
return is_flow_dependent(from_domain, to_domain, visited);
396+
}
397+
398+
bool dependence_grapht::is_flow_dependent(
399+
const dep_graph_domaint &from,
400+
const dep_graph_domaint &to,
401+
std::set<node_indext> &visited)
402+
{
403+
// `to` is control dependent on `from`?
404+
for(const auto node : to.get_control_deps())
405+
{
406+
if(visited.count((*this)[node].get_node_id()))
407+
continue;
408+
409+
visited.insert((*this)[node].get_node_id());
410+
411+
if(
412+
from.get_node_id() == (*this)[node].get_node_id() ||
413+
is_flow_dependent(from, (*this)[node], visited))
414+
return true;
415+
}
416+
417+
// `to` is data dependent on `from`?
418+
for(const auto node : to.get_data_deps())
419+
{
420+
if(visited.count((*this)[node].get_node_id()))
421+
continue;
422+
423+
visited.insert((*this)[node].get_node_id());
424+
425+
if(
426+
from.get_node_id() == (*this)[node].get_node_id() ||
427+
is_flow_dependent(from, (*this)[node], visited))
428+
return true;
429+
}
430+
return false;
431+
}
432+
386433
void dep_graph_domaint::populate_dep_graph(
387434
dependence_grapht &dep_graph, goto_programt::const_targett this_loc) const
388435
{

src/analyses/dependence_graph.h

+19-2
Original file line numberDiff line numberDiff line change
@@ -67,6 +67,7 @@ class dep_graph_domaint:public ai_domain_baset
6767
{
6868
public:
6969
typedef grapht<dep_nodet>::node_indext node_indext;
70+
typedef std::set<goto_programt::const_targett> depst;
7071

7172
explicit dep_graph_domaint(node_indext id)
7273
: has_values(false), node_id(id), has_changed(false)
@@ -167,13 +168,20 @@ class dep_graph_domaint:public ai_domain_baset
167168
void populate_dep_graph(
168169
dependence_grapht &, goto_programt::const_targett) const;
169170

171+
depst get_control_deps() const
172+
{
173+
return control_deps;
174+
}
175+
depst get_data_deps() const
176+
{
177+
return data_deps;
178+
}
179+
170180
private:
171181
tvt has_values;
172182
node_indext node_id;
173183
bool has_changed;
174184

175-
typedef std::set<goto_programt::const_targett> depst;
176-
177185
// Set of locations with control instructions on which the instruction at this
178186
// location has a control dependency on
179187
depst control_deps;
@@ -279,7 +287,16 @@ class dependence_grapht:
279287
return rd;
280288
}
281289

290+
/// Decide whether the instruction `to` is flow dependent on `from`.
291+
bool is_flow_dependent(
292+
const goto_programt::const_targett &from,
293+
const goto_programt::const_targett &to);
294+
282295
protected:
296+
bool is_flow_dependent(
297+
const dep_graph_domaint &from,
298+
const dep_graph_domaint &to,
299+
std::set<node_indext> &visited);
283300
friend dep_graph_domain_factoryt;
284301
friend dep_graph_domaint;
285302
const namespacet &ns;

src/goto-instrument/CMakeLists.txt

+1
Original file line numberDiff line numberDiff line change
@@ -13,6 +13,7 @@ target_link_libraries(goto-instrument-lib
1313
linking
1414
big-int
1515
goto-programs
16+
goto-checker
1617
goto-symex
1718
assembler
1819
pointer-analysis

src/goto-instrument/Makefile

+2
Original file line numberDiff line numberDiff line change
@@ -82,6 +82,7 @@ SRC = accelerate/accelerate.cpp \
8282
splice_call.cpp \
8383
stack_depth.cpp \
8484
synthesizer/enumerative_loop_invariant_synthesizer.cpp \
85+
synthesizer/cegis_verifier.cpp \
8586
synthesizer/expr_enumerator.cpp \
8687
synthesizer/synthesizer_utils.cpp \
8788
thread_instrumentation.cpp \
@@ -106,6 +107,7 @@ OBJ += ../ansi-c/ansi-c$(LIBEXT) \
106107
../cpp/cpp$(LIBEXT) \
107108
../linking/linking$(LIBEXT) \
108109
../big-int/big-int$(LIBEXT) \
110+
../goto-checker/goto-checker$(LIBEXT) \
109111
../goto-programs/goto-programs$(LIBEXT) \
110112
../goto-symex/goto-symex$(LIBEXT) \
111113
../assembler/assembler$(LIBEXT) \

src/goto-instrument/contracts/contracts.cpp

+8-2
Original file line numberDiff line numberDiff line change
@@ -142,7 +142,7 @@ void code_contractst::check_apply_loop_contracts(
142142
// i.e., the loop guard was satisfied.
143143
const auto entered_loop =
144144
new_tmp_symbol(
145-
bool_typet(), loop_head_location, mode, symbol_table, "__entered_loop")
145+
bool_typet(), loop_head_location, mode, symbol_table, ENTERED_LOOP)
146146
.symbol_expr();
147147
pre_loop_head_instrs.add(
148148
goto_programt::make_decl(entered_loop, loop_head_location));
@@ -173,7 +173,7 @@ void code_contractst::check_apply_loop_contracts(
173173
// instrumentation of the loop.
174174
const auto in_base_case =
175175
new_tmp_symbol(
176-
bool_typet(), loop_head_location, mode, symbol_table, "__in_base_case")
176+
bool_typet(), loop_head_location, mode, symbol_table, IN_BASE_CASE)
177177
.symbol_expr();
178178
pre_loop_head_instrs.add(
179179
goto_programt::make_decl(in_base_case, loop_head_location));
@@ -294,6 +294,12 @@ void code_contractst::check_apply_loop_contracts(
294294
// Generate havocing code for assignment targets.
295295
havoc_assigns_targetst havoc_gen(to_havoc, ns);
296296
havoc_gen.append_full_havoc_code(loop_head_location, pre_loop_head_instrs);
297+
// Add loop number to the havoc_code
298+
for(auto &havoc_instr : pre_loop_head_instrs.instructions)
299+
{
300+
if(is_loop_havoc(havoc_instr))
301+
havoc_instr.loop_number = loop_end->loop_number;
302+
}
297303

298304
// Insert the second block of pre_loop_head_instrs: the havocing code.
299305
// We do not `add_pragma_disable_assigns_check`,

src/goto-instrument/contracts/utils.cpp

+33
Original file line numberDiff line numberDiff line change
@@ -442,3 +442,36 @@ void generate_history_variables_initialization(
442442
// Add all the history variable initialization instructions
443443
program.destructive_append(history);
444444
}
445+
446+
bool is_transformed_loop_end(const goto_programt::const_targett &target)
447+
{
448+
// The end of the loop end of transformed loop is
449+
// ASSIGN entered_loop = true
450+
if(!target->is_assign())
451+
return false;
452+
453+
return from_expr(target->assign_lhs()).find("__entered_loop") !=
454+
std::string::npos &&
455+
target->assign_rhs() == true_exprt();
456+
}
457+
458+
bool is_assignment_to_instrumented_variable(
459+
const goto_programt::const_targett &target,
460+
std::string var_name)
461+
{
462+
INVARIANT(
463+
var_name == IN_BASE_CASE || var_name == ENTERED_LOOP,
464+
"var_name is not of instrumented variables.");
465+
466+
if(!target->is_assign())
467+
return false;
468+
469+
if(can_cast_expr<symbol_exprt>(target->assign_lhs()))
470+
{
471+
const auto &lhs = to_symbol_expr(target->assign_lhs());
472+
return id2string(lhs.get_identifier()).find("::" + var_name) !=
473+
std::string::npos;
474+
}
475+
476+
return false;
477+
}

src/goto-instrument/contracts/utils.h

+12
Original file line numberDiff line numberDiff line change
@@ -17,6 +17,9 @@ Date: September 2021
1717

1818
#include <goto-programs/goto_convert_class.h>
1919

20+
#define IN_BASE_CASE "__in_base_case"
21+
#define ENTERED_LOOP "__entered_loop"
22+
2023
/// \brief A class that overrides the low-level havocing functions in the base
2124
/// utility class, to havoc only when expressions point to valid memory,
2225
/// i.e. if all dereferences within an expression are valid
@@ -205,4 +208,13 @@ void generate_history_variables_initialization(
205208
const irep_idt &mode,
206209
goto_programt &program);
207210

211+
/// Return true if `target` is the loop end of some transformed code.
212+
bool is_transformed_loop_end(const goto_programt::const_targett &target);
213+
214+
/// Return true if `target` is an assignment to an instrumented variable with
215+
/// name `var_name`.
216+
bool is_assignment_to_instrumented_variable(
217+
const goto_programt::const_targett &target,
218+
std::string var_name);
219+
208220
#endif // CPROVER_GOTO_INSTRUMENT_CONTRACTS_UTILS_H

src/goto-instrument/goto_instrument_parse_options.cpp

+27-10
Original file line numberDiff line numberDiff line change
@@ -1148,16 +1148,6 @@ void goto_instrument_parse_optionst::instrument_goto_program()
11481148
goto_model.goto_functions.update();
11491149
}
11501150

1151-
if(cmdline.isset("synthesize-loop-invariants"))
1152-
{
1153-
log.warning() << "Loop invariant synthesizer is still work in progress. "
1154-
"It only generates TRUE as invariants."
1155-
<< messaget::eom;
1156-
1157-
// Synthesize loop invariants and annotate them into `goto_model`
1158-
enumerative_loop_invariant_synthesizert synthesizer(goto_model, log);
1159-
annotate_invariants(synthesizer.synthesize_all(), goto_model, log);
1160-
}
11611151

11621152
bool use_dfcc = cmdline.isset(FLAG_DFCC);
11631153
if(use_dfcc)
@@ -1432,6 +1422,33 @@ void goto_instrument_parse_optionst::instrument_goto_program()
14321422
goto_check_c(options, goto_model, ui_message_handler);
14331423
transform_assertions_assumptions(options, goto_model);
14341424

1425+
if(cmdline.isset(FLAG_SYNTHESIZE_LOOP_INVARIANTS))
1426+
{
1427+
if(cmdline.isset(FLAG_LOOP_CONTRACTS))
1428+
{
1429+
throw invalid_command_line_argument_exceptiont(
1430+
"Incompatible options detected",
1431+
"--" FLAG_SYNTHESIZE_LOOP_INVARIANTS " and --" FLAG_LOOP_CONTRACTS,
1432+
"Use either --" FLAG_SYNTHESIZE_LOOP_INVARIANTS
1433+
" or --" FLAG_LOOP_CONTRACTS);
1434+
}
1435+
1436+
log.warning() << "Loop invariant synthesizer is still work in progress. "
1437+
"It only generates TRUE as invariants."
1438+
<< messaget::eom;
1439+
1440+
// Synthesize loop invariants and annotate them into `goto_model`
1441+
enumerative_loop_invariant_synthesizert synthesizer(goto_model, log);
1442+
annotate_invariants(synthesizer.synthesize_all(), goto_model);
1443+
1444+
// Apply loop contracts.
1445+
std::set<std::string> to_exclude_from_nondet_static(
1446+
cmdline.get_values("nondet-static-exclude").begin(),
1447+
cmdline.get_values("nondet-static-exclude").end());
1448+
code_contractst contracts(goto_model, log);
1449+
contracts.apply_loop_contracts(to_exclude_from_nondet_static);
1450+
}
1451+
14351452
// check for uninitalized local variables
14361453
if(cmdline.isset("uninitialized-check"))
14371454
{

0 commit comments

Comments
 (0)