Skip to content

Commit 6827ce5

Browse files
committed
Add enumerative loop invariant synthesizer
Implement the functionality described below. Motivation --- This loop invariant synthesizer use the idea counter-example-guided synthesis (CEGIS) to synthesize loop invariants for programs with only checks instrumented by `goto-instrument` with flag `--pointer-check`. This PR contain the driver of the synthesizer and the verifier we use to check invariant candidates. Verifier --- The verifier take as input a goto program with pointer checks and a map from loop id to loop invariant candidates. It first annotate and apply the loop invariant candidates into the goto model; and then simulate the CBMC api to verify the instrumented goto program. If there are some violations---loop invariants are not inductive, or some pointer checks fail---, it record valuation from trace generated by the back end to construct a formatted counterexample `cext`. Counterexample --- A counterexample `cext` record valuations of variables in the trace, and other information about the violation. The valuation we record including 1. set of live variables upon the entry of the loop. 2. the havoced value of all primitive-typed variables; 3. the havoced offset and the object size of all pointer-typed variables; 4. history values of 2 and 3. The valuations will be used as true positive (history values) and true negative (havcoed valuation) to filter out bad invariant clause with the idea of the Daikon invariant detector in a following PR. However, in this PR we only construct the valuation but not actually use them. Synthesizer --- Loop invariants we synthesize are of the form `` (in_clause || !guard) && (!guard -> pos_clause)`` where `in_clause` and `out_clause` are predicates we store in two different map, and `guard` is the loop guard. The idea is that we separately synthesize the condition `in_clause` that should hold before the execution of the loop body, and condition `pos_clause` that should hold as post-condition of the loop. When the violation happen in the loop, we update `in_clause`. When the violation happen after the loop, we update `pos_clause`. When the invariant candidate it not inductive, we enumerate strengthening clause to make it inductive. To be more efficient, we choose different synthesis strategy for different type of violation * For out-of-boundary violation, we choose to use the violated predicate as the new clause, which is the WLP of the violation if the violation is only dependent on the havocing instruction. **TODO**: to make it more complete, we need to implement a WLP algorithm * For null-pointer violation, we choose `__CPROVER_same_object(ptr, __CPROVER_loop_entry(ptr))` as the new clause. That is, the havoced pointer should points to the same object at the start of every iteration of the loop. It is a heuristic choice. This can be extended with the idea of alias analysis if needed. * For invariant-not-preserved violation, we enumerate strengthening clauses and check that if the invariant will be inductive after strengthening (disjunction with the new clause). The synthesizer works as follow 1. initialize the two invariant maps, 2. verify the current candidates built from the two maps, _a. return the candidate if there is no violation _b. synthesize a new clause to resolve the **first** violation and add it to the correct map, 3. repeat 2.
1 parent 6f40463 commit 6827ce5

26 files changed

+1490
-49
lines changed

regression/goto-synthesizer/CMakeLists.txt

+1-2
Original file line numberDiff line numberDiff line change
@@ -12,9 +12,8 @@ else()
1212
set(gcc_only_string "")
1313
endif()
1414

15-
1615
add_test_pl_tests(
17-
"${CMAKE_CURRENT_SOURCE_DIR}/chain.sh $<TARGET_FILE:goto-cc> $<TARGET_FILE:goto-synthesizer> $<TARGET_FILE:cbmc> ${is_windows}"
16+
"${CMAKE_CURRENT_SOURCE_DIR}/chain.sh $<TARGET_FILE:goto-cc> $<TARGET_FILE:goto-instrument> $<TARGET_FILE:goto-synthesizer> $<TARGET_FILE:cbmc> ${is_windows}"
1817
)
1918

2019
## Enabling these causes a very significant increase in the time taken to run the regressions

regression/goto-synthesizer/Makefile

+3-3
Original file line numberDiff line numberDiff line change
@@ -14,16 +14,16 @@ else
1414
endif
1515

1616
test:
17-
@../test.pl -e -p -c '../chain.sh $(exe) ../../../src/goto-synthesizer/goto-synthesizer ../../../src/cbmc/cbmc $(is_windows)' -X smt-backend $(GCC_ONLY)
17+
@../test.pl -e -p -c '../chain.sh $(exe) ../../../src/goto-instrument/goto-instrument ../../../src/goto-synthesizer/goto-synthesizer ../../../src/cbmc/cbmc $(is_windows)' -X smt-backend $(GCC_ONLY)
1818

1919
test-cprover-smt2:
20-
@../test.pl -e -p -c '../chain.sh $(exe) ../../../src/goto-synthesizer/goto-synthesizer "../../../src/cbmc/cbmc --cprover-smt2" $(is_windows)' \
20+
@../test.pl -e -p -c '../chain.sh $(exe) ../../../src/goto-instrument/goto-instrument ../../../src/goto-synthesizer/goto-synthesizer "../../../src/cbmc/cbmc --cprover-smt2" $(is_windows)' \
2121
-X broken-smt-backend -X thorough-smt-backend \
2222
-X broken-cprover-smt-backend -X thorough-cprover-smt-backend \
2323
-s cprover-smt2 $(GCC_ONLY)
2424

2525
test-z3:
26-
@../test.pl -e -p -c '../chain.sh $(exe) ../../../src/goto-synthesizer/goto-synthesizer "../../../src/cbmc/cbmc --z3" $(is_windows)' \
26+
@../test.pl -e -p -c '../chain.sh $(exe) ../../../src/goto-instrument/goto-instrument ../../../src/goto-synthesizer/goto-synthesizer "../../../src/cbmc/cbmc --z3" $(is_windows)' \
2727
-X broken-smt-backend -X thorough-smt-backend \
2828
-X broken-z3-smt-backend -X thorough-z3-smt-backend \
2929
-s z3 $(GCC_ONLY)

regression/goto-synthesizer/chain.sh

+12-6
Original file line numberDiff line numberDiff line change
@@ -3,14 +3,15 @@
33
set -e
44

55
goto_cc=$1
6-
goto_synthesizer=$2
7-
cbmc=$3
8-
is_windows=$4
6+
goto_instrument=$2
7+
goto_synthesizer=$3
8+
cbmc=$4
9+
is_windows=$5
910

1011
name=${*:$#}
1112
name=${name%.c}
1213

13-
args=${*:5:$#-5}
14+
args=${*:6:$#-6}
1415
if [[ "$args" != *" _ "* ]]
1516
then
1617
args_inst=$args
@@ -27,7 +28,9 @@ else
2728
fi
2829

2930
rm -f "${name}-mod.gb"
30-
$goto_synthesizer ${args_inst} "${name}.gb" "${name}-mod.gb"
31+
rm -f "${name}-mod-2.gb"
32+
echo "Running goto-instrument: "
33+
$goto_instrument ${args_inst} "${name}.gb" "${name}-mod.gb"
3134
if [ ! -e "${name}-mod.gb" ] ; then
3235
cp "$name.gb" "${name}-mod.gb"
3336
elif echo $args_inst | grep -q -- "--dump-c" ; then
@@ -41,4 +44,7 @@ elif echo $args_inst | grep -q -- "--dump-c" ; then
4144

4245
rm "${name}-mod.c"
4346
fi
44-
$cbmc "${name}-mod.gb" ${args_cbmc}
47+
echo "Running goto-synthesizer: "
48+
$goto_synthesizer "${name}-mod.gb" "${name}-mod-2.gb"
49+
echo "Running CBMC: "
50+
$cbmc "${name}-mod-2.gb" ${args_cbmc}

regression/goto-synthesizer/loop_contracts_synthesis_01/test.desc

+1-1
Original file line numberDiff line numberDiff line change
@@ -1,6 +1,6 @@
11
CORE
22
main.c
3-
3+
--pointer-check
44
^EXIT=0$
55
^SIGNAL=0$
66
^\[main\.\d+\] line 10 Check loop invariant before entry: SUCCESS$
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,17 @@
1+
#define SIZE 80
2+
3+
void main()
4+
{
5+
unsigned len;
6+
__CPROVER_assume(len <= SIZE);
7+
__CPROVER_assume(len >= 8);
8+
char *array = malloc(len);
9+
unsigned s = 0;
10+
11+
for(unsigned i = 0; i < SIZE; ++i)
12+
{
13+
if(i == len - 1)
14+
break;
15+
s += array[i];
16+
}
17+
}
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,11 @@
1+
CORE
2+
main.c
3+
--pointer-check
4+
^EXIT=0$
5+
^SIGNAL=0$
6+
^\[main.pointer\_dereference.\d+\] .* SUCCESS$
7+
^VERIFICATION SUCCESSFUL$
8+
--
9+
--
10+
This test shows that loop invariants using range predicates can be correctly
11+
synthesized for programs with only pointer checks but no other assertions.
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,16 @@
1+
#define SIZE 80
2+
3+
void main()
4+
{
5+
unsigned long len;
6+
__CPROVER_assume(len <= SIZE);
7+
__CPROVER_assume(len >= 8);
8+
char *array = malloc(len);
9+
const char *end = array + len;
10+
unsigned s = 0;
11+
12+
while(array != end)
13+
{
14+
s += *array++;
15+
}
16+
}
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,12 @@
1+
CORE
2+
main.c
3+
--pointer-check
4+
^EXIT=0$
5+
^SIGNAL=0$
6+
^\[main.pointer\_dereference.\d+\] .* SUCCESS$
7+
^VERIFICATION SUCCESSFUL$
8+
--
9+
--
10+
This test shows that loop invariants using range predicates and same-object
11+
predicates can be correctly synthesized for programs with only pointer
12+
checks but no other assertions.
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,17 @@
1+
#define SIZE 80
2+
3+
void main()
4+
{
5+
unsigned long len;
6+
__CPROVER_assume(len <= SIZE);
7+
__CPROVER_assume(len >= 8);
8+
char *array = malloc(len);
9+
unsigned long s = 0;
10+
11+
unsigned long j = 0;
12+
for(unsigned long i = 0; i < len; i++)
13+
{
14+
s += array[j];
15+
j++;
16+
}
17+
}
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,11 @@
1+
CORE
2+
main.c
3+
--pointer-check
4+
^EXIT=0$
5+
^SIGNAL=0$
6+
^\[main.pointer\_dereference.\d+\] .* SUCCESS$
7+
^VERIFICATION SUCCESSFUL$
8+
--
9+
--
10+
This test shows that the loop-invariant synthesizer can enumerate
11+
strengthening clauses for invariant-not-preserved violation.

src/goto-instrument/contracts/contracts.cpp

+90-6
Original file line numberDiff line numberDiff line change
@@ -59,6 +59,7 @@ void code_contractst::check_apply_loop_contracts(
5959
const irep_idt &mode)
6060
{
6161
const auto loop_head_location = loop_head->source_location();
62+
const auto loop_number = loop_end->loop_number;
6263

6364
// Vector representing a (possibly multidimensional) decreases clause
6465
const auto &decreases_clause_exprs = decreases_clause.operands();
@@ -95,10 +96,12 @@ void code_contractst::check_apply_loop_contracts(
9596
// | STEP:
9697
// --. | assert (initial_invariant_val);
9798
// loop assigns check | | in_base_case = false;
98-
// - not applicable >======= havoc (assigns_set);
99-
// func assigns check | | assume (invariant_expr);
100-
// + deferred | `- old_variant_val = decreases_clause_expr;
101-
// --' * HEAD:
99+
// - not applicable >======= in_loop_havoc_block = true;
100+
// func assigns check | | havoc (assigns_set);
101+
// + deferred | | in_loop_havoc_block = false;
102+
// --' | assume (invariant_expr);
103+
// `- old_variant_val = decreases_clause_expr;
104+
// * HEAD:
102105
// loop assigns check ,- ... eval guard ...
103106
// + assertions added | if (!guard)
104107
// func assigns check | goto EXIT;
@@ -142,7 +145,11 @@ void code_contractst::check_apply_loop_contracts(
142145
// i.e., the loop guard was satisfied.
143146
const auto entered_loop =
144147
new_tmp_symbol(
145-
bool_typet(), loop_head_location, mode, symbol_table, "__entered_loop")
148+
bool_typet(),
149+
loop_head_location,
150+
mode,
151+
symbol_table,
152+
std::string(ENTERED_LOOP) + "__" + std::to_string(loop_number))
146153
.symbol_expr();
147154
pre_loop_head_instrs.add(
148155
goto_programt::make_decl(entered_loop, loop_head_location));
@@ -153,7 +160,7 @@ void code_contractst::check_apply_loop_contracts(
153160
// if the loop is not vacuous and must be abstracted with contracts.
154161
const auto initial_invariant_val =
155162
new_tmp_symbol(
156-
bool_typet(), loop_head_location, mode, symbol_table, "__init_invariant")
163+
bool_typet(), loop_head_location, mode, symbol_table, INIT_INVARIANT)
157164
.symbol_expr();
158165
pre_loop_head_instrs.add(
159166
goto_programt::make_decl(initial_invariant_val, loop_head_location));
@@ -292,8 +299,25 @@ void code_contractst::check_apply_loop_contracts(
292299
loop_head, add_pragma_disable_assigns_check(pre_loop_head_instrs));
293300

294301
// Generate havocing code for assignment targets.
302+
// ASSIGN in_loop_havoc_block = true;
303+
// havoc (assigns_set);
304+
// ASSIGN in_loop_havoc_block = false;
305+
const auto in_loop_havoc_block =
306+
new_tmp_symbol(
307+
bool_typet(),
308+
loop_head_location,
309+
mode,
310+
symbol_table,
311+
std::string(IN_LOOP_HAVOC_BLOCK) + +"__" + std::to_string(loop_number))
312+
.symbol_expr();
313+
pre_loop_head_instrs.add(
314+
goto_programt::make_decl(in_loop_havoc_block, loop_head_location));
315+
pre_loop_head_instrs.add(
316+
goto_programt::make_assignment(in_loop_havoc_block, true_exprt{}));
295317
havoc_assigns_targetst havoc_gen(to_havoc, ns);
296318
havoc_gen.append_full_havoc_code(loop_head_location, pre_loop_head_instrs);
319+
pre_loop_head_instrs.add(
320+
goto_programt::make_assignment(in_loop_havoc_block, false_exprt{}));
297321

298322
// Insert the second block of pre_loop_head_instrs: the havocing code.
299323
// We do not `add_pragma_disable_assigns_check`,
@@ -1414,6 +1438,66 @@ void code_contractst::apply_loop_contracts(
14141438
unwindset.parse_unwindset(loop_names, log.get_message_handler());
14151439
goto_unwindt goto_unwind;
14161440
goto_unwind(goto_model, unwindset, goto_unwindt::unwind_strategyt::ASSUME);
1441+
1442+
remove_skip(goto_model);
1443+
1444+
// Record original loop number for some instrumented instructions.
1445+
for(auto &goto_function_entry : goto_functions.function_map)
1446+
{
1447+
auto &goto_function = goto_function_entry.second;
1448+
bool is_in_loop_havoc_block = false;
1449+
1450+
unsigned loop_number_of_loop_havoc = 0;
1451+
for(goto_programt::const_targett it_instr =
1452+
goto_function.body.instructions.begin();
1453+
it_instr != goto_function.body.instructions.end();
1454+
it_instr++)
1455+
{
1456+
// Don't override original loop numbers.
1457+
if(original_loop_number_map.count(it_instr) != 0)
1458+
continue;
1459+
1460+
// Store loop number for
1461+
// ASSIGN ENTERED_LOOP = TRUE
1462+
if(
1463+
is_assignment_to_instrumented_variable(it_instr, ENTERED_LOOP) &&
1464+
it_instr->assign_rhs() == true_exprt())
1465+
{
1466+
const auto &assign_lhs =
1467+
expr_try_dynamic_cast<symbol_exprt>(it_instr->assign_lhs());
1468+
original_loop_number_map[it_instr] = get_suffix_unsigned(
1469+
id2string(assign_lhs->get_identifier()),
1470+
std::string(ENTERED_LOOP) + "__");
1471+
continue;
1472+
}
1473+
1474+
// Loop havocs are assignments between
1475+
// ASSIGN IN_LOOP_HAVOC_BLOCK = true
1476+
// and
1477+
// ASSIGN IN_LOOP_HAVOC_BLOCK = false
1478+
1479+
// Entering the loop-havoc block.
1480+
if(is_assignment_to_instrumented_variable(it_instr, IN_LOOP_HAVOC_BLOCK))
1481+
{
1482+
is_in_loop_havoc_block = it_instr->assign_rhs() == true_exprt();
1483+
const auto &assign_lhs =
1484+
expr_try_dynamic_cast<symbol_exprt>(it_instr->assign_lhs());
1485+
loop_number_of_loop_havoc = get_suffix_unsigned(
1486+
id2string(assign_lhs->get_identifier()),
1487+
std::string(IN_LOOP_HAVOC_BLOCK) + "__");
1488+
continue;
1489+
}
1490+
1491+
// Assignments in loop-havoc block are loop havocs.
1492+
if(is_in_loop_havoc_block && it_instr->is_assign())
1493+
{
1494+
loop_havoc_set.emplace(it_instr);
1495+
1496+
// Store loop number for loop havoc.
1497+
original_loop_number_map[it_instr] = loop_number_of_loop_havoc;
1498+
}
1499+
}
1500+
}
14171501
}
14181502

14191503
void code_contractst::enforce_contracts(

src/goto-instrument/contracts/contracts.h

+23
Original file line numberDiff line numberDiff line change
@@ -122,6 +122,18 @@ class code_contractst
122122
symbol_tablet &get_symbol_table();
123123
goto_functionst &get_goto_functions();
124124

125+
std::unordered_map<goto_programt::const_targett, unsigned, const_target_hash>
126+
get_original_loop_number_map() const
127+
{
128+
return original_loop_number_map;
129+
}
130+
131+
std::unordered_set<goto_programt::const_targett, const_target_hash>
132+
get_loop_havoc_set() const
133+
{
134+
return loop_havoc_set;
135+
}
136+
125137
namespacet ns;
126138

127139
protected:
@@ -137,6 +149,17 @@ class code_contractst
137149
/// Name of loops we are going to unwind.
138150
std::list<std::string> loop_names;
139151

152+
/// Store the map from instrumented instructions for loop contracts to their
153+
/// original loop numbers. Following instrumented instructions are stored.
154+
/// 1. loop-havoc --- begin of transformed loops
155+
/// 2. ASSIGN ENTERED_LOOP = TRUE --- end of transformed loops
156+
std::unordered_map<goto_programt::const_targett, unsigned, const_target_hash>
157+
original_loop_number_map;
158+
159+
/// Loop havoc instructions instrumneted during applying loop contracts.
160+
std::unordered_set<goto_programt::const_targett, const_target_hash>
161+
loop_havoc_set;
162+
140163
public:
141164
/// \brief Enforce contract of a single function
142165
void enforce_contract(const irep_idt &function);

src/goto-instrument/contracts/utils.cpp

+46
Original file line numberDiff line numberDiff line change
@@ -447,3 +447,49 @@ void generate_history_variables_initialization(
447447
// Add all the history variable initialization instructions
448448
program.destructive_append(history);
449449
}
450+
451+
bool is_transformed_loop_end(const goto_programt::const_targett &target)
452+
{
453+
// The end of the loop end of transformed loop is
454+
// ASSIGN entered_loop = true
455+
if(!is_assignment_to_instrumented_variable(target, ENTERED_LOOP))
456+
return false;
457+
458+
return target->assign_rhs() == true_exprt();
459+
}
460+
461+
bool is_assignment_to_instrumented_variable(
462+
const goto_programt::const_targett &target,
463+
std::string var_name)
464+
{
465+
INVARIANT(
466+
var_name == IN_BASE_CASE || var_name == ENTERED_LOOP ||
467+
var_name == IN_LOOP_HAVOC_BLOCK,
468+
"var_name is not of instrumented variables.");
469+
470+
if(!target->is_assign())
471+
return false;
472+
473+
if(can_cast_expr<symbol_exprt>(target->assign_lhs()))
474+
{
475+
const auto &lhs = to_symbol_expr(target->assign_lhs());
476+
return id2string(lhs.get_identifier()).find("::" + var_name) !=
477+
std::string::npos;
478+
}
479+
480+
return false;
481+
}
482+
483+
unsigned get_suffix_unsigned(const std::string &str, const std::string &prefix)
484+
{
485+
// first_index is the end of the `prefix`.
486+
auto first_index = str.find(prefix);
487+
INVARIANT(
488+
first_index != std::string::npos, "Prefix not found in the given string");
489+
first_index += prefix.length();
490+
491+
// last_index is the index of not-digit.
492+
auto last_index = str.find_first_not_of("0123456789", first_index);
493+
std::string result = str.substr(first_index, last_index - first_index);
494+
return std::stol(result);
495+
}

0 commit comments

Comments
 (0)