CONTRACTS: Add enumerative loop invariant synthesizer #7393

qinheping · 2022-11-28T07:38:29Z

Each commit message has a non-empty body, explaining why the change was made.
Methods or procedures I have added are documented, following the guidelines provided in CODING_STANDARD.md.
The feature or user visible behaviour I have added or modified has been documented in the User Guide in doc/cprover-manual/
Regression or unit tests are included, or existing tests cover the modified code (in this case I have detailed which ones those are in the commit message).
n/a My commit message includes data points confirming performance improvements (if claimed).
My PR is restricted to a single feature or bugfix.
n/a White-space or formatting changes outside the feature-related changed lines are in commits of their own.

Implement the functionality described below.

Motivation

This loop invariant synthesizer uses counterexample-guided inductive synthesis (CEGIS) to synthesize loop invariants for programs whose only checks are those instrumented by goto-instrument with the flag --pointer-check.

This PR contains the driver of the synthesizer and the verifier we use to check invariant candidates.

Verifier

The verifier takes as input a goto program with pointer checks and a map from loop IDs to loop invariant candidates. It first annotates the goto model with the loop invariant candidates and applies them, and then mimics the CBMC API to verify the instrumented goto program. If there are violations (the loop invariants are not inductive, or some pointer checks fail), it records valuations from the trace generated by the back end to construct a formatted counterexample cext.

Counterexample

A counterexample cext records the valuations of variables in the trace, along with other information about the violation. The valuations we record include

1. the set of live variables upon entry to the loop;
2. the havoced values of all primitive-typed variables;
3. the havoced offset and object size of all pointer-typed variables;
4. the history values of 2 and 3.

The valuations will be used as true positives (history values) and true negatives (havoced valuations) to filter out bad invariant clauses, following the idea of the Daikon invariant detector, in a following PR. In this PR we only construct the valuations and do not yet use them.
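As a rough illustration of the four recorded items, the counterexample could be modelled as a plain record. This is only a sketch: the struct name, the field names, and the `in_bounds` helper are hypothetical and are not taken from the PR's source code.

```cpp
#include <cstdint>
#include <map>
#include <set>
#include <string>

// Hypothetical stand-in for the counterexample `cext`; all names are assumed.
struct cext_sketch
{
  // Offset and object size recorded for one pointer-typed variable.
  struct pointer_infot
  {
    std::int64_t offset;
    std::uint64_t object_size;
  };

  // 1. Variables that are live upon entry to the loop.
  std::set<std::string> live_variables;
  // 2. Havoced values of primitive-typed variables.
  std::map<std::string, std::int64_t> havoced_values;
  // 3. Havoced offset and object size of pointer-typed variables.
  std::map<std::string, pointer_infot> havoced_pointers;
  // 4. History values of items 2 and 3.
  std::map<std::string, std::int64_t> history_values;
  std::map<std::string, pointer_infot> history_pointers;
};

// Assumed helper: a pointer is in bounds when 0 <= offset < object_size.
inline bool in_bounds(const cext_sketch::pointer_infot &p)
{
  return p.offset >= 0 &&
         static_cast<std::uint64_t>(p.offset) < p.object_size;
}
```

Keeping the havoced and history valuations in separate maps mirrors their intended roles as negative and positive examples.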

Synthesizer

Loop invariants we synthesize are of the form
(in_clause || !guard) && (!guard -> pos_clause)
where in_clause and pos_clause are predicates we store in two different maps, and guard is the loop guard. The idea is that we separately synthesize the condition in_clause, which should hold before the execution of the loop body, and the condition pos_clause, which should hold as the post-condition of the loop.

When the violation happens in the loop, we update in_clause. When the violation happens after the loop, we update pos_clause. When the invariant candidate is not inductive, we enumerate strengthening clauses to make it inductive.
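To make the shape of a candidate concrete, here is a minimal sketch that evaluates the formula for given truth values and assembles it as text. The function names are hypothetical, and the implication `!guard -> pos_clause` is written in its equivalent form `guard || pos_clause`.

```cpp
#include <string>

// Evaluate (in_clause || !guard) && (!guard -> pos_clause)
// for concrete truth values of the three components.
inline bool candidate_holds(bool in_clause, bool guard, bool pos_clause)
{
  const bool inside_loop = in_clause || !guard;
  const bool after_loop = guard || pos_clause; // same as !guard -> pos_clause
  return inside_loop && after_loop;
}

// Assemble the candidate as text from its three components (assumed helper).
inline std::string candidate_text(
  const std::string &in_clause,
  const std::string &guard,
  const std::string &pos_clause)
{
  return "(" + in_clause + " || !(" + guard + ")) && ((" + guard + ") || " +
         pos_clause + ")";
}
```

While the guard holds, only in_clause matters; once the guard is false, only pos_clause matters, which is why the two clauses can be updated independently.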

To be more efficient, we choose a different synthesis strategy for each type of violation:

For an out-of-bounds violation, we use the violated predicate as the new clause, which is the weakest liberal precondition (WLP) of the violation if the violation depends only on the havocing instruction. TODO: to make it more complete, we need to implement a WLP algorithm.
For a null-pointer violation, we choose __CPROVER_same_object(ptr, __CPROVER_loop_entry(ptr)) as the new clause. That is, the havoced pointer should point to the same object at the start of every iteration of the loop. This is a heuristic choice; it can be extended with alias analysis if needed.
For an invariant-not-preserved violation, we enumerate strengthening clauses and check whether the invariant becomes inductive after strengthening (disjunction with the new clause).
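The per-violation strategies can be pictured as a simple dispatch. The sketch below is an assumption-laden illustration: the enum, the struct, and `choose_clause` are hypothetical names, clauses are represented as strings rather than expressions, and an empty result stands for "fall back to enumeration".

```cpp
#include <optional>
#include <stdexcept>
#include <string>

// Hypothetical classification of violations; not the PR's actual types.
enum class violation_kind
{
  out_of_bounds,
  null_pointer,
  invariant_not_preserved,
  other
};

struct violation_sketch
{
  violation_kind kind;
  std::string violated_predicate; // used for out-of-bounds violations
  std::string pointer;            // used for null-pointer violations
};

// Return the clause to add directly, or nullopt when the caller must
// enumerate strengthening clauses instead.
inline std::optional<std::string> choose_clause(const violation_sketch &v)
{
  switch(v.kind)
  {
  case violation_kind::out_of_bounds:
    return v.violated_predicate;
  case violation_kind::null_pointer:
    return "__CPROVER_same_object(" + v.pointer + ", __CPROVER_loop_entry(" +
           v.pointer + "))";
  case violation_kind::invariant_not_preserved:
    return std::nullopt;
  case violation_kind::other:
    throw std::invalid_argument("unsupported violation type");
  }
  throw std::logic_error("unknown violation kind");
}
```

Only the invariant-not-preserved case needs the expensive enumeration; the other two cases produce a clause in one step.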

The synthesizer works as follows:

1. initialize the two invariant maps,
2. verify the current candidates built from the two maps,
   a. return the candidate if there is no violation,
   b. synthesize a new clause to resolve the first violation and add it to the correct map,
3. repeat 2.
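The steps above can be sketched as a small driver loop. Everything here is illustrative: the type and function names are hypothetical, the verifier is passed in as a callback, the violation already carries the clause that would resolve it (standing in for the clause synthesis step), and the `max_rounds` bound is an assumption added so the sketch always terminates.

```cpp
#include <cstddef>
#include <functional>
#include <map>
#include <optional>
#include <string>
#include <vector>

// Hypothetical violation report: which loop, where it occurred, and the
// clause that resolves it.
struct toy_violation
{
  unsigned loop_id;
  bool in_loop; // true: inside the loop, false: after the loop
  std::string new_clause;
};

using clause_mapt = std::map<unsigned, std::vector<std::string>>;

struct candidatet
{
  clause_mapt in_clauses;
  clause_mapt pos_clauses;
};

using verifiert =
  std::function<std::optional<toy_violation>(const candidatet &)>;

inline std::optional<candidatet>
synthesize(const verifiert &verify, std::size_t max_rounds)
{
  candidatet candidate; // step 1: both maps start empty
  for(std::size_t round = 0; round < max_rounds; ++round)
  {
    const auto first = verify(candidate); // step 2
    if(!first)
      return candidate; // step 2a: no violation left
    // Step 2b: add the clause to the map matching the violation's location.
    auto &target = first->in_loop ? candidate.in_clauses
                                  : candidate.pos_clauses;
    target[first->loop_id].push_back(first->new_clause);
  } // step 3: repeat
  return std::nullopt; // gave up within the assumed bound
}
```

Resolving only the first violation per round keeps each verifier call focused on a single counterexample.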

The flag synthesize-loop-invariants will also apply synthesized loop contracts.

tautschnig · 2022-11-28T09:40:41Z

You have an extensive PR description, which is really nice. Would you mind also including that in the commit message, or at least an executive summary thereof?

tautschnig · 2022-11-28T09:42:00Z

regression/contracts/loop_contract_synthesis_02/test.desc

+This test shows that loop invariant with form of range predicates can be correctly
+synthesized for programs with only pointer checks but no other assertions.


So what happens when there are additional assertions?

The synthesizer tries to resolve violations one by one. If there are violations other than pointer checks and invariant checks, it will throw an exception saying the type of violation is unsupported. My plan is to focus on pointer checks in this PR, and create another PR to add an enumerate-and-check mode that keeps enumerating until all checks pass for other types of violations.

tautschnig · 2022-11-28T09:44:59Z

src/goto-programs/loop_ids.h

+  loop_idt() : function_id(""), loop_number(-1)
+  {
+  }
+
+  loop_idt(const loop_idt &other)
+    : function_id(other.function_id), loop_number(other.loop_number)
+  {
+  }
+


Which piece of code needs these? At least the first one looks a bit dangerous and I wonder whether we might instead need to go for a bigger change where loop_number is made optionalt<unsigned int>.

When computing the cause loop, we want a value that indicates that no cause loop was found, which means that the violation has nothing to do with loop invariants. Synthesizing loop invariants doesn't help in such a case.
Also, loop_number is currently a member variable of every instruction. However, it makes no sense to have a loop number for an instruction that is not in a loop. So I agree that changing loop_number from unsigned int to optionalt<unsigned int> is a good idea.
I will make the change for loop_idt in this PR, and then change the other loop_number (the one in goto_programt::instructiont) in another PR. What do you think?

Do return an optionalt<loop_idt> when looking for a cause.
Please don't change goto_programt::instructiont. It's deliberate that the loop_number in the instruction is undefined unless the instruction forms a loop.
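The suggestion to return an optional value instead of a sentinel loop_idt could look roughly like the following. This is a standalone sketch using std::optional in place of optionalt; the struct names, the trace representation, and `find_cause_loop` are hypothetical and do not reflect how the PR actually computes the cause loop.

```cpp
#include <optional>
#include <string>
#include <vector>

// Simplified stand-in for loop_idt, without a default "invalid" state.
struct loop_id_sketch
{
  std::string function_id;
  unsigned loop_number;
};

// Hypothetical trace step: set when the step's value came from havocing
// the given loop, empty otherwise.
struct trace_step_sketch
{
  std::optional<loop_id_sketch> havoced_by;
};

// Return the first loop found in the trace, or nullopt when the violation
// is unrelated to any loop (so invariant synthesis cannot help).
inline std::optional<loop_id_sketch>
find_cause_loop(const std::vector<trace_step_sketch> &trace)
{
  for(const auto &step : trace)
  {
    if(step.havoced_by.has_value())
      return step.havoced_by;
  }
  return std::nullopt;
}
```

With this shape, callers have to handle the "no cause loop" case explicitly, and no loop_number of -1 is ever stored in an unsigned field.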

src/goto-instrument/synthesizer/cegis_verifier.cpp

tautschnig

My apologies, my comments are a bit all-over-the-place and very surface level: My biggest concern is that we're building what could (should?) be a free-standing tool into goto-instrument. I really think that this deserves to be a tool of its own. Maybe you really want to copy #6526, which should be the bare-bones for a new tool.

src/goto-instrument/synthesizer/cegis_verifier.cpp

src/goto-instrument/CMakeLists.txt

src/goto-instrument/contracts/contracts.cpp

src/goto-instrument/havoc_utils.cpp

src/goto-instrument/synthesizer/enumerative_loop_invariant_synthesizer.cpp

src/goto-instrument/synthesizer/synthesizer_utils.cpp

qinheping · 2022-12-05T20:44:28Z

My apologies, my comments are a bit all-over-the-place and very surface level: My biggest concern is that we're building what could (should?) be a free-standing tool into goto-instrument. I really think that this deserves to be a tool of its own. Maybe you really want to copy #6526, which should be the bare-bones for a new tool.

It sounds great to me. I will create another PR to initialize the new tool goto-synthesizer. At the same time, let's keep pushing this PR to be merged. Once the PR for the new tool is approved, I will migrate the loop-contract synthesizer to the new tool.

codecov · 2022-12-06T07:29:25Z

Codecov Report

Base: 78.39% // Head: 78.41% // Increases project coverage by +0.01% 🎉

Coverage data is based on head (c63e328) compared to base (b03d870).
Patch coverage: 89.36% of modified lines in pull request are covered.

Additional details and impacted files

@@             Coverage Diff             @@
##           develop    #7393      +/-   ##
===========================================
+ Coverage    78.39%   78.41%   +0.01%     
===========================================
  Files         1655     1657       +2     
  Lines       190281   190766     +485     
===========================================
+ Hits        149172   149590     +418     
- Misses       41109    41176      +67

Impacted Files	Coverage Δ
src/goto-instrument/contracts/utils.h	`100.00% <ø> (ø)`
src/goto-instrument/havoc_utils.h	`100.00% <ø> (ø)`
...nthesizer/enumerative_loop_invariant_synthesizer.h	`100.00% <ø> (ø)`
...rc/goto-instrument/synthesizer/expr_enumerator.cpp	`77.55% <50.00%> (+2.02%)`	⬆️
.../goto-instrument/goto_instrument_parse_options.cpp	`71.44% <81.81%> (-0.04%)`	⬇️
src/goto-instrument/synthesizer/cegis_verifier.cpp	`84.93% <84.93%> (ø)`
...hesizer/enumerative_loop_invariant_synthesizer.cpp	`90.72% <91.24%> (+3.22%)`	⬆️
src/analyses/dependence_graph.cpp	`91.57% <100.00%> (+1.27%)`	⬆️
src/analyses/dependence_graph.h	`85.89% <100.00%> (+0.76%)`	⬆️
src/goto-instrument/contracts/contracts.cpp	`95.40% <100.00%> (+<0.01%)`	⬆️
... and 29 more

tautschnig · 2022-12-12T15:54:18Z

It sounds great to me. I will create another PR to initialize the new tool goto-synthesizer. At the same time, let's keep pushing this PR to be merged. Once the PR for the new tool is approved, I will migrate the loop-contract synthesizer to the new tool.

Now that this has happened: can this PR directly go in the new tool? It will avoid adding a whole bunch of dependencies into goto-instrument.

qinheping · 2022-12-12T17:02:05Z

It sounds great to me. I will create another PR to initialize the new tool goto-synthesizer. At the same time, let's keep pushing this PR to be merged. Once the PR for the new tool is approved, I will migrate the loop-contract synthesizer to the new tool.

Now that this has happened: can this PR directly go in the new tool? It will avoid adding a whole bunch of dependencies into goto-instrument.

Agreed. I created two PRs:
#7429, which migrates the current synthesizer interface and the expression enumerator from goto-instrument into goto-synthesizer.
#7430, which makes the same changes as this PR but in goto-synthesizer.

qinheping added the aws (Bugs or features of importance to AWS CBMC users), aws-medium, Code Contracts (Function and loop contracts), and Synthesis (Invariant synthesis) labels Nov 28, 2022

qinheping self-assigned this Nov 28, 2022

qinheping requested review from a team, tautschnig, feliperodri, remi-delmas-3000, martin-cs, chris-ryder, peterschrammel and kroening as code owners November 28, 2022 07:38

qinheping force-pushed the loop_invariant_synthesis branch 2 times, most recently from b91b168 to 0b0eb68 Compare November 28, 2022 08:30

tautschnig reviewed Nov 28, 2022

tautschnig reviewed Nov 28, 2022

tautschnig reviewed Nov 28, 2022

feliperodri assigned remi-delmas-3000 and feliperodri Nov 28, 2022

qinheping force-pushed the loop_invariant_synthesis branch 2 times, most recently from 4104af1 to 36412d2 Compare December 5, 2022 20:42

qinheping requested a review from tautschnig December 5, 2022 20:49

qinheping force-pushed the loop_invariant_synthesis branch from 36412d2 to 8fa325f Compare December 5, 2022 21:01

Merge branch 'diffblue:develop' into loop_invariant_synthesis

45d3399

qinheping force-pushed the loop_invariant_synthesis branch 2 times, most recently from 5658e34 to 0ad2852 Compare December 6, 2022 06:54

Disallow enumerate equal expr between sub-exprs with different types.

4b558d8

qinheping force-pushed the loop_invariant_synthesis branch from 0ad2852 to 4b558d8 Compare December 6, 2022 17:28

qinheping added aws-high and removed aws-medium labels Dec 8, 2022

Merge branch 'diffblue:develop' into loop_invariant_synthesis

fdea352

Add goto-convert before verifying candidate

c63e328

qinheping force-pushed the loop_invariant_synthesis branch from cb5463f to c63e328 Compare December 12, 2022 16:14

qinheping added the do not merge label Dec 12, 2022

qinheping mentioned this pull request Dec 12, 2022

SYNTHESIZER: Add enumerative loop invariant synthesizer #7430

Merged

5 tasks

qinheping closed this Dec 15, 2022
