-
Notifications
You must be signed in to change notification settings - Fork 273
SYNTHESIZER: Add enumerative loop invariant synthesizer #7430
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Codecov ReportBase: 78.38% // Head: 78.46% // Increases project coverage by
Additional details and impacted files@@ Coverage Diff @@
## develop #7430 +/- ##
===========================================
+ Coverage 78.38% 78.46% +0.08%
===========================================
Files 1659 1662 +3
Lines 190372 190904 +532
===========================================
+ Hits 149217 149794 +577
+ Misses 41155 41110 -45
Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here. ☔ View full report at Codecov. |
1ee20cf
to
d046d11
Compare
Does this depend on PR7127 getting merged? |
No. It only depend on #7429 which has been merged. |
d046d11
to
f5629d6
Compare
regression/goto-synthesizer/loop_contracts_synthesis_02/test.desc
Outdated
Show resolved
Hide resolved
regression/goto-synthesizer/loop_contracts_synthesis_03/test.desc
Outdated
Show resolved
Hide resolved
cext() | ||
{ | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Safe to omit
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems this still needs to be removed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tried to omit it. But failed to compile with error
/Users/qinhh/Repos/CBMC/clone_2/src/goto-synthesizer/cegis_verifier.h:90:3: error: constructor for 'cegis_verifiert' must explicitly initialize the member 'return_cex' which does not have a default constructor
cegis_verifiert(
^
/Users/qinhh/Repos/CBMC/clone_2/src/goto-synthesizer/cegis_verifier.h:105:8: note: member is declared here
cext return_cex;
^
/Users/qinhh/Repos/CBMC/clone_2/src/goto-synthesizer/cegis_verifier.h:25:7: note: 'cext' declared here
class cext
^
1 error generated.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps the use should be fixed? A default-constructed cext
seems ill-formed. Perhaps that return_cex
class member shouldn't exist?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed, the return value of verify()
is now optionalt<cext>
---it return a counterexample if there is some violation.
096e2b4
to
706be24
Compare
src/goto-programs/goto_program.h
Outdated
/// We record the original loop number for some instrumended instructions | ||
/// so that we know which loop they are transformed from. | ||
optionalt<unsigned> original_loop_number; | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please drop, this is not needed, and costs extra memory.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure. I plan to instead define ID_original_loop_number
and store the original loop number in the field ID_original_loop_number
of the lhs exprt
of instrumented assignments such as loop-havoc. Does it sound OK to you?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, please use a map.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you, I will store the map from instrumented instructions to their original loop number in code_contractst
. And the functions I used to decided whether an instruction is a loop-havoc or instrumented instruction will be no longer needed with such map.
4730446
to
2bc4a55
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Adding a blocker just to make sure we don't end up inadvertently merging this: must not be merged until the change in goto_program.h that was added after my earlier approval is removed (cf. conversation with Daniel).
e599b85
to
17579cc
Compare
"// clang-format off\n" | ||
"void __breakpoint(int val);\n" | ||
"void __cdp(unsigned int coproc, unsigned int ops, unsigned int regs);\n" | ||
"void __clrex(void);\n" | ||
"unsigned char __clz(unsigned int val);\n" | ||
"unsigned int __current_pc(void);\n" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Something has gone wrong here, this is an auto-generated file. (There are a few others of this kind added in the latest commit.)
a536e77
to
a84a654
Compare
I added another commit to remove undo the changes in |
Please squash the commits! |
06d6a89
to
0b52b56
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @qinheping , thanks for this; it is a really great piece of work and it is long-overdue that we have this capability. Given that, I feel a bit bad critiquing it. I have only looked at the code in analyses
and goto-instrument
(but not contracts).
As I understand it, the functionality you want for dependence-graphs is to be able to do reachability between two program locations. Is this correct? If so, dependence_grapht
inherits from grapht
so you should be able to use the graph functions directly. I would far rather we added an can_reach
method to grapht
than custom searching here because it decouples the computation of the dependence graph and the use of it. There is another set of dependency graph code analyses/variable-sensitivity/variable_sensitivity_dependence_graph.{h,cpp}
which is more configurable and hopefully more precise, it would be nice to be able to switch between them if there are issues.
Perhaps I have misunderstood as your is_flow_dependent
seems to consider control and then data, but doesn't then check control dependency again? If you want this behaviour then maybe use the graph visitor rather than recursing?
Also, non-blocking thoughts:
- The commit message is great; can that make it in to the documentation?
- I appreciate that in this case it is difficult but it would help reviewing in the future if things could be broken out a bit more.
src/goto-instrument/havoc_utils.cpp
Outdated
@@ -62,4 +70,6 @@ void havoc_utilst::append_scalar_havoc_code_for_expr( | |||
side_effect_expr_nondett rhs(expr.type(), location); | |||
dest.add(goto_programt::make_assignment( | |||
code_assignt{expr, std::move(rhs), location}, location)); | |||
dest.instructions.back().source_location_nonconst().set_comment( | |||
IS_LOOP_HAVOC); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I appreciate that you want to put some kind of tag in the program but this feels fragile and ad-hoc. Would it be possible to use a function call to end_of_loop_havoc_block
or a LOCATION
or even an OTHER
instruction? Without knowing the wider context (hard to get from such a big commit) I can't really say what is best.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I removed is_loop_havoc
function and related codes.
Instead, I changed the structure of transformed loop from
havoc(write_set)
to
DECL __is_in_loop_havoc_block
ASSIGN __is_in_loop_havoc_block = true
havoc(write_set)
ASSIGN __is_in_loop_havoc_block = false
Now, after applying a loop contract, code_cotractst
will scan instrumented instructions to find all assignments in loop-havoc blocks and store them into a set loop_havoc_set
.
@tautschnig : We cannot decide whether an assignment is loop havoc from its instruction structure. So I change the structure of transformed loops to better identify loop havoc.
f7e1845
to
8df49ed
Compare
Thank you! I removed all the changes to |
8df49ed
to
5a7873d
Compare
Implement the functionality described below. Motivation --- This loop invariant synthesizer use the idea counter-example-guided synthesis (CEGIS) to synthesize loop invariants for programs with only checks instrumented by `goto-instrument` with flag `--pointer-check`. This PR contain the driver of the synthesizer and the verifier we use to check invariant candidates. Verifier --- The verifier take as input a goto program with pointer checks and a map from loop id to loop invariant candidates. It first annotate and apply the loop invariant candidates into the goto model; and then simulate the CBMC api to verify the instrumented goto program. If there are some violations---loop invariants are not inductive, or some pointer checks fail---, it record valuation from trace generated by the back end to construct a formatted counterexample `cext`. Counterexample --- A counterexample `cext` record valuations of variables in the trace, and other information about the violation. The valuation we record including 1. set of live variables upon the entry of the loop. 2. the havoced value of all primitive-typed variables; 3. the havoced offset and the object size of all pointer-typed variables; 4. history values of 2 and 3. The valuations will be used as true positive (history values) and true negative (havcoed valuation) to filter out bad invariant clause with the idea of the Daikon invariant detector in a following PR. However, in this PR we only construct the valuation but not actually use them. Synthesizer --- Loop invariants we synthesize are of the form `` (in_clause || !guard) && (!guard -> pos_clause)`` where `in_clause` and `out_clause` are predicates we store in two different map, and `guard` is the loop guard. The idea is that we separately synthesize the condition `in_clause` that should hold before the execution of the loop body, and condition `pos_clause` that should hold as post-condition of the loop. When the violation happen in the loop, we update `in_clause`. When the violation happen after the loop, we update `pos_clause`. When the invariant candidate it not inductive, we enumerate strengthening clause to make it inductive. To be more efficient, we choose different synthesis strategy for different type of violation * For out-of-boundary violation, we choose to use the violated predicate as the new clause, which is the WLP of the violation if the violation is only dependent on the havocing instruction. **TODO**: to make it more complete, we need to implement a WLP algorithm * For null-pointer violation, we choose `__CPROVER_same_object(ptr, __CPROVER_loop_entry(ptr))` as the new clause. That is, the havoced pointer should points to the same object at the start of every iteration of the loop. It is a heuristic choice. This can be extended with the idea of alias analysis if needed. * For invariant-not-preserved violation, we enumerate strengthening clauses and check that if the invariant will be inductive after strengthening (disjunction with the new clause). The synthesizer works as follow 1. initialize the two invariant maps, 2. verify the current candidates built from the two maps, _a. return the candidate if there is no violation _b. synthesize a new clause to resolve the **first** violation and add it to the correct map, 3. repeat 2.
5a7873d
to
6827ce5
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@qinheping Thank you for the quick and accommodating changes. I am approving this although I don't think it touches anywhere I am a maintainer any more. Thanks for all your work.
Dependent on #7429.
This PR is the same as in #7393. But add the enumerative synthesizer into
goto-synthesizer
instead ofgoto-instrument
.Implement the functionality described below.
Motivation
This loop invariant synthesizer use the idea counter-example-guided synthesis (CEGIS) to synthesize loop invariants for programs with only checks instrumented by
goto-instrument
with flag--pointer-check
.This PR contain the driver of the synthesizer and the verifier we use to check invariant candidates.
Verifier
The verifier take as input a goto program with pointer checks and a map from loop id to loop invariant candidates. It first annotate and apply the loop invariant candidates into the goto model; and then simulate the CBMC api to verify the instrumented goto program. If there are some violations---loop invariants are not inductive, or some pointer checks fail---, it record valuation from trace generated by the back end to construct a formatted counterexample
cext
.Counterexample
A counterexample
cext
record valuations of variables in the trace, and other information about the violation. The valuation we record includingThe valuations will be used as true positive (history values) and true negative (havcoed valuation) to filter out bad invariant clause with the idea of the Daikon invariant detector in a following PR. However, in this PR we only construct the valuation but not actually use them.
Synthesizer
Loop invariants we synthesize are of the form
(in_clause || !guard) && (!guard -> pos_clause)
where
in_clause
andout_clause
are predicates we store in two different map, andguard
is the loop guard. The idea is that we separately synthesize the conditionin_clause
that should hold before the execution of the loop body, and conditionpos_clause
that should hold as post-condition of the loop.When the violation happen in the loop, we update
in_clause
. When the violation happen after the loop, we updatepos_clause
. When the invariant candidate it not inductive, we enumerate strengthening clause to make it inductive.To be more efficient, we choose different synthesis strategy for different type of violation
__CPROVER_same_object(ptr, __CPROVER_loop_entry(ptr))
as the new clause. That is, the havoced pointer should points to the same object at the start of every iteration of the loop. It is a heuristic choice. This can be extended with the idea of alias analysis if needed.The synthesizer works as follow
_a. return the candidate if there is no violation
_b. synthesize a new clause to resolve the first violation and add it to the correct map,