CONTRACTS: Add enumerative loop invariant synthesizer #7393
You have an extensive PR description, which is really nice. Would you mind also including that in the commit message, or at least an executive summary thereof?
This test shows that loop invariant with form of range predicates can be correctly
synthesized for programs with only pointer checks but no other assertions.
So what happens when there are additional assertions?
The synthesizer tries to resolve violations one at a time. If there are violations other than pointer checks and invariant checks, it throws an exception saying that the type of violation is unsupported. My plan is to focus on pointer checks in this PR and to create another PR that adds an enumerate-and-check mode, which keeps enumerating until all checks pass, for other types of violations.
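To make the "one violation at a time, reject unsupported kinds" behaviour concrete, here is a minimal C++ sketch. It is not the code in this PR: `violation_kindt`, `violationt`, and `synthesize_clause` are hypothetical names, clauses are plain strings rather than CBMC expressions, and the invariant-not-preserved case (which needs clause enumeration) is left out.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical classification of a failed check reported by the verifier.
// The invariant-not-preserved case is omitted from this sketch.
enum class violation_kindt
{
  out_of_bounds,
  null_pointer,
  other_assertion
};

// Hypothetical summary of one violation: its kind, the predicate that
// failed, and the name of the pointer involved (if any).
struct violationt
{
  violation_kindt kind;
  std::string violated_predicate;
  std::string pointer;
};

// Choose a new invariant clause (as text) for a single violation.
// Unsupported kinds are rejected with an exception.
std::string synthesize_clause(const violationt &v)
{
  switch(v.kind)
  {
  case violation_kindt::out_of_bounds:
    // Reuse the violated predicate itself as the new clause.
    return v.violated_predicate;
  case violation_kindt::null_pointer:
    // Heuristic: the pointer stays within the object it pointed to
    // on loop entry.
    return "__CPROVER_same_object(" + v.pointer +
           ", __CPROVER_loop_entry(" + v.pointer + "))";
  case violation_kindt::other_assertion:
    break;
  }
  throw std::invalid_argument("unsupported violation type");
}
```

Calling `synthesize_clause` on a user assertion raises `std::invalid_argument`, which mirrors the "unsupported" exception described above; a later enumerate-and-check mode would replace that branch.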
src/goto-programs/loop_ids.h
loop_idt() : function_id(""), loop_number(-1)
{
}

loop_idt(const loop_idt &other)
  : function_id(other.function_id), loop_number(other.loop_number)
{
}
Which piece of code needs these? At least the first one looks a bit dangerous and I wonder whether we might instead need to go for a bigger change where `loop_number` is made `optionalt<unsigned int>`.
When computing the cause loop, we want a value indicating that no cause loop was found, which means that the violation has nothing to do with loop invariants. Synthesizing loop invariants doesn't help in that case.
Also, `loop_number` is currently a member variable of every instruction, but it makes no sense for an instruction outside a loop to have a loop number. So I agree that changing `loop_number` from `unsigned int` to `optionalt<unsigned int>` is a good idea.
I will make the change for `loop_idt` in this PR, and then change the other `loop_number` (the one in `goto_programt::instructiont`) in another PR. What do you think?
Do return an `optionalt<loop_idt>` when looking for a cause.
Please don't change `goto_programt::instructiont`. It's deliberate that the `loop_number` in the instruction is undefined unless the instruction forms a loop.
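A minimal sketch of that suggestion, under stated assumptions: `std::optional` stands in for CBMC's `optionalt`, `loop_idt` is a simplified stand-in without a default constructor, and `loop_infot` and `find_cause_loop` are hypothetical names. The containment test is a toy criterion for "cause loop", not the analysis in the PR.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <vector>

// Simplified stand-in for loop_idt. There is no default constructor and no
// sentinel value such as loop_number == -1: a loop_idt always names a loop.
struct loop_idt
{
  std::string function_id;
  unsigned int loop_number;
};

// Hypothetical record of one loop: its id and the instruction range
// [first_location, last_location] covered by the loop.
struct loop_infot
{
  loop_idt id;
  unsigned int first_location;
  unsigned int last_location;
};

// Returns the first loop whose range contains the violation, or
// std::nullopt when the violation is unrelated to any loop.
std::optional<loop_idt> find_cause_loop(
  const std::vector<loop_infot> &loops,
  unsigned int violation_location)
{
  for(const auto &loop : loops)
  {
    if(
      loop.first_location <= violation_location &&
      violation_location <= loop.last_location)
    {
      return loop.id;
    }
  }
  return std::nullopt;
}
```

With this shape, "no cause loop" is expressed in the return type, so callers have to handle it explicitly and `loop_idt` needs neither a default constructor nor a magic loop number.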
My apologies, my comments are a bit all over the place and very surface-level. My biggest concern is that we're building what could (should?) be a free-standing tool into `goto-instrument`. I really think that this deserves to be a tool of its own. Maybe you really want to copy #6526, which should be the bare bones for a new tool.
It sounds great to me. I will create another PR to initialize the new tool.
Implement the functionality described below.

Motivation
---
This loop invariant synthesizer uses the idea of counterexample-guided inductive synthesis (CEGIS) to synthesize loop invariants for programs whose only checks are those instrumented by `goto-instrument` with the flag `--pointer-check`. This PR contains the driver of the synthesizer and the verifier we use to check invariant candidates.

Verifier
---
The verifier takes as input a goto program with pointer checks and a map from loop ids to loop invariant candidates. It first annotates the goto model with the loop invariant candidates and applies them, and then simulates the CBMC API to verify the instrumented goto program. If there are violations (the loop invariants are not inductive, or some pointer checks fail), it records valuations from the trace generated by the back end to construct a formatted counterexample `cext`.

Counterexample
---
A counterexample `cext` records valuations of variables in the trace, along with other information about the violation. The valuations we record include
1. the set of live variables upon entry to the loop;
2. the havoced values of all primitive-typed variables;
3. the havoced offsets and object sizes of all pointer-typed variables;
4. the history values of 2 and 3.

The valuations will be used as true positives (history values) and true negatives (havoced valuations) to filter out bad invariant clauses, following the idea of the Daikon invariant detector, in a follow-up PR. In this PR we only construct the valuations and do not yet use them.

Synthesizer
---
Loop invariants we synthesize are of the form
`` (in_clause || !guard) && (!guard -> pos_clause)``
where `in_clause` and `pos_clause` are predicates we store in two different maps, and `guard` is the loop guard. The idea is that we separately synthesize the condition `in_clause`, which should hold before each execution of the loop body, and the condition `pos_clause`, which should hold as the post-condition of the loop. When the violation happens in the loop, we update `in_clause`. When the violation happens after the loop, we update `pos_clause`. When the invariant candidate is not inductive, we enumerate strengthening clauses to make it inductive.

To be more efficient, we choose a different synthesis strategy for each type of violation:
* For out-of-bounds violations, we use the violated predicate as the new clause. This is the weakest liberal precondition (WLP) of the violation if the violation depends only on the havocing instruction. **TODO**: to make this more complete, we need to implement a WLP algorithm.
* For null-pointer violations, we choose `__CPROVER_same_object(ptr, __CPROVER_loop_entry(ptr))` as the new clause. That is, at the start of every iteration of the loop, the havoced pointer should point to the same object as it did on loop entry. This is a heuristic choice and can be extended with ideas from alias analysis if needed.
* For invariant-not-preserved violations, we enumerate strengthening clauses and check whether the invariant becomes inductive after strengthening (conjunction with the new clause).

The synthesizer works as follows:
1. initialize the two invariant maps;
2. verify the current candidates built from the two maps;
   a. return the candidates if there is no violation;
   b. otherwise, synthesize a new clause to resolve the **first** violation and add it to the correct map;
3. repeat step 2.

The flag `synthesize-loop-invariants` will also apply the synthesized loop contracts.
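The verify-then-refine loop and the invariant shape described above can be sketched on a toy program. Everything in this block is an assumption made for illustration: the state is a single integer counter, predicates are `std::function` objects rather than CBMC expressions, the "verifier" brute-forces a small range of havoced values instead of calling a solver, it skips the base-case check, and enumeration of strengthening clauses is left out. None of the names are taken from the PR.

```cpp
#include <cassert>
#include <functional>
#include <optional>
#include <stdexcept>
#include <vector>

using statet = int; // toy state: the value of the loop counter i
using predicatet = std::function<bool(statet)>;

// Candidate invariant, kept as two lists of conjoined clauses.
struct candidatet
{
  std::vector<predicatet> in_clauses;
  std::vector<predicatet> pos_clauses;
};

enum class locationt { in_loop, after_loop, not_preserved };

struct violationt
{
  locationt location;
  predicatet violated_predicate;
};

bool all_hold(const std::vector<predicatet> &clauses, statet s)
{
  for(const auto &clause : clauses)
    if(!clause(s))
      return false;
  return true;
}

// Evaluates (in_clause || !guard) && (!guard -> pos_clause) in state s.
bool invariant_holds(const candidatet &c, const predicatet &guard, statet s)
{
  const bool g = guard(s);
  return (all_hold(c.in_clauses, s) || !g) &&
         (g || all_hold(c.pos_clauses, s));
}

// Toy program: while(i < 10) { check(0 <= i && i < 10); i++; } check(i <= 10);
const predicatet loop_guard = [](statet i) { return i < 10; };
const predicatet body_check = [](statet i) { return 0 <= i && i < 10; };
const predicatet post_check = [](statet i) { return i <= 10; };

// Toy verifier: havoc i over a small range and report the first violation.
std::optional<violationt> verify(const candidatet &c)
{
  for(statet i = -5; i <= 15; ++i)
  {
    if(!invariant_holds(c, loop_guard, i))
      continue; // havoced state already excluded by the candidate
    if(loop_guard(i))
    {
      if(!body_check(i))
        return violationt{locationt::in_loop, body_check};
      if(!invariant_holds(c, loop_guard, i + 1))
        return violationt{locationt::not_preserved, nullptr};
    }
    else if(!post_check(i))
      return violationt{locationt::after_loop, post_check};
  }
  return std::nullopt;
}

candidatet synthesize()
{
  candidatet c; // step 1: both clause lists start empty (invariant "true")
  while(std::optional<violationt> v = verify(c)) // step 2: verify
  {
    switch(v->location) // step 2b: resolve the first violation only
    {
    case locationt::in_loop:
      c.in_clauses.push_back(v->violated_predicate);
      break;
    case locationt::after_loop:
      c.pos_clauses.push_back(v->violated_predicate);
      break;
    case locationt::not_preserved:
      throw std::runtime_error("clause enumeration is not sketched here");
    }
  }
  return c; // step 2a: no violation left
}
```

On this toy program the loop first adds the body check to `in_clauses` and then the post-loop check to `pos_clauses`, after which the verifier finds no violation. Using the violated predicate as the new clause corresponds to the out-of-bounds strategy above; a real implementation would dispatch on the violation type instead.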
Codecov Report
Base: 78.39% // Head: 78.41% // Increases project coverage by +0.01%
Additional details and impacted files
@@ Coverage Diff @@
## develop #7393 +/- ##
===========================================
+ Coverage 78.39% 78.41% +0.01%
===========================================
Files 1655 1657 +2
Lines 190281 190766 +485
===========================================
+ Hits 149172 149590 +418
- Misses 41109 41176 +67
Now that this has happened: can this PR directly go in the new tool? It will avoid adding a whole bunch of dependencies into goto-instrument.
Agreed. I created two PRs: