Collaborator

RustanLeino commented Sep 26, 2025

This PR fixes a soundness issue in automatically generated induction hypotheses.

Fixes #6366

Description of the problem and fix

The problem reported in #6366 was due to a missing antecedent in automatically generated induction hypotheses. For a lemma

lemma L(x: X)
  requires Pre(x)
  ensures Post(x)
  decreases D(x)

the induction hypothesis was previously generated as

forall y ::
  Pre#CanCall(y) &&
  (Is(y, X) &&
   Pre(y) &&
   D(y) < D(x)
   ==>
   Post(y))

This is incorrect, because it gives access to Pre#CanCall(y) even if Is(y, X) does not hold. It also gives access to Pre#CanCall(y) for any y, not just for those y that are smaller than x (that is, with D(y) < D(x)); that seems unintentional, but is not a soundness issue.

The proper way to formulate the induction hypothesis is

forall y ::
  Is(y, X) &&
  D(y) < D(x)
  ==>
    Pre#CanCall(y) &&
    (Pre(y) 
     ==>
     Post(y))

This PR makes this change.
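
To make the difference between the two formulations concrete, here is a minimal sketch on a tiny finite model. It is not how Dafny or Boogie work internally. The domain, the value names, the rank standing in for D, and the always-true Pre and Post are all assumptions made up for illustration. The sketch enumerates every interpretation of Pre#CanCall that satisfies a hypothesis and reports which instances are forced to hold.

```python
from itertools import chain, combinations

# Hypothetical finite model: each entry is (name, Is(y, X), D(y)).
# "n" stands for a value that shares the representation type but is not an X.
DOMAIN = [("a", True, 0), ("b", True, 1), ("c", True, 2), ("n", False, 0)]
X_RANK = 2  # D(x) for the lemma's own argument x (assumed)


def pre(name):
    return True  # assumption: Pre holds everywhere in this toy model


def post(name):
    return True  # assumption: Post holds everywhere in this toy model


def old_hypothesis(can_call):
    # forall y :: CanCall(y) && (Is(y, X) && Pre(y) && D(y) < D(x) ==> Post(y))
    return all(
        name in can_call
        and (not (is_x and pre(name) and rank < X_RANK) or post(name))
        for name, is_x, rank in DOMAIN
    )


def new_hypothesis(can_call):
    # forall y :: Is(y, X) && D(y) < D(x) ==> CanCall(y) && (Pre(y) ==> Post(y))
    return all(
        not (is_x and rank < X_RANK)
        or (name in can_call and (not pre(name) or post(name)))
        for name, is_x, rank in DOMAIN
    )


def forced_can_call(hypothesis):
    """Return the names whose CanCall holds in every interpretation satisfying the hypothesis."""
    names = [name for name, _, _ in DOMAIN]
    subsets = chain.from_iterable(
        combinations(names, k) for k in range(len(names) + 1)
    )
    models = [set(s) for s in subsets if hypothesis(set(s))]
    return set.intersection(*models)


print(sorted(forced_can_call(old_hypothesis)))  # ['a', 'b', 'c', 'n']
print(sorted(forced_can_call(new_hypothesis)))  # ['a', 'b']
```

In this toy model, the old shape forces Pre#CanCall for the non-X value "n" and for "c", which is not smaller than x. The new shape forces it only for the smaller values of type X.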

Effects of the previous bug

This bug was present in every induction hypothesis generated by auto-induction and so could have had an effect on every lemma. To exploit the bug, the following conditions would have had to hold:

  • the Is(y, X) predicate for type X distinguishes y from other values of the representation type in Boogie, and
  • the "can call" predicate generated from the lemma's precondition leads to conflicting conclusions for different types, and
  • the underlying solver finds the proof.

In the reported bug, all three apply.

  • The Dafny type X for which the induction hypothesis was intended is seq<SeqNat>. It has the same underlying Boogie type as the Dafny type seq<Nat>, since both are essentially a "sequence of Box" in the Boogie encoding. If such a sequence is nonempty, then its first element has type SeqNat or Nat, respectively. The various constructors of a datatype are represented as functions with disjoint ranges. The axiomatization is actually stricter; it says that the datatype constructors of all types have disjoint ranges, and thus a SeqNat value is known to be different from a Nat value.

  • The "can call" predicate for each of the two LessThan functions implies the Is(_, X) predicate for its first argument. More precisely, it says that the argument is a value produced by the SeqNat or Nat constructor, respectively. (This is included in the "can call" predicate as a "free fact" for every single-constructor datatype.) By the previous bullet, this distinguishes a SeqNat from a Nat.

  • The lemma precondition LessThan(ss, 30) && LessThan(ss, 239) is crucial, because its "can call" predicate looks like

    LessThan#CanCall(ss, 30) &&
    (LessThan(ss, 30) ==> LessThan#CanCall(ss, 239))

    and this is then used in the induction hypothesis. The occurrence of LessThan(ss, 30) is just in a disjunct, so it will not always be in the SMT solver's context. However, if the SMT solver happens to start its case split with the left disjunct, LessThan(ss, 30), then it will continue to instantiate quantifiers based on this term, even if ss is not of the intended type. Lemma learning in SMT solvers today ignores the matching patterns of quantifiers, and therefore the solver will learn the proof of false on the left branch and will then apply that learnt lemma on the right branch, too. In contrast, if the SMT solver happens to start with the right disjunct, then it will not be able to complete the proof, so an error will be reported. For this reason (which can be called "lack of confluence"), small changes to the repro can mask the soundness issue (i.e., can cause the SMT solver not to find the proof).
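
The way the first two conditions combine can be sketched on a small finite model. This is a simplification under stated assumptions, not the actual Boogie encoding. The three values, their constructor tags, and the D values are hypothetical. The Pre ==> Post part of the hypothesis is assumed to hold and is left out. The "free fact" is modeled as CanCall(y) ==> Is(y, X). The sketch asks whether any interpretation of the "can call" predicate satisfies both the free fact and a given induction hypothesis.

```python
from itertools import product

# Hypothetical values that share one representation type:
# (name, constructor tag, D(y)). The tags model disjoint constructor ranges.
VALUES = [("s0", "SeqNat", 0), ("s1", "SeqNat", 1), ("n0", "Nat", 0)]
D_X = 1  # D(x) for the lemma's own argument x (assumed)


def is_x(tag):
    # Models Is(y, X): only values built by the SeqNat constructor qualify.
    return tag == "SeqNat"


def free_fact(can_call):
    # Modeled free fact: CanCall(y) ==> Is(y, X)
    return all(not can_call[name] or is_x(tag) for name, tag, _ in VALUES)


def old_ih(can_call):
    # Old shape, with Pre ==> Post assumed true: CanCall(y) for every y.
    return all(can_call[name] for name, _, _ in VALUES)


def new_ih(can_call):
    # New shape: CanCall(y) only for y with Is(y, X) && D(y) < D(x).
    return all(can_call[name] for name, tag, d in VALUES if is_x(tag) and d < D_X)


def satisfiable(ih):
    """Brute-force search for a CanCall interpretation consistent with the free fact and ih."""
    names = [name for name, _, _ in VALUES]
    for bits in product([False, True], repeat=len(names)):
        can_call = dict(zip(names, bits))
        if free_fact(can_call) and ih(can_call):
            return True
    return False


print(satisfiable(old_ih))  # False
print(satisfiable(new_ih))  # True
```

In this model, the old hypothesis together with the free fact has no satisfying interpretation, which corresponds to a proof of false being available. The new hypothesis stays consistent. Whether a solver actually finds such a proof is the third condition, which this sketch does not model.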

By submitting this pull request, I confirm that my contribution is made under the terms of the MIT license.

MikaelMayer previously approved these changes Sep 26, 2025
requires xs.Cons? ==> !Below(xs.head, b)
ensures filter(g, Cons(b, append(xs, ys))) == filter(g, append(xs, Cons(b, ys)))
{
if key(g) == key(b) {
Member

Wait, does that mean these lemmas relied on the soundness issue?

Collaborator Author

This is the one test file that was affected by the PR. One of the lemmas in this file stopped verifying. When I was going to look into the details, I noticed that the test was also using --manual-trigger, which is an option that tries to support tests that were written before Dafny did automatic trigger generation. So, I removed this option. Then, the failing lemma started verifying again. But two other lemmas in this file stopped verifying, so I wrote proofs for those.

In conclusion, the PR did affect this test file, but only with --manual-trigger. And since all the lemmas verify now, we at least know that the soundness bug did not cause a proof of a lemma that's not true.

RustanLeino merged commit 6d95522 into dafny-lang:master Sep 26, 2025
33 of 34 checks passed
RustanLeino deleted the issue-6366 branch September 27, 2025 00:09

Development

Successfully merging this pull request may close these issues.

Sounds bug with aliased sequence of sequences
