Skip to content

Conversation

@MikaelMayer
Copy link
Collaborator

This PR is the companion of dafny-lang/dafny#3324
We established numerous times this year that the wording "might not hold" still incorrectly provides the feeling that the postcondition is incorrect, when the real problem is often that some proof hints are missings, which is rather what we want our users to focus on.
So, rather than focusing on whether assertions hold or not, this PR fixes the message by focusing on the fact that the verifier was not able to prove the assertion. Proof now becomes a first-class goal.

There was the choice of starting all errors by "Could not prove", but then the first 3 words of every error message are never informative, which is not very friendly. The passive voice solves this problem and is less of a disruption in the errors messages.

This PR is the companion of dafny-lang/dafny#3324
We established numerous times this year that the wording "might not hold" still incorrectly provides the feeling that the postcondition is incorrect. Rather than focusing on whether assertions hold or not,
this PR fixes the status by focusing on the fact that the verifier was not able to prove the assertion.
public class AssertionDescription : ProofObligationDescription
{
public override string SuccessDescription => "This assertion holds.";
public override string SuccessDescription => "this assertion holds";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we are going after "proven" to stress out this is about proofs, then why not also "proven" instead of "holds"?
Like "Boogie proved this assertion".

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same goes for other such patterns.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a good point. Note that the success message is for now only displayed in rare occasions, in some toolings and while hovering Dafny programs in VSCode. Since there is no call to action, "holds" is the same as "was proven", but it's shorter.

Execution trace:
fakeFilename1(3,3): anon0
fakeFilename1(8,3): Error: This assertion might not hold.
fakeFilename1(8,3): Error: this assertion could not be proven
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am also wondering if active voice would be nicer in all of these. So something like: "Boogie could not prove this assertion". Personally, I would prefer such active voice to current passive voice.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Me too, but after looking at the error messages, the first 4 words of "Boogie could not prove " are useless to know what the error is about, whereas by keeping this assertion as the subject, it's much clearer what's in the focus there.

@atomb
Copy link
Collaborator

atomb commented Jan 6, 2023

This looks good overall, but it looks like the tests still need to be updated to account for the output changes.

@rakamaric
Copy link
Contributor

As a side note, this will be a breaking change for (some) downstream tools. For example, SMACK relies on these messages to figure out the verification status. Not that we should not make this update, but just wanted to mention this.

@MikaelMayer
Copy link
Collaborator Author

As a side note, this will be a breaking change for (some) downstream tools. For example, SMACK relies on these messages to figure out the verification status. Not that we should not make this update, but just wanted to mention this.

Indeed, this is also a breaking change for Dafny. The only reason I'm pushing for this is that last year we went from "assertion violation" to "assertion might not hold", which was already a plus (and we did not get push back), yet, new users are still puzzled and first try to figure out why their assertion is wrong when all they need is figure out how to actually prove it.
So I think this change is going to be extremely useful to avoid this problem.

@shazqadeer
Copy link
Contributor

@MikaelMayer : I don't quite see how changing the words from "does not hold" to "could not prove" or other such variations will actually help a Boogie or Dafny user decide on the next step. I do agree that "could not prove" is more accurate so I do not object to making this change.

@bkragl
Copy link
Member

bkragl commented Jan 10, 2023

Bikeshedding here, but I think that "proved" is better than "proven" (although both are considered correct). We could consult a particular style guide, but let's use metrics instead: one less syllable to pronounce and more Google hits :)

@MikaelMayer
Copy link
Collaborator Author

In the last commit, I'm fixing the last two CI errors. One is puzzling to me.
9eb5120
An error message is displayed 10 lines earlier on another assertion.
I don't understand how this is possible because I only changed error messages. I'm investigating though.

@MikaelMayer
Copy link
Collaborator Author

In the last commit, I'm fixing the last two CI errors. One is puzzling to me. 9eb5120 An error message is displayed 10 lines earlier on another assertion. I don't understand how this is possible because I only changed error messages. I'm investigating though.

Ok it looks like the test is a parallel test, so the error isn't one, it's more a stability one. It looks like it's deterministic though so I think changing the test output makes sense.

@MikaelMayer MikaelMayer requested a review from rakamaric January 10, 2023 22:29

public override string FailureDescription =>
"This loop invariant might not be maintained by the loop.";
"this loop invariant might not be maintained by the loop";
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This message still contains the "might not" phrasing, as opposed to "could not be proved". Unfortunately, using that wording makes it pretty awkward: "this loop invariant could not be proved to be maintained by the loop".

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about: "this invariant could not be proved within the loop"?
Or, a shorter version of yours, "this invariant could not be proved to be maintained by the loop"

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the current PR, I used the second one.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I like that one, too.

@atomb
Copy link
Collaborator

atomb commented Jan 11, 2023

It looks like there are two test files remaining to update.

Copy link
Collaborator

@atomb atomb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good!

@keyboardDrummer
Copy link
Collaborator

Looks good to me too but let's get a few more approvals on this

@MikaelMayer
Copy link
Collaborator Author

@MikaelMayer : I don't quite see how changing the words from "does not hold" to "could not prove" or other such variations will actually help a Boogie or Dafny user decide on the next step. I do agree that "could not prove" is more accurate so I do not object to making this change.

I'm not sure a lot about Boogie users, but I can tell you more about the Dafny user experience. As you outlined "could not prove" is more accurate, and that's already the first step.
What happened with new Dafny users is that, as soon as they encountered "might not hold", they immediately understood "is probably wrong". This has several possible causes:

  • Users without verification background usually don't know what "false negative" is.
  • In most compiler tools used by software engineers, if an error is reported, it is accepted that the user needs to fix something, that there is something intrinsically wrong with their code, they read the doc and the error, and they fix it. Their code was broken, now it can be compiled.
  • Even type annotations are perceived as a necessity in most languages that do not have good type inference, and error messages are clear "X is used as Y but its inferred type was Z.", so users know that they need to add type annotation and even fix their code, no big deal.

Proofs are very much like type annotations, except that the type-checking system is incomplete.
So imagine, if users were presented an error on a type annotation (expression: X) like this

"The provided type X of expression might be wrong"

The first idea users would have is that, since the compiler says so, : X is the wrong annotation and the type of the expression is different. But it wouldn't come to mind that "might" indicates that it's actually correct and it requires hints, because adding hints is not something users do for type checking besides obvious places.

So when we provide an error message like "might not hold", we face the same problem. As users aren't used to provide hints, they might erroneously adhere to the belief that the compiler is clever and just being nice and that something is indeed wrong.

With "could not prove", not only it is more accurate, but the subject of the "prove" is Dafny itself. It says nothing about the proposition being false, only that it could not find a proof. This is the implicit invitation we want Dafny users to receive.

Hope it helps ! Thanks for the review.

@shazqadeer
Copy link
Contributor

@MikaelMayer : I appreciate the detailed explanation. Thanks.

@shazqadeer shazqadeer merged commit 4166fbb into master Jan 14, 2023
@MikaelMayer MikaelMayer deleted the fix-3216-could-not-prove branch January 17, 2023 15:06
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants