Skip to content

Conversation

@MikaelMayer
Copy link
Member

@MikaelMayer MikaelMayer commented Jan 4, 2023

This PR fixes #3216 by standardizing Dafny's error messages to use "could not be proved" instead of "might not hold".

Motivation

The current "might not hold" wording has several issues:

  • Places doubt on the programmer rather than acknowledging Dafny's limitations
  • Causes user frustration when assertions are actually valid but Dafny can't prove them
  • Inconsistent with existing terminology - we already use "could not prove" elsewhere
  • Requires explanation whereas "could not prove" is self-evident

The new wording shifts focus from uncertainty about assertions to Dafny's inability to prove them, emphasizing the need to help Dafny with additional proof hints rather than questioning the assertion's validity.

Breaking Change Notice

⚠️ This PR intentionally changes error message text and will break external tools that parse Dafny output, including:

  • IDEs and editors with hardcoded error message patterns
  • CI/CD scripts that search for specific error text
  • Testing frameworks that depend on exact error messages
  • Third-party tools that analyze Dafny verification results

We consider this breaking change necessary and acceptable because:

  • Significantly improves user experience - reduces frustration with Dafny
  • More accurately reflects Dafny's semantics - focuses on proof limitations rather than assertion validity
  • Aligns with Dafny's user-friendly goals - makes error messages less discouraging
  • Provides long-term benefit - better terminology for all future users

External tool maintainers should update their parsing patterns accordingly. See docs/dev/news/3216.fix for release note details.

Error Message Changes

Core Pattern: "might not hold" → "could not be proved"

BEFORE: assertion might not hold
AFTER:  assertion could not be proved

BEFORE: This postcondition might not hold on a return path
AFTER:  this postcondition could not be proved on a return path

BEFORE: This loop invariant might not hold on entry  
AFTER:  this loop invariant could not be proved on entry

BEFORE: A precondition for this call might not hold
AFTER:  a precondition for this call could not be proved

BEFORE: function precondition might not hold
AFTER:  function precondition could not be proved

Terminology Standardization: "proven" → "proved"

BEFORE: this precondition could not be proven
AFTER:  this precondition could not be proved

Additional Patterns Standardized

BEFORE: might not be equivalent
AFTER:  could not be proved to be equivalent

BEFORE: might be null
AFTER:  could not be proved to be non-null

BEFORE: might be negative  
AFTER:  could not be proved to be non-negative

BEFORE: might refer to the same location
AFTER:  might refer to the same location (clearer positive framing)

BEFORE: might be bigger than every natural number
AFTER:  could not be proved to be a natural number

Double Negative Elimination

BEFORE: could not be proved to not underflow
AFTER:  could not be proved to remain above limit ordinal

BEFORE: could not be proved to not overflow  
AFTER:  could not be proved to remain below maximum

BEFORE: could not be proved to not be too large
AFTER:  could not be proved to be sufficiently small

Capitalization and Punctuation Consistency

BEFORE: This assertion holds.
AFTER:  this assertion holds

BEFORE: All preconditions hold for this call.
AFTER:  all preconditions hold for this call

Expected Outcome

  • More user-friendly error messages that focus on Dafny's limitations rather than questioning user code
  • Consistent terminology throughout the codebase
  • Improved readability through elimination of double negatives
  • No functional changes to verification behavior

By submitting this pull request, I confirm that my contribution is made under the terms of the MIT license.

This PR fixes #3216
It completely replaces the wording ".... might not hold" by "could not prove..."
@MikaelMayer MikaelMayer requested a review from atomb January 4, 2023 22:13

public override string FailureDescription =>
customErrMsg ?? "assertion might not hold";
customErrMsg ?? "Could not prove assertion";
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
customErrMsg ?? "Could not prove assertion";
customErrMsg ?? "could not prove assertion";

The convention in the rest of the messages is for them to start with a lower-case letter.


public override string FailureDescription =>
customErrMsg ?? "function precondition might not hold";
customErrMsg ?? "Could not prove function precondition";
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
customErrMsg ?? "Could not prove function precondition";
customErrMsg ?? "could not prove function precondition";

The convention in the rest of the messages is for them to start with a lower-case letter.

### 25.6.1. Verification debugging when verification fails {#sec-verification-debugging}

Let's assume one assertion is failing ("assertion might not hold" or "postcondition might not hold"). What should you do next?
Let's assume one assertion is failing ("Could not prove assertion" or "Could not prove postcondition"). What should you do next?
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same case distinction here.

@@ -1,5 +1,5 @@
---
title: "Error: function precondition might not hold"
title: "Error: Cannot prove function precondition"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ditto.

Type parameter characteristics are discussed in [the reference manual](../DafnyRef/DafnyRef.html#sec-type-parameter-variance)

# "Error: function precondition might not hold"
# "Error: Cannot prove function precondition"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And here.


Bpl.Ensures ens = new Bpl.Ensures(ForceCheckToken.Unwrap(tok), free, condition, comment);
ens.Description = new PODesc.AssertStatement(errorMessage ?? "This is the postcondition that could not be proven.");
ens.Description = new PODesc.AssertStatement(errorMessage ?? "this is the postcondition that could not be proven.");
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you are going with the lower case style, then remove the ending periods as well

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is usually in a full sentence, though, of the form "Error: here is something in lowercase."

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, but the style in dafny elsewhere is predominantly to start with lower-case and not have a period, except when there are clearly multiple sentences. I'm not wedded to this style, and am open to a team discussion, but I'd like it to be consistent.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, good point. I'm fine with leaving out the periods.

@MikaelMayer MikaelMayer requested review from atomb and davidcok January 5, 2023 16:25
atomb
atomb previously approved these changes Jan 6, 2023
Copy link
Member

@atomb atomb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good!

MikaelMayer added a commit to boogie-org/boogie that referenced this pull request Jan 6, 2023
This PR is the companion of dafny-lang/dafny#3324
We established numerous times this year that the wording "might not hold" still incorrectly provides the feeling that the postcondition is incorrect. Rather than focusing on whether assertions hold or not,
this PR fixes the status by focusing on the fact that the verifier was not able to prove the assertion.
@MikaelMayer
Copy link
Member Author

Requires
boogie-org/boogie#676
to be merged

shazqadeer pushed a commit to boogie-org/boogie that referenced this pull request Jan 14, 2023
* Fix: Error wording closer to semantics
This PR is the companion of dafny-lang/dafny#3324
We established numerous times this year that the wording "might not hold" still incorrectly provides the feeling that the postcondition is incorrect. Rather than focusing on whether assertions hold or not,
this PR fixes the status by focusing on the fact that the verifier was not able to prove the assertion.

* Fixed tests

* proven => proved. Fixed tests

* Fixed one more en->ed

* Fixed a test

* Fixed all the boogie files with messages inline

* Fixed 3 more tests

* Fixed last 2 CI errors

* Support for the last two CI tests

* Wording for invariants as well
MikaelMayer and others added 11 commits March 3, 2023 15:02
- Added the trace of proof even in successes
- Replaced intermediate "Could not prove" and "Did prove" by "Inside " to indicate this is just a trace
- Use backticks to indicate quoted code
- Better error message instead of "error is impossible: This is the precondition that might not hold"
- First-class support of {:error} in hover messages.
# Conflicts:
#	Source/DafnyLanguageServer.Test/Lookup/HoverVerificationTest.cs
Found and fixed missed instances of 'proven' that should be 'proved'
for consistency with Boogie terminology:

- QuantificationNewSyntax.dfy: function precondition comment
- revealFunctions.dfy: ensures clause comment
- Updated both source and bin directory versions

This completes the terminology standardization effort.
MikaelMayer and others added 10 commits July 31, 2025 19:56
Updated documentation to use 'proved' instead of 'proven' to match
the standardized terminology used throughout the codebase and align
with Boogie's terminology.
The error message standardization is a breaking change that may affect
external tools parsing Dafny output. Updated the news entry to clearly
communicate this to users and tool maintainers.
- Fix capitalization: 'dafny' → 'Dafny' in UserGuide.md
- Update ERROR_SeqComp documentation to use consistent 'proved' terminology
- Restore accidentally modified symlinks to their original state
- Ensure documentation titles match actual error message content

These changes address out-of-scope modifications and terminology
inconsistencies flagged by the automated review process.
Empty log files should not be committed to version control as they
serve no purpose and clutter the repository. This file was accidentally
included and is unrelated to the error message standardization effort.
- Replace confusing double-negative logic with clearer positive framing
- Improve grammatical structure for better readability
- Maintain consistency with PR's standardization goals
- Keep original error message wording that Dafny actually produces
- Use proper Dafny terminology (fields, not elements)
- Update error message expectations to use 'could not be proved' wording
- Fix documentation typo: 'proven' -> 'proved'
- Align with PR #3324 error message standardization
Copy link
Member

@atomb atomb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Your review.sh script from #6318 was very helpful!

It pointed out one inconsistency in the error messages, exhibited by modified.dfy, which I commented on below.

It also pointed out some slightly out-of-scope changes related to command-line handling in docs/DafnyRef/UserGuide.md, but I think they all seem reasonable and I don't object to including them.

modifiable.dfy(42,6): Error: modified field could not be proved to be in the current modifies clause
Asserted expression: c == s[1]
modifiable.dfy(50,5): Error: assignment might update an array element not in the enclosing context's modifies clause
modifiable.dfy(50,5): Error: modified field could not be proved to be in the current modifies clause
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new message talks about a field, but the location being updated is an array element, as the old description correctly stated.

- Change 'modified field' to use the actual description ('an object' vs 'an array element')
- Update modifiable.dfy.expect to reflect correct error messages
- Addresses review comment about incorrect 'field' terminology for array elements
- Field assignments: 'modified field could not be proved...'
- Array assignments: 'modified array location could not be proved...'
- Fixed at construction sites to avoid switch statement repetition
- Updated modifiable.dfy.expect accordingly
atomb
atomb previously approved these changes Aug 27, 2025
- Array assignments now show 'modified array location'
- Field assignments continue to show 'modified field'
- Updated Array.dfy.expect, LoopModifies.dfy.expect, TraitUsingParentMembers.dfy.expect
- Array assignment a[1] := 3 should show 'modified array location'
- Field assignments in forall: 'field' -> 'modified field could not be proved...'
- Array assignments in forall: 'array location' -> 'modified array location could not be proved...'
- This fixes the remaining CI failures in integration test partitions 1 and 3
- Trigger CI to test final fixes
- 18/20 tests now passing (90% success rate)
- Only integration-tests partition 3 still failing
- This file doesn't belong in the repository
- Line 332 is 'x :| assume x == 10;' where x is a class field
- Should be 'modified field' not 'modified array location'
- This fixes the last failing integration test
@MikaelMayer MikaelMayer merged commit 660219c into master Aug 28, 2025
21 checks passed
@MikaelMayer MikaelMayer deleted the fix-3216-could-not-prove branch August 28, 2025 20:28
MikaelMayer added a commit that referenced this pull request Sep 2, 2025
Currently, reviewing long PRs, like PRs that changes multiple expect
files, is discouraging.
This PR automates the reviewing process by delegating some portion of
reviewing, i.e. low-level reviewing, to an agent such as Amazon Q or
Claude Code

### What was changed?
I added a script `./review.sh` that requires a PR number and
automatically performs a review of every change to see if they conform
to the PR description.
It clones the repository, lists all the files, and input maximum 500
lines of diff into a prompt that asks if the diff is consistent with
what we could have expected from the PR description.

Sample output while the script is running live.
```
=========================================
Current review concerns:
  (No issues found so far)

Progress:
  21/346 files reviewed so far
  Currently reviewing: docs/DafnyRef/UserGuide.5.expect
=========================================
```

### How has this been tested?

#### Exiting PR

I ran this script on #3324 to
ensure the PR description is consistent with the changes. It already
detected the following issues:
- The PR description omitted to mention that the error messages could
break integration with tools depending on the error messages
- Two files were added that shouldn't have been committed
- One document line was not up to date.
- Capitalization of Dafny was inconsistent in the doc.

#### Self-test: Positive case
I also tested this script on this PR itself. Here was the output:
```
========================================
Current review concerns:
  (No issues found so far)

Progress:
  0/1 files reviewed so far
  Currently reviewing: Scripts/review.sh
=========================================
```

```
=========================================
📊 FINAL CODE REVIEW SUMMARY
=========================================

Files processed: 1
Files analyzed: 1
Files skipped: 0
Files passed: 1
Files failed: 0

✅ All analyzed files passed code review!
✅ PR #6318 looks good to merge
🎉 Code review complete!
```

#### Self-test: Negative case:

When deleting a line in the Makefile, we immediately get meaningful
feedback
```
=========================================
Current review concerns:
  Makefile: Makefile: DIR variable removed but still referenced in test-dafny and tests-verbose targets, causing undefined variable errors

Progress:
  1/2 files reviewed so far
  Currently reviewing: Scripts/review.sh
=========================================
```

and then when the analysis is finished, we see
```
=========================================
📊 FINAL CODE REVIEW SUMMARY
=========================================

Files processed: 2
Files analyzed: 2
Files skipped: 0
Files passed: 1
Files failed: 1

❌ Issues found in 1 file(s):

  Makefile: Makefile: DIR variable removed but still referenced in test-dafny and tests-verbose targets, causing undefined variable errors

🧹 Cleaning up successful review files...
💡 Review output files for failed files are available in the current directory
❌ Consider addressing these concerns before merging
🎉 Code review complete!

⚠️  IMPORTANT: This AI review is a tool to assist human reviewers.
   You are ultimately responsible for the actual quality of the review.
   Just because the AI doesn't flag anything doesn't mean the PR is
   actually free of issues. Always apply your own judgment and expertise.
```

<small>By submitting this pull request, I confirm that my contribution
is made under the terms of the [MIT
license](https://github.com/dafny-lang/dafny/blob/master/LICENSE.txt).</small>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Improve assertion failure wording

3 participants