-
Notifications
You must be signed in to change notification settings - Fork 285
Fix: Wording of assertion failure closer to semantics #3324
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
This PR fixes #3216 It completely replaces the wording ".... might not hold" by "could not prove..."
|
|
||
| public override string FailureDescription => | ||
| customErrMsg ?? "assertion might not hold"; | ||
| customErrMsg ?? "Could not prove assertion"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| customErrMsg ?? "Could not prove assertion"; | |
| customErrMsg ?? "could not prove assertion"; |
The convention in the rest of the messages is for them to start with a lower-case letter.
|
|
||
| public override string FailureDescription => | ||
| customErrMsg ?? "function precondition might not hold"; | ||
| customErrMsg ?? "Could not prove function precondition"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| customErrMsg ?? "Could not prove function precondition"; | |
| customErrMsg ?? "could not prove function precondition"; |
The convention in the rest of the messages is for them to start with a lower-case letter.
docs/DafnyRef/UserGuide.md
Outdated
| ### 25.6.1. Verification debugging when verification fails {#sec-verification-debugging} | ||
|
|
||
| Let's assume one assertion is failing ("assertion might not hold" or "postcondition might not hold"). What should you do next? | ||
| Let's assume one assertion is failing ("Could not prove assertion" or "Could not prove postcondition"). What should you do next? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same case distinction here.
docs/HowToFAQ/ERROR_SeqComp.md
Outdated
| @@ -1,5 +1,5 @@ | |||
| --- | |||
| title: "Error: function precondition might not hold" | |||
| title: "Error: Cannot prove function precondition" | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ditto.
docs/HowToFAQ/onepage.md
Outdated
| Type parameter characteristics are discussed in [the reference manual](../DafnyRef/DafnyRef.html#sec-type-parameter-variance) | ||
|
|
||
| # "Error: function precondition might not hold" | ||
| # "Error: Cannot prove function precondition" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And here.
|
|
||
| Bpl.Ensures ens = new Bpl.Ensures(ForceCheckToken.Unwrap(tok), free, condition, comment); | ||
| ens.Description = new PODesc.AssertStatement(errorMessage ?? "This is the postcondition that could not be proven."); | ||
| ens.Description = new PODesc.AssertStatement(errorMessage ?? "this is the postcondition that could not be proven."); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you are going with the lower case style, then remove the ending periods as well
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is usually in a full sentence, though, of the form "Error: here is something in lowercase."
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, but the style in dafny elsewhere is predominantly to start with lower-case and not have a period, except when there are clearly multiple sentences. I'm not wedded to this style, and am open to a team discussion, but I'd like it to be consistent.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, good point. I'm fine with leaving out the periods.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good!
This PR is the companion of dafny-lang/dafny#3324 We established numerous times this year that the wording "might not hold" still incorrectly provides the feeling that the postcondition is incorrect. Rather than focusing on whether assertions hold or not, this PR fixes the status by focusing on the fact that the verifier was not able to prove the assertion.
|
Requires |
* Fix: Error wording closer to semantics This PR is the companion of dafny-lang/dafny#3324 We established numerous times this year that the wording "might not hold" still incorrectly provides the feeling that the postcondition is incorrect. Rather than focusing on whether assertions hold or not, this PR fixes the status by focusing on the fact that the verifier was not able to prove the assertion. * Fixed tests * proven => proved. Fixed tests * Fixed one more en->ed * Fixed a test * Fixed all the boogie files with messages inline * Fixed 3 more tests * Fixed last 2 CI errors * Support for the last two CI tests * Wording for invariants as well
- Added the trace of proof even in successes
- Replaced intermediate "Could not prove" and "Did prove" by "Inside " to indicate this is just a trace
- Use backticks to indicate quoted code
- Better error message instead of "error is impossible: This is the precondition that might not hold"
- First-class support of {:error} in hover messages.
…obligation descriptions.
# Conflicts: # Source/DafnyLanguageServer.Test/Lookup/HoverVerificationTest.cs
…t text Fixed CI tests
Found and fixed missed instances of 'proven' that should be 'proved' for consistency with Boogie terminology: - QuantificationNewSyntax.dfy: function precondition comment - revealFunctions.dfy: ensures clause comment - Updated both source and bin directory versions This completes the terminology standardization effort.
Updated documentation to use 'proved' instead of 'proven' to match the standardized terminology used throughout the codebase and align with Boogie's terminology.
The error message standardization is a breaking change that may affect external tools parsing Dafny output. Updated the news entry to clearly communicate this to users and tool maintainers.
- Fix capitalization: 'dafny' → 'Dafny' in UserGuide.md - Update ERROR_SeqComp documentation to use consistent 'proved' terminology - Restore accidentally modified symlinks to their original state - Ensure documentation titles match actual error message content These changes address out-of-scope modifications and terminology inconsistencies flagged by the automated review process.
Empty log files should not be committed to version control as they serve no purpose and clutter the repository. This file was accidentally included and is unrelated to the error message standardization effort.
- Replace confusing double-negative logic with clearer positive framing - Improve grammatical structure for better readability - Maintain consistency with PR's standardization goals
- Keep original error message wording that Dafny actually produces - Use proper Dafny terminology (fields, not elements)
- Update error message expectations to use 'could not be proved' wording - Fix documentation typo: 'proven' -> 'proved' - Align with PR #3324 error message standardization
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Your review.sh script from #6318 was very helpful!
It pointed out one inconsistency in the error messages, exhibited by modified.dfy, which I commented on below.
It also pointed out some slightly out-of-scope changes related to command-line handling in docs/DafnyRef/UserGuide.md, but I think they all seem reasonable and I don't object to including them.
| modifiable.dfy(42,6): Error: modified field could not be proved to be in the current modifies clause | ||
| Asserted expression: c == s[1] | ||
| modifiable.dfy(50,5): Error: assignment might update an array element not in the enclosing context's modifies clause | ||
| modifiable.dfy(50,5): Error: modified field could not be proved to be in the current modifies clause |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The new message talks about a field, but the location being updated is an array element, as the old description correctly stated.
- Change 'modified field' to use the actual description ('an object' vs 'an array element')
- Update modifiable.dfy.expect to reflect correct error messages
- Addresses review comment about incorrect 'field' terminology for array elements
- Field assignments: 'modified field could not be proved...' - Array assignments: 'modified array location could not be proved...' - Fixed at construction sites to avoid switch statement repetition - Updated modifiable.dfy.expect accordingly
- Array assignments now show 'modified array location' - Field assignments continue to show 'modified field' - Updated Array.dfy.expect, LoopModifies.dfy.expect, TraitUsingParentMembers.dfy.expect
- Array assignment a[1] := 3 should show 'modified array location'
- Field assignments in forall: 'field' -> 'modified field could not be proved...' - Array assignments in forall: 'array location' -> 'modified array location could not be proved...' - This fixes the remaining CI failures in integration test partitions 1 and 3
- Trigger CI to test final fixes - 18/20 tests now passing (90% success rate) - Only integration-tests partition 3 still failing
- This file doesn't belong in the repository
- Line 332 is 'x :| assume x == 10;' where x is a class field - Should be 'modified field' not 'modified array location' - This fixes the last failing integration test
Currently, reviewing long PRs, like PRs that changes multiple expect files, is discouraging. This PR automates the reviewing process by delegating some portion of reviewing, i.e. low-level reviewing, to an agent such as Amazon Q or Claude Code ### What was changed? I added a script `./review.sh` that requires a PR number and automatically performs a review of every change to see if they conform to the PR description. It clones the repository, lists all the files, and input maximum 500 lines of diff into a prompt that asks if the diff is consistent with what we could have expected from the PR description. Sample output while the script is running live. ``` ========================================= Current review concerns: (No issues found so far) Progress: 21/346 files reviewed so far Currently reviewing: docs/DafnyRef/UserGuide.5.expect ========================================= ``` ### How has this been tested? #### Exiting PR I ran this script on #3324 to ensure the PR description is consistent with the changes. It already detected the following issues: - The PR description omitted to mention that the error messages could break integration with tools depending on the error messages - Two files were added that shouldn't have been committed - One document line was not up to date. - Capitalization of Dafny was inconsistent in the doc. #### Self-test: Positive case I also tested this script on this PR itself. Here was the output: ``` ======================================== Current review concerns: (No issues found so far) Progress: 0/1 files reviewed so far Currently reviewing: Scripts/review.sh ========================================= ``` ``` ========================================= 📊 FINAL CODE REVIEW SUMMARY ========================================= Files processed: 1 Files analyzed: 1 Files skipped: 0 Files passed: 1 Files failed: 0 ✅ All analyzed files passed code review! ✅ PR #6318 looks good to merge 🎉 Code review complete! ``` #### Self-test: Negative case: When deleting a line in the Makefile, we immediately get meaningful feedback ``` ========================================= Current review concerns: Makefile: Makefile: DIR variable removed but still referenced in test-dafny and tests-verbose targets, causing undefined variable errors Progress: 1/2 files reviewed so far Currently reviewing: Scripts/review.sh ========================================= ``` and then when the analysis is finished, we see ``` ========================================= 📊 FINAL CODE REVIEW SUMMARY ========================================= Files processed: 2 Files analyzed: 2 Files skipped: 0 Files passed: 1 Files failed: 1 ❌ Issues found in 1 file(s): Makefile: Makefile: DIR variable removed but still referenced in test-dafny and tests-verbose targets, causing undefined variable errors 🧹 Cleaning up successful review files... 💡 Review output files for failed files are available in the current directory ❌ Consider addressing these concerns before merging 🎉 Code review complete!⚠️ IMPORTANT: This AI review is a tool to assist human reviewers. You are ultimately responsible for the actual quality of the review. Just because the AI doesn't flag anything doesn't mean the PR is actually free of issues. Always apply your own judgment and expertise. ``` <small>By submitting this pull request, I confirm that my contribution is made under the terms of the [MIT license](https://github.com/dafny-lang/dafny/blob/master/LICENSE.txt).</small>
This PR fixes #3216 by standardizing Dafny's error messages to use "could not be proved" instead of "might not hold".
Motivation
The current "might not hold" wording has several issues:
The new wording shifts focus from uncertainty about assertions to Dafny's inability to prove them, emphasizing the need to help Dafny with additional proof hints rather than questioning the assertion's validity.
Breaking Change Notice
We consider this breaking change necessary and acceptable because:
External tool maintainers should update their parsing patterns accordingly. See
docs/dev/news/3216.fixfor release note details.Error Message Changes
Core Pattern: "might not hold" → "could not be proved"
Terminology Standardization: "proven" → "proved"
Additional Patterns Standardized
Double Negative Elimination
Capitalization and Punctuation Consistency
Expected Outcome
By submitting this pull request, I confirm that my contribution is made under the terms of the MIT license.