Skip to content

Conversation

@muenchnerkindl
Copy link
Contributor

This PR upgrades the Isabelle backend to Isabelle2025 and also fixes all proofs in the standard library of theorems. This also required a subtle change to the SMT backend contributed by @rozlynd.

@muenchnerkindl muenchnerkindl requested a review from kape1395 June 16, 2025 15:13
@kape1395
Copy link
Collaborator

Several proofs fail, at least on my PC. Not sure if that's hardware or OS. I'm using Linux.

@muenchnerkindl
Copy link
Contributor Author

Hmm, how can we debug this? I tried again, and all these proofs are checked on my end (admittedly on a quite fast Mac, but they don't even take long). Does the PM use a bundled version of Z3 or the one found locally, which could introduce another variation point? I'll try to get access to a Linux machine and see what happens.

@ahelwer
Copy link
Collaborator

ahelwer commented Jun 17, 2025

Here's a result from my side, on Linux (old i7-4790k CPU from 2014):

  1. git fetch origin
  2. git checkout isabelle2025
  3. git clean -dfx (deletes all files not checked into git)
  4. make release
  5. ./_build/tlapm/bin/tlapm --cleanfp library/FunctionTheorems_proofs.tla -> success
  6. ./_build/tlapm/bin/tlapm --cleanfp --stretch 2 library/NaturalsInduction_proofs.tla -> failure (1/278 failed)
  7. ./_build/tlapm/bin/tlapm --cleanfp --stretch 2 library/SequencesExtTheorems_proofs.tla -> failure (2/606 failed)

@muenchnerkindl if you would like me to set up a Linux VM in Azure for you I have some credits so can do that. Just send me your public SSH key to put in the ~/.ssh/authorized_keys file.

@lemmy
Copy link
Member

lemmy commented Jun 17, 2025

Would it be possible to make the Github Action the source of truth?

@ahelwer
Copy link
Collaborator

ahelwer commented Jun 17, 2025

I actually don't think the library proofs are checked in the CI, so yeah we could do that. They also fail on the main branch build for me.

@muenchnerkindl
Copy link
Contributor Author

They also fail on the main branch build for me.

Yeah, that's why I (tried to) fix them. ;-)

@ahelwer
Copy link
Collaborator

ahelwer commented Jun 17, 2025

That makes sense! Okay, I'll add the library proofs to the CI and comment out the ones that are failing. Then this PR can un-comment those proofs, and we will see whether they are fixed from the semi-objective perspective of the github CI.

@ahelwer ahelwer mentioned this pull request Jun 17, 2025
@ahelwer
Copy link
Collaborator

ahelwer commented Jun 17, 2025

Okay if you rebase this PR on the latest changes in main, you can get the CI to check specific library proofs by deleting these lines:

-not -name "BagsTheorems_proofs.tla" \
-and -not -name "FiniteSetTheorems_proofs.tla" \
-and -not -name "FunctionTheorems_proofs.tla" \
-and -not -name "NaturalsInduction_proofs.tla" \
-and -not -name "SequenceTheorems_proofs.tla" \
-and -not -name "SequencesExtTheorems_proofs.tla" \
-and -not -name "WellFoundedInduction_proofs.tla" \

@muenchnerkindl
Copy link
Contributor Author

Thanks @ahelwer! So I commented out the line for FunctionTheorems_proofs.tla but I don't see the CI running. Do I have to start it manually for the PR?

@ahelwer
Copy link
Collaborator

ahelwer commented Jun 18, 2025

Ah, I wish github had better reporting for this but you have to check the repository Actions tab to see it:

invalid workflow file: .github/workflows/ci.yml#L63
You have an error in your yaml syntax on line 63

So I think you have to delete the line instead of commenting it out.

@muenchnerkindl
Copy link
Contributor Author

OK, GitHub confirms that the proofs work on MacOS and fail on Linux. I even tried to decompose the failing proof, but still the same result. I'll install a local Linux box and continue testing there.

@kape1395
Copy link
Collaborator

Two steps were failing on my PC in SequencesExtTheorems_proofs.tla. I made a PR #220 fixing them.

Copy link
Collaborator

@kape1395 kape1395 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@muenchnerkindl
Copy link
Contributor Author

Thanks again @kape1395! I am quite disgusted by the fact that the behavior of Z3 differs significantly between OSes. Perhaps we should seriously compare with how cvc5 behaves.

@ahelwer, @lemmy: do you think we should leave the CI as it is, i.e. running all library proofs? They take quite a while on the GitHub servers, and I do not know how problematic this is.

@ahelwer
Copy link
Collaborator

ahelwer commented Jun 21, 2025

The CI now takes 24 minutes instead of 18, which is not too bad. It's important for the library proofs to continue passing, I think? I will move the library checks later in the CI though so basic unit tests run first.

The Z3 version also has not been upgraded in a while, perhaps there are changes to its behavior (for better or worse here, who can say).

@muenchnerkindl muenchnerkindl merged commit e26fb2b into main Jun 23, 2025
5 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Development

Successfully merging this pull request may close these issues.

6 participants