Skip to content

Make the LSP independent of level assignment. #186

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 2 commits into
base: updated_enabled_cdot
Choose a base branch
from

Conversation

kape1395
Copy link
Collaborator

The fingerprint calculation reverted to is_const (as it is now done in the main branch) instead of get_level to make the fingerprint calculation independent of the level assignment. This is done to avoid having level assignments on the critical path in the LSP, as it is too slow. In my case, it took 30 seconds to calculate the levels, which blocks subsequent LSP commands.

@ahelwer
Copy link
Collaborator

ahelwer commented Dec 24, 2024

The TLAPM parser has a level-checker?

@kape1395
Copy link
Collaborator Author

It has, but I'm unsure how complete it is. See e.g. https://github.com/tlaplus/tlapm/blob/d8e2a2135063fbb0bd6a4fb3af7bba46951059c8/src/expr/e_levels.ml

@ahelwer
Copy link
Collaborator

ahelwer commented Dec 24, 2024

Would you leave your thoughts on tlaplus/rfcs#16 since you know much more about the TLAPM parser than anyone else?

@kape1395
Copy link
Collaborator Author

Thanks for starting that discussion. I will try to join it, but my knowledge of that parser is limited.

The fingerprint calculation reverted to is_const (as it is now done in the main branch) instead of get_level to make the fingerprint calculation independent of the level assignment. This is done to avoid having level assignments on the critical path in the LSP, as it is too slow. In my case, it took 30 seconds to calculate the levels, which blocks subsequent LSP commands.

Signed-off-by: Karolis Petrauskas <[email protected]>
@kape1395 kape1395 force-pushed the updated_enabled_cdot-fix-lsp-perf branch from d8e2a21 to 0ae62a0 Compare December 30, 2024 13:09
Copy link
Contributor

@damiendoligez damiendoligez left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's merge this fix, I think it is the last thing holding up #148. We will revert this change when we figure out what's wrong with level inference.

@kape1395
Copy link
Collaborator Author

Thanks @damiendoligez. I will spend another few evenings on the level checking performance, and if nothing good happens, I will merge this PR.

@ahelwer
Copy link
Collaborator

ahelwer commented May 12, 2025

@kape1395 worth noting the TLA+ Foundation is funding me to switch TLAPM over to the Java parser (SANY) in the near future, which will take care of these level-checking performance issues from the perspective of TLAPM. However, it makes sense to spin out the TLAPM parser into its own repo (it's still quite a good parser and worth keeping for its own sake!) so you can have the choice of continuing to use the TLAPM parser or using SANY. The work will involve translating SANY's parse tree into the parse tree format used by TLAPM so these changes should be transparent to you.

@lemmy
Copy link
Member

lemmy commented May 12, 2025

However, it makes sense to spin out the TLAPM parser into its own repo (it's still quite a good parser and worth keeping for its own sake!) so you can have the choice of continuing to use the TLAPM parser or using SANY. The work will involve translating SANY's parse tree into the parse tree format used by TLAPM so these changes should be transparent to you.

Why is it important to preserve the interface between TLAPM and its legacy parser? What is it buying us?

@ahelwer
Copy link
Collaborator

ahelwer commented May 12, 2025

All the transformations to the backend provers are defined over the existing parse tree format, is my current understanding.

@lemmy
Copy link
Member

lemmy commented May 12, 2025

Understood - there’s no issue with refactoring the parse tree format, but it will require refactoring some of the transformations. To minimize the risk of regressions, such refactoring should be avoided unless justified.

@kape1395
Copy link
Collaborator Author

However, it makes sense to spin out the TLAPM parser into its own repo.

I don't think we need to keep the existing tlapm parser if the new one is integrated and works well.

Where can I put my wishes (would become requirements, if agreed) for the parser integration? Preliminary:

  • Should we add protobuf and/or grpc interface for the parser instead of parsing the XML? For performance reasons. LSP uses JSON-RPC, maybe we should follow that standard?
  • I would like to pass some checksums/modification times for the modules already parsed by the tlapm. This way, it could avoid repeatedly parsing them (the dependencies of a module currently edited). A stateful interaction is probably needed to handle that properly (that would imply grpc or similar, I guess).
  • I would like to have a way to parse a document not saved to the disk, e.g., from the stdin or grpc message payload. This is currently used by the tlapm lsp.

Also, it is unclear to me what happens to the expression levels when definitions in an expression are expanded. Don't we need to recalculate the levels? Does the SANY parser handle that?

@ahelwer
Copy link
Collaborator

ahelwer commented May 13, 2025

Also, it is unclear to me what happens to the expression levels when definitions in an expression are expanded. Don't we need to recalculate the levels? Does the SANY parser handle that?

Do you mean operators? Basically how SANY handles that is if you have an operator like:

prime(x) == x'

then it puts a constraint on the parameter x so that the maximum level of expression you can supply for it is a variable-level expression. Giving an action-level expression for x (which would result in a double-prime) results in error 4205 "operator level constraints exceeded".

There's an even more complicated analysis that needs to be done for higher-order operators, as in operators accepting other operators as parameters. Error 4272 "higher order operator parameter level constraint not met" involves the level checker defining minimum levels for parameters. Here's an input that would trigger it:

VARIABLE v
op(F(_)) == F(v')
op2 == op(')

This one is a bit confusing. Basically when higher-order operators are used, the level-checker needs to determine the max/min levels of arguments they can accept. It does this by looking at how the higher-order operators are used. In this case, F(_) is used with an action-level argument of F(v'). This means any operator provided in place of F when calling op must be able to accept an action-level argument. While we are used to upper bounds when dealing with level-checking, this is a lower bound on an upper bound, which is a bit odd. So when op(') is called, since the ' operator cannot accept an action-level argument, it fails to meet F's lower bound on the argument's upper bound, thus producing a level-checking error. This correctly prevents what would otherwise allow us to double-prime an expression.

You can see a formal spec of the level checking process here.

SANY also runs a complementary analysis trying to ensure formulas are stuttering-insensitive. This is incomplete, as documented in bug tlaplus/tlaplus#1160

@lemmy
Copy link
Member

lemmy commented May 13, 2025

There's an even more complicated analysis that needs to be done for higher-order operators

Nit: TLA+ is a first-order logic. It is not higher-order, where operators can be treated as values. You may pass an operator to another operator, but you e.g. cannot quantify over operators.

@ahelwer
Copy link
Collaborator

ahelwer commented May 13, 2025

@lemmy good point. What do you think is a good term for the parameter of these not-higher-order-ops? I've seen "operator parameter parameter" a few times in the codebase which I don't really like.

@lemmy
Copy link
Member

lemmy commented May 13, 2025

@lemmy good point. What do you think is a good term for the parameter of these not-higher-order-ops? I've seen "operator parameter parameter" a few times in the codebase which I don't really like.

I'm not aware a dedicated terminology.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

4 participants