-
Notifications
You must be signed in to change notification settings - Fork 578
feat: reusable rpc refs #8105
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
feat: reusable rpc refs #8105
Conversation
Mathlib CI status (docs):
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The approach makes sense, but it feels very complicated for what it does. I am wondering about alternatives.
- First of all, I am assuming that there is (still) no way for the server to only send each diagnostic once, when its snapshot first finishes computing? Maybe because the LSP client will clear all other diagnostics upon receiving
publishDiagnostics
, so the Lean server has to keep resending them? - What if instead of implementing this server-side, we tagged each
Lean.Lsp.Diagnostic
with a unique ID, and have the infoview only callgetInteractiveDiagnostics
on new diagnostics that it receives, caching the results for diagnostic IDs it already knows? It seems that would decrease the amount of data transferred between client and server, and also absolve the server from having to cache anything. - Supposing the above questions have definite 'no' answers, I have some concerns about the implementation
WithRpcRef
was intended as a type marker used in client code that derivesRpcEncodable
. There is by now a good amount of widgets that useWithRpcRef.mk
, andprivate
izing it would break that code. Atm I can't think of a usecase forWithRpcRef.mkReusable
outside of this issue that's very specific to diagnostics; I guess widget messages could also usemkReusable
in order to maintain their state while new diagnostics stream in? But this requires changing their implementation, while something like the infoview change I suggest above may get away without necessitating that?- The global
freshWithRpcRefId
tracker, combined with the fact that we now need two address spaces - one for IDs and one forRpcRef
s - seems quite hacky. CouldmkReusable
live inStateM RpcObjectStore
and just bumpnextRef
?
Thanks for the review! :-)
Yes, exactly.
This would help with the trace node issue, but it won't help any other users of the RPC protocol. FWIW, I think that it would also be a bit more complicated than simply only calling
If they do really just use
Pretty much any user of the RPC protocol that wants their state to persist for a bit longer could benefit from this, e.g. when moving the cursor but the context hasn't changed or as new diagnostics are still trickling in. IMO fixing this kind of issue for widgets as well at some point in the future is definitely desirable, even if not quite as urgent.
I think it could, but it would complicate client code even more, no? RPC procedures would then need access to the session ID so that they can obtain the RPC session and its |
Oh I forgot that
It's not so strange if the spec for RPC functions is 'allocate stuff and return a pointer'. Then it's the client's responsibility to cache.
Yeah sorry, I meant they use anon constructors.
Yeah this is exactly the problem; for diagnostics the server can cache these |
I think we could, yes.
How would this work for widgets? Say I move the cursor around and the widget shouldn't have to re-render while this is happening.
We could add support for this at a later point. I think a basic implementation could be as simple as extending the cache in the snapshot to include arbitrary |
To summarize the discussion so far, my understanding is that:
Assuming the above is correct, I am not opposed to this change since you make some good points, but I think it might make sense to hold off on it before a concrete example of a widget that would benefit from this comes up, and in the meantime just fix this in |
Just to summarize what the client-sided fix would look like:
To be honest, all of this doesn't really seem less painful than the changes in this PR. It's also worth pointing out that NVim has the same problem as VS Code here, and would also have to make all of these adjustments. Generally, I think I'd prefer to merge this PR sooner, rather than later. It will only get more difficult to make breaking changes in the RPC protocol as time goes on, and I think our conversation here has established that there are cases where the client can't possibly tell whether its information is stale or not and it must ask the server, so we might need to make this change eventually regardless. |
Making this change in some clients wouldn't make other clients any more broken than they already are, right?
I think you are right about both of these, but if the two solutions really are roughly equivalent in complexity, I would still advocate for the mostly-client-based one. It modifies only implementation details; the present solution changes the APIs that Lean users work with without being demonstrably more useful for those users: we do not have any examples of widgets that would make use of reusable RPC refs. I would feel more positive about this if there were such an example. What I am specifically concerned about is that we make this breaking change in order to fix the diagnostics resetting, then realize that it is not actually useful for other widgets, and have to make more breaking changes or hack around the design. |
The surface area of the breaking change in this PR is very small. All it does is make a constructor It's perhaps also worth pointing out again that the alternative is not actually a client-only fix. We need to adjust the protocol of
It's not difficult to imagine how most widgets with any form of UI state might benefit from them: They allow widgets to retain state when the text cursor is moved in the scope where the same widget is triggered. In general, I feel that this discussion about using diagnostic IDs vs. using RPC refs is a bit of a red herring: There's good (orthogonal) reasons to have both, because clients generally can't decide whether to reuse state on their own, except in special cases.
To me, the main concern for the alternative proposal is that clients would have to implement several heuristics to request data from the language server less often so that their behavior is more functionally correct. That just doesn't seem like a great API to me - for RPC, the server should have some tool available to tell clients that they may reuse state. |
Ok, let's zoom out and analyze what is going on here. Why do traces collapse when the cursor moves? Because the infoview makes a new request, the server replies with a new RPC reference, and the new reference causes the UI to reset. But why does the UI reset upon receiving a new reference? Because the entire infoview codebase assumes that object identity (in Lean, With the above in mind, what we seem to have disagreed on so far is whether it should be the JS client's or the Lean server's obligation to ensure that object identity for Lean objects implies RPC reference identity (the converse is true because we always use fresh reference IDs). When put like this, it is obvious that the Lean side should do this, as only it knows the underlying Lean objects. (So I agree with you 🙂) While I think we agree on the above, I had concerns about the implementation. I am now starting to think that what's at fault is the decision to store the RPC pointer in Let's go back to the identity implications. Is it ever useful for JS to receive different RPC references to the same Lean object across multiple RPC calls? I claim the answer is no (at least, I cannot think of any reasonable code that would rely on this). So what we can do is make all RPC references reusable, but in the sense that their Lean pointer identity implies RPC reference identity. I think this could be done by adjusting This means the Lean code that produces (A fourth proposal would be for the server to cache references based on Lean object identity as determined by |
Here is a quick prototype. It seems to fix the trace issue, but I didn't test it very much so there may certainly be something very wrong. Since we now create a unique RPC ref for each Lean ptr, I cut out the middleman and just made the refs take on Lean pointer values (which was also how the very first RPC prototype worked, if you remember). I think the comment about 'avoiding any possible bugs resulting from RPC ref reuse' was confused; any such bug is a real bug in the client that should be fixed. Unfortunately it looks like the |
Perhaps I should have mentioned this in the PR header: I had more or less this exact design you propose at some point before submitting this PR for similar reasons and rejected it before arriving at the current approach. The main problems are that there is (AFAICT) no way to state in Lean code that two equal values should have different references (e.g. if CSE gets in the way) if you ever had to and that
I'm not sure about this, but I don't have any concrete counter-examples either. With the pointer-based approach, you could definitely even issue the same RPC reference twice in the same response by accident, which may or may not complicate things on the client-side (e.g. if the client-side were to assume that the UI state of two separate occurrences of the same RPC reference can be reused). I'd prefer to err on the side of caution here.
I don't think that the transparency is an advantage here. In general, when I use Lean, I don't ever need to worry about whether a value I created is referentially identical to another one or just structurally identical. I can cache the value, or I can recreate it with the same value - it won't make a functional difference - and I can always simply look at the definition of the value to figure out how to work with it. When using the pointer of a value to decide which RPC reference it should map to, this is no longer true: In order to reuse an RPC reference, I must now make sure that the value is memorized (not "cached", since it affects the functional behavior), and that I don't recreate the same value somewhere by accident. |
Oh no! I was not aware of this. FWIW these counterexamples shouldn't matter in practice - they are erased and don't affect runtime behaviour by definition - but..
..I see your point here. Reasoning about pointer equality is unavoidable in JS because that's how React compares deps, but at least your design makes the identification of references explicit and easy to manipulate on the Lean side.
They are not entirely orthogonal: they do different things, but overlap in solving #8053, which is what I understood this PR to be mostly aimed at. Having thought about it more, there are two designs I would be happy with. Below, I respond to some of your concerns about either.
I am still partial to just doing 1., but I would be happy with the proposed variant of 2 (or both, if 1. would be helpful for editor decorations and other things). Please let me know what you think. |
3d1277b
to
d130b10
Compare
Let's go with the second approach then - keeping the API simpler in return for breaking a bit more code now is a reasonable compromise. I definitely also agree that we may want to have both eventually (as well as additional heuristics in vscode-lean4 to request information less often) when it becomes more urgent. All of your other points are good as well. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks a lot for bearing with me and this long discussion! Just one remaining question and some docstring suggestions.
-/ | ||
refsById : PersistentHashMap USize Lsp.RpcRef := {} | ||
/-- | ||
Value to use for the next fresh `RpcRef`. It is monotonically increasing to avoid any possible |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Value to use for the next fresh `RpcRef`. It is monotonically increasing to avoid any possible | |
Value to use for the next fresh `RpcRef`, monotonically increasing. |
Per discussion, I think my comment was misguided.
mapped to by their RPC reference. | ||
/-- | ||
Marks values to be encoded as opaque references in RPC packets. | ||
Two `WithRpcRef`s with the same `id` will yield the same `RpcRef`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Two `WithRpcRef`s with the same `id` will yield the same `RpcRef`. | |
Two `WithRpcRef`s with the same `id` will yield the same client-side reference. |
Maybe we can treat RpcRef
as an impl detail and avoid reference in user-facing docs.
structure WithRpcRef (α : Type u) where | ||
private mk' :: | ||
val : α | ||
id : USize |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
id : USize | |
private id : USize |
What do you think about making the projection private as well? It feels like other modules should not be able to observe this.
Two `WithRpcRef`s with the same `id` will yield the same `RpcRef`. | ||
Hence, storing a `WithRpcRef` produced by `WithRpcRef.mk` and serving it to the client twice will | ||
also yield the same `RpcRef` twice, allowing clients to reuse their associated UI state across | ||
RPC requests. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Two `WithRpcRef`s with the same `id` will yield the same `RpcRef`. | |
Hence, storing a `WithRpcRef` produced by `WithRpcRef.mk` and serving it to the client twice will | |
also yield the same `RpcRef` twice, allowing clients to reuse their associated UI state across | |
RPC requests. | |
As long as the client holds at least one reference to this `WithRpcRef`, | |
serving it again will yield the same client-side reference. | |
Thus, when used as React deps, | |
client-side references can help preserve UI state across RPC requests. |
I tried to clarify that at least RC=1 on the server is necessary to actually serve the same RpcRef
, and suggested using as React dep explicitly.
This PR adds support for server-sided
RpcRef
reuse and fixes a bug where trace nodes in the InfoView would close while the file was still being processed.The core of the trace node issue is that the server always serves new RPC references in every single response to the client, which means that the client is forced to reset its UI state.
In a previous attempt at fixing this (#8056), the server would memorize the RPC-encoded JSON of interactive diagnostics (which includes RPC references) and serve it for as long as it could reuse the snapshot containing the diagnostics, so that RPC references are reused. The problem with this was that the client then had multiple finalizers registered for the same RPC reference (one for every reused RPC reference that was served), and once the first reference was garbage-collected, all other reused references would point into the void.
This PR takes a different approach to resolve the issue: The meaning of
$/lean/rpc/release
is relaxed from "Free the object pointed to by this RPC reference" to "Decrement the RPC reference count of the object pointed to by this RPC reference", and the server now maintains a reference count to track how often a givenRpcRef
was served. Only when every single served instance of theRpcRef
has been released, the object is freed. Additionally, the reuse mechanism is generalized from being only supported for interactive diagnostics, to being supported for any object usingWithRpcRef
. In order to make use of reusable RPC references, downstream users still need to memorize theWithRpcRef
instances accordingly.Closes #8053. This PR should only be merged at the start of the next release cycle.
Breaking changes
Since
WithRpcRef
is now capable of tracking its identity to decide whichWithRpcRef
usage constitutes a reuse, the constructor ofWithRpcRef
has been madeprivate
to discourage downstream users from creatingWithRpcRef
instances with manually-setid
s. Instead,WithRpcRef.mk
(which lives inBaseIO
) is now the preferred way to createWithRpcRef
instances.