Skip to content

[v2] Cache type derived data#1816

Open
ggiraldez wants to merge 2 commits into
ggiraldez/v2-cache-linearisationsfrom
ggiraldez/v2-cache-type-data
Open

[v2] Cache type derived data#1816
ggiraldez wants to merge 2 commits into
ggiraldez/v2-cache-linearisationsfrom
ggiraldez/v2-cache-type-data

Conversation

@ggiraldez

Copy link
Copy Markdown
Contributor

Builds on top of #1805

Cache all types's internal and canonical names, as well as size in storage.

This info (the internal name in particular) will be needed for diagnostics related to typing, but we can't fully compute the type data until all types have been instantiated at the end of p4. I see three alternatives:

  1. do typing diagnostics after p4, and after computing the type data
  2. pre-populate the type data cache on demand on p4 so that diagnostics can use it
  3. duplicate functions to compute type internal names without caching so they can be used for diagnostics only

@ggiraldez ggiraldez requested review from OmarTawfik and teofr June 1, 2026 18:53
@ggiraldez ggiraldez requested review from a team as code owners June 1, 2026 18:53
@changeset-bot

changeset-bot Bot commented Jun 1, 2026

Copy link
Copy Markdown

⚠️ No Changeset found

Latest commit: 7a92137

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

This PR includes no changesets

When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types

Click here to learn what changesets are, and how to add one.

Click here if you're a maintainer who wants to add a changeset to this PR

@@ -0,0 +1,388 @@
use std::collections::HashMap;

@OmarTawfik OmarTawfik Jun 4, 2026

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

duplicate functions to compute type internal names without caching so they can be used for diagnostics only

IIUC, this cache is only useful for diagnostics, and it will never be used in the happy path scenario. So, we would be paying the cost of populating it all the time, but rarely actually using it.

Given that our priority is the happy path perf, I wonder if we should hold off on such optimizations for now, until we get real-world perf numbers (e.g. IDE scenarios) to prove it is worth paying the cost upfront.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is also used for ABI related information you can get from the AST, and solx is consuming some of it (eg. here and here). However, I agree having this cache is probably not helping performance in most cases (it might if the user code has lots of nested structs or some other edge cases).

I do like the idea of having the computation functions moved and centralized in the semantic crate instead of having some of them there and the rest in ast. For diagnostics we'll probably only need the internal name of types, but agreed that they will not be in the happy path.

Happy to close this (and #1817), or change them to only move the functionality but not construct the caches eagerly.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm fine with the refactoring if you think it is worth doing.
My concern is mainly about adding the cache without having the numbers to verify its impact.

@ggiraldez ggiraldez force-pushed the ggiraldez/v2-cache-linearisations branch 2 times, most recently from 1d8cf1c to 4a41621 Compare June 8, 2026 21:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants