[v2] Compute linearised members of all contracts in a new semantic pass by ggiraldez · Pull Request #1805 · NomicFoundation/slang

ggiraldez · 2026-05-28T22:51:18Z

This PR adds a new semantic pass p5_compute_linearisations to compute linearised collections of all contract members: functions, state variables, errors and events. This will be the ideal place to perform various validations: check for redefinition of identifiers, check virtual and override attributes, etc.

As a by-product, the information is collected and saved in the SemanticContext for later access from the AST API.

This PR adds some new TODO(validation) comments that will be addressed in a later PR.

⚠️ This breaks the API, so solx will need a migration PR.

changeset-bot · 2026-05-28T22:51:22Z

⚠️ No Changeset found

Latest commit: a798fad

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

This PR includes no changesets

When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types

teofr

Early pass with some comments. Should we add the ci:perf label?

teofr · 2026-06-03T15:03:52Z

+/// Walks the linearised bases in reverse (most-base first) and concatenates
+/// every contract's state-variable members in source order. Interfaces don't
+/// contribute state variables in Solidity.
+fn collect_linearised_state_variables(
+    binder: &Binder,
+    contract_id: NodeId,
+) -> Vec<ir::StateVariableDefinition> {


We'll probably need to validate earlier whether state variables have the same name, but a comment or maybe even a debug_assert could help.

I don't think we can make any guarantees at this point. Do you propose we skip the computation and return an empty vector if there are duplicates? This is a similar situation to #1806 (comment). Since we don't provide any guarantees when the user input is not valid through the type system, I don't see what else we should do here.

Maybe we should discuss blocking the AST if there are any diagnostics emitted?

I meant more of a note on the behaviour with repeated state variables: right now they're simply repeated in the result.
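To make the behaviour under discussion concrete, here is a minimal sketch of the collection step. It assumes a simplified, hypothetical `Contract` type holding state-variable names instead of the real `Binder` and `ir::StateVariableDefinition`, and `duplicate_names` is an illustrative helper, not part of this PR. The input is ordered most-derived first, as the doc comment in the diff implies.

```rust
use std::collections::HashSet;

/// Hypothetical, simplified stand-in for a contract or interface definition.
#[derive(Debug, Clone)]
pub struct Contract {
    pub name: String,
    pub is_interface: bool,
    /// State-variable names in source order.
    pub state_variables: Vec<String>,
}

/// `linearised` is ordered most-derived first, so it is walked in reverse
/// to visit the most-base contract first. Interfaces are skipped because
/// they don't contribute state variables.
pub fn collect_linearised_state_variables(linearised: &[Contract]) -> Vec<String> {
    linearised
        .iter()
        .rev()
        .filter(|contract| !contract.is_interface)
        .flat_map(|contract| contract.state_variables.iter().cloned())
        .collect()
}

/// Illustrative helper: returns every name that occurs more than once,
/// reported a single time each, in order of first repetition.
pub fn duplicate_names(names: &[String]) -> Vec<String> {
    let mut seen = HashSet::new();
    let mut reported = HashSet::new();
    let mut duplicates = Vec::new();
    for name in names {
        if !seen.insert(name.as_str()) && reported.insert(name.as_str()) {
            duplicates.push(name.clone());
        }
    }
    duplicates
}
```

With a base declaring `owner, total` and a derived contract declaring `total, extra`, the collected vector contains `total` twice. A `debug_assert!(duplicate_names(&vars).is_empty())` would only be safe once an earlier validation guarantees uniqueness.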

A lot of validations (like this one) could happen while constructing the cache, and otherwise don't have a clear place in the code right now. I'm thinking maybe it makes sense to formalize this as a pass p5_compute_linearisations. We generate ContractDataCache as a by-product of executing the pass, but we also perform all validations related to inheritance, overriding, etc.

I went ahead and refactored the code into a new semantic pass. I also added some new TODO(validation) comments that I'll start addressing in separate PRs, but I think this is the ideal place to run those validations.
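As a rough illustration of the shape described above (the cache is a by-product of the pass, and validations report diagnostics without blocking the result), here is a sketch under stated assumptions. `NodeId`, `Diagnostic`, `ContractInput` and this `ContractData` are simplified, hypothetical stand-ins, not the types from this PR, and the duplicate check stands in for the TODO(validation) items.

```rust
use std::collections::{BTreeMap, HashMap};

/// Hypothetical stand-in for the real node identifier type.
pub type NodeId = usize;

#[derive(Debug, Clone, PartialEq)]
pub struct Diagnostic {
    pub contract: NodeId,
    pub message: String,
}

/// Hypothetical input: state-variable names already collected through the
/// linearised bases of one contract, most-base first.
pub struct ContractInput {
    pub id: NodeId,
    pub linearised_state_variables: Vec<String>,
}

/// Derived data stored per contract as a by-product of the pass.
#[derive(Debug, Default)]
pub struct ContractData {
    pub state_variables: HashMap<NodeId, Vec<String>>,
}

/// Runs the pass: validates each contract and stores its linearised members.
/// Invalid input still gets an entry, so no guarantee is made beyond the
/// emitted diagnostics.
pub fn run(inputs: &[ContractInput], diagnostics: &mut Vec<Diagnostic>) -> ContractData {
    let mut data = ContractData::default();
    for input in inputs {
        // Count occurrences per name; BTreeMap keeps the report order stable.
        let mut counts: BTreeMap<&str, usize> = BTreeMap::new();
        for name in &input.linearised_state_variables {
            *counts.entry(name.as_str()).or_insert(0) += 1;
        }
        for (name, count) in counts {
            if count > 1 {
                diagnostics.push(Diagnostic {
                    contract: input.id,
                    message: format!("state variable `{name}` is declared {count} times"),
                });
            }
        }
        data.state_variables
            .insert(input.id, input.linearised_state_variables.clone());
    }
    data
}
```

Keeping the entry for an invalid contract matches the position taken earlier in the thread: the pass reports the problem, but callers decide whether to use the AST when diagnostics are present.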

teofr · 2026-06-03T15:15:17Z

+    linearised_state_variables: Vec<ir::StateVariableDefinition>,
+    linearised_errors: Vec<ir::ErrorDefinition>,
+    linearised_events: Vec<ir::EventDefinition>,


I have my doubts about whether these (state variables, errors, and events) should be cached. Generating the data is linear, so we could easily return an iterator over the bases' members instead.

Fair point.

Any thoughts on this?

I hadn't tackled this comment yet, but after thinking about duplicates of state variables we will have the same issue here. Checking for duplicate declarations in an inheritance tree needs to happen both for errors and events, and not just if the user decides to get the linearisations.

Sorry, I thought it was ready for a re-review before.

I guess these are two separate questions. I agree that the validation pass should be done. But is it worth caching these values, or could they be calculated on demand (without performing a second validation)?

The fact that this PR barely moved the needle on memory usage in the benchmarks makes me think there's not that much at stake here (i.e. the chains are very short), but maybe I'm missing something about the expensive calculations (i.e. function linearisation).

Another question is how many times users will use these linearised vectors.

The vectors should be very small in comparison to the IR, that's probably why it doesn't move the needle in the perf benchmarks.

As to how many times they will be used, I don't know for sure, but for solx at least once for functions, to codegen each function. For state variables I don't think they would need it directly, but we use that for computing the storage layout, which they do consume. Errors and events are probably not needed. Then again, the memory usage for caching them should be negligible.
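The two options weighed in this thread can be sketched side by side. This is only an illustration with a hypothetical `Contract` type holding error names, not the PR's `ir::ErrorDefinition`. The on-demand version borrows the linearised bases and allocates nothing, while the cached version materialises the same sequence once.

```rust
/// Hypothetical, simplified contract: only its error names, in source order.
pub struct Contract {
    pub errors: Vec<String>,
}

/// On-demand option: yields members most-base first by walking the
/// linearisation (ordered most-derived first) in reverse.
pub fn linearised_errors<'a>(linearised: &'a [Contract]) -> impl Iterator<Item = &'a str> + 'a {
    linearised
        .iter()
        .rev()
        .flat_map(|contract| contract.errors.iter().map(String::as_str))
}

/// Cached option: the same sequence, collected into an owned vector that
/// could be stored alongside other per-contract data.
pub fn cache_linearised_errors(linearised: &[Contract]) -> Vec<String> {
    linearised_errors(linearised).map(str::to_owned).collect()
}
```

Both produce identical sequences, so the choice comes down to how often callers iterate and whether validation has to run regardless, which is the point made above.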

github-actions · 2026-06-05T22:06:05Z

Bencher Report

Branch	ggiraldez/v2-cache-linearisations
Testbed	ci

⚠️ WARNING: Truncated view!
The full continuous benchmarking report exceeds the maximum length allowed on this platform.

⚠️ WARNING: No Threshold found!
Without a Threshold, no Alerts will ever be generated.

🚨 5 Alerts

teofr

LGTM, the only concern is whether we need the cache for the linear time data as well.

teofr · 2026-06-08T16:43:13Z

+    linearised_state_variables: Vec<ir::StateVariableDefinition>,
+    linearised_errors: Vec<ir::ErrorDefinition>,
+    linearised_events: Vec<ir::EventDefinition>,


Any thoughts on this?

teofr

It looks good, I really like the new pass, and you're right, it's a natural place for a lot of validations.

teofr · 2026-06-09T09:10:44Z

@@ -75,13 +78,15 @@ impl SemanticContext {
        p2_linearise_contracts::run(files, &mut binder, diagnostics);
        p3_type_definitions::run(files, &mut binder, &mut types, language_version);
        p4_resolve_references::run(files, &mut binder, &mut types, language_version);


p4 uses linearisations to resolve references, have you considered sharing some of the caching to improve performance on it?

I guess the more general question is, does p5 depend on p4?

> p4 uses linearisations to resolve references, have you considered sharing some of the caching to improve performance on it?

p4 needs the linearisation of contracts, but the lookups happen in the scopes. We might be able to refactor the code to use cached linearisations, but that's probably a bigger change.

> I guess the more general question is, does p5 depend on p4?

I don't think it should, because p4_resolve_references resolves identifiers in expressions and statements. There are identifiers in type definitions, but those are resolved in p3_type_definitions. So the input for p5_compute_linearisations is complete by the end of p3.

I'll verify that assertion and reorder the passes.

Indeed, linearisations can be computed right after p3 and the result is the exact same as computing them at the end. So, for clarity I reordered the passes and put p4_compute_linearisations and then p5_resolve_references.

In the future, it may be possible to use the cached linearisations for resolution as well.
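The ordering argument can be expressed as a small dependency check. The sketch below is purely illustrative: the `Pass` type, the `needs` lists and `first_violation` are assumptions made for this example, with dependencies read off the discussion above (both later passes need the contract linearisation from p2 and the type definitions from p3).

```rust
/// Hypothetical description of a semantic pass and the passes it relies on.
pub struct Pass {
    pub name: &'static str,
    pub needs: &'static [&'static str],
}

/// The order after the reordering, with assumed dependencies.
pub fn reordered_pipeline() -> Vec<Pass> {
    vec![
        Pass { name: "p2_linearise_contracts", needs: &[] },
        Pass { name: "p3_type_definitions", needs: &[] },
        Pass {
            name: "p4_compute_linearisations",
            needs: &["p2_linearise_contracts", "p3_type_definitions"],
        },
        Pass {
            name: "p5_resolve_references",
            needs: &["p2_linearise_contracts", "p3_type_definitions"],
        },
    ]
}

/// Returns the first `(pass, missing dependency)` pair where a pass runs
/// before something it needs, or `None` if the order is consistent.
pub fn first_violation(order: &[Pass]) -> Option<(&'static str, &'static str)> {
    for (index, pass) in order.iter().enumerate() {
        for need in pass.needs {
            if !order[..index].iter().any(|earlier| earlier.name == *need) {
                return Some((pass.name, *need));
            }
        }
    }
    None
}
```

Under these assumptions neither of the last two passes names the other as a dependency, which is why swapping them leaves the result unchanged.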

teofr · 2026-06-09T09:50:51Z

+    types: &TypeRegistry,
+    contract_id: NodeId,
+) -> ContractLinearisations {
+    let functions = compute_linearised_functions(binder, types, contract_id);


nit: should they follow the same naming?

Suggested change

-    let functions = compute_linearised_functions(binder, types, contract_id);
+    let functions = collect_linearised_functions(binder, types, contract_id);

teofr · 2026-06-09T09:55:19Z

+
+/// Cache of derived data about contracts stored on the `SemanticContext`. Every
+/// contract's `NodeId` has an entry in `data`.
+pub(crate) struct ContractData {


This doesn't need to be in this PR, but it'd be interesting to have a benchmark tracking how these values are used. For example, for some big benchmarks, iterate over all linearised definitions for all contracts and use them trivially (e.g. compute the hash of their selectors).
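A minimal sketch of what such a benchmark body could look like, assuming a hypothetical `LinearisedContract` type that exposes function signatures as strings. Real selectors are derived from a Keccak-256 hash of the signature. The standard library's `DefaultHasher` is used here only as a stand-in so the example has no dependencies.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};
use std::hint::black_box;

/// Hypothetical view of one contract's linearised functions.
pub struct LinearisedContract {
    pub function_signatures: Vec<String>,
}

/// Visits every linearised definition of every contract and folds a hash of
/// each signature into an accumulator. Returns the number of definitions
/// visited and the accumulated value.
pub fn touch_all(contracts: &[LinearisedContract]) -> (usize, u64) {
    let mut visited = 0;
    let mut accumulator = 0u64;
    for contract in contracts {
        for signature in &contract.function_signatures {
            // Stand-in for selector hashing (NOT Keccak-256).
            let mut hasher = DefaultHasher::new();
            signature.hash(&mut hasher);
            // black_box keeps the compiler from optimising the work away.
            accumulator ^= black_box(hasher.finish());
            visited += 1;
        }
    }
    (visited, accumulator)
}
```

A benchmark harness would call `touch_all` over the contracts of a large project and track the time, so that any change to how the linearised members are stored or computed shows up in the numbers.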

ggiraldez requested review from OmarTawfik and teofr May 28, 2026 22:51

ggiraldez requested review from a team as code owners May 28, 2026 22:51

ggiraldez mentioned this pull request Jun 1, 2026

[v2] Cache type derived data #1816

Open

teofr reviewed Jun 3, 2026

ggiraldez force-pushed the ggiraldez/v2-cache-linearisations branch from 4bd5b30 to 39dd3d5 Compare June 5, 2026 21:51

teofr approved these changes Jun 8, 2026

ggiraldez added 4 commits June 8, 2026 17:53

Cache linearised collection of contract's function, variables, etc

3fcd91f

Address PR comments; change find_contract_by_name to return an iterator

78fc92d

Update public-api.txt snapshots

6e4d156

Build ContractData in a new semantic pass

4a41621

Add validation TODO comments there, to be implemented later. Rename data structures to better reflect their new role.

ggiraldez force-pushed the ggiraldez/v2-cache-linearisations branch from 1d8cf1c to 4a41621 Compare June 8, 2026 21:54

ggiraldez changed the title ~~[v2] Cache linearised collection of contract's functions, variables, errors and events~~ [v2] Compute linearised members of all contracts in a new semantic pass Jun 8, 2026

ggiraldez requested a review from teofr June 8, 2026 22:00

teofr approved these changes Jun 9, 2026

Reorder semantic passes

a798fad

Compute linearisations before resolving references
