Experimental Solidity codegen for types of different stack sizes.

The IR code generation in ``libsolidity/experimental/codegen`` of https://github.com/ethereum/solidity/pull/14510 is quite incomplete so far. This issue explains the next step of extending it.

Currently, the generation assumes that every type fits into exactly one stack slot (thereby all types can be treated the same), while in reality, e.g. ``unit`` types don't require any stack slot, ``void`` types cannot have a representation on stack at all, ``pair`` (resp. in general tuple)-types may require multiple stack slots (similar for ``sum`` types eventually).

In the long-term it would be nice to be able to define the stack representation of a type in-language by instantiating special type classes - and code generation would use compile-time expression evaluation to determine the stack size of a type. However, as a first step we want to do this in a hard-coded manner.

This means that we need to associate primitive types (note that due to those changing, this work should be based on https://github.com/ethereum/solidity/pull/14566) with stack sizes: 0 for ``unit``, 1 for ``word`` (and function types, even though we won't properly handle them for now), ``1`` for ``bool``, none for ``void`` and ``integer``. For ``pair`` types we sum up the sizes of their type arguments. For user-defined types, we take the stack size of the underlying type.
This sounds easy in theory, but will probably take a bit of doing:

Note that we can only tell the stack size of fully monomorphic types and we only monomorphize during code generation, so all of this needs to happen during code-generation. Code-gen already involves monomorphization for example in that ``IRGenerator::generate(FunctionDefinition const& _function, Type _type)`` already gets a concrete type and it stores the correct type environment in the context - relative to that type environment we will always get fully monomorphic types for which we can determine the stack size. Moving from user-defined types to the underlying types may still involve local type environments and unification to construct the correct argument types for the underlying type.

On the codegen-side we will need a mechanism similar to https://github.com/ethereum/solidity/blob/develop/libsolidity/codegen/ir/IRVariable.h - i.e. instead of generating code for expression directly as single Yul variables, we'll want to abstract them into ``IRVariables`` that may extend to multiple stack slots. However, we won't need any complex notion of conversions on ``IRVariables`` since non-trivial conversions (i.e. conversions other than ``abs`` and ``rep`` for user-defined types that will be no-ops for code generation) will be defined in-language, so code generation itself won't need to deal with it. We will still need a (simpler, since without non-trivial conversions) equivalent of ``IRGeneratorForStatements::declare`` and ``IRGeneratorForStatements::assign`` from https://github.com/ethereum/solidity/blob/develop/libsolidity/codegen/ir/IRGeneratorForStatements.cpp, but without the conversion logic (the main complication will be to turn assignments of multiple variables into multiple assignments, since Yul doesn't allow multi-assignments - but if need be we can use identity functions to-be-inlined for that. I.e. ``let x,y := z, w`` is invalid in Yul, so we either need to split into ``let x := z let y := w`` or turn it into ``let x,y := identity_2(z,w)`` with ``function identity_2(a,b) -> r,s { r := a s := b }``)

So the main thing to do is to replace instances of declarations like
```
		m_code << "let " << IRNames::localVariable(_identifier) << ...;
```
with declarations of ``IRVariables`` of the proper type (which will resolve to a multi-variable declaration on the yul level) - and similarly references to and assignments to expressions of a given type.

``rep`` and ``abs`` can still remain no-ops (but should assert equal stack sizes of argument and return type).

After the above is done in a subsequent step, we need to build proper code generation for the ``pair.first`` and ``pair.second`` functions - and then build proper pattern-matching destructuring on the parsing/inference side and the code generation-side (i.e. ``let (a,b) = (c, d);``, etc.), but this will go hand-in-hand with generally defining proper type constructors and algebraic data types in language, so out of scope for this issue (first step will be abstracting ``IRVariables`` and make sure things work for single-stack-slot types - once that works, we can experiment with ``pair.first`` and ``pair.second`` on tuples).


So to be clear, the first task here merely involves:
- Determine the stack sizes of types (primitive and defined).
- Build an ``IRVariable``-mechanism to seamlessly handle multi-variable declarations and multi-assignments in place of the current assumption that everything is one stack slot.


Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Experimental Solidity codegen for types of different stack sizes. #14569

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Experimental Solidity codegen for types of different stack sizes. #14569

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions