Experimental Solidity codegen for types of different stack sizes. #14569

Open
@ekpyron

Description

The IR code generation in libsolidity/experimental/codegen of #14510 is quite incomplete so far. This issue explains the next step of extending it.

Currently, the generation assumes that every type fits into exactly one stack slot, so all types can be treated alike. In reality, unit types don't require any stack slot, void types cannot be represented on the stack at all, and pair types (tuple types in general) may require multiple stack slots; the same will eventually hold for sum types.

In the long term, it would be nice to be able to define the stack representation of a type in-language by instantiating special type classes, with code generation using compile-time expression evaluation to determine the stack size of a type. As a first step, however, we want to do this in a hard-coded manner.

This means that we need to associate primitive types with stack sizes (since those types are changing, this work should be based on #14566): 0 for unit, 1 for word (and for function types, even though we won't properly handle them for now), 1 for bool, none for void and integer. For pair types, we sum the sizes of their type arguments. For user-defined types, we take the stack size of the underlying type.
This sounds easy in theory, but will probably take a bit of doing:

Note that we can only tell the stack size of fully monomorphic types, and we only monomorphize during code generation, so all of this needs to happen during code generation. Code generation already involves monomorphization: for example, IRGenerator::generate(FunctionDefinition const& _function, Type _type) already receives a concrete type and stores the correct type environment in the context. Relative to that type environment, we will always get fully monomorphic types whose stack size we can determine. Moving from user-defined types to their underlying types may still involve local type environments and unification to construct the correct argument types for the underlying type.
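To make the hard-coded mapping concrete, here is a minimal C++ sketch of a stack-size function over fully monomorphic types. MonoType, Kind and stackSize are hypothetical names chosen for illustration and are not the actual types in libsolidity/experimental. The sketch also assumes that a user-defined type already carries its instantiated underlying type, and that a pair with a component lacking a stack representation has none itself.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <vector>

// Hypothetical, simplified stand-in for a fully monomorphic type.
enum class Kind { Unit, Word, Bool, Function, Void, Integer, Pair, UserDefined };

struct MonoType
{
	Kind kind;
	// Pair: the two type arguments.
	// UserDefined: exactly one element, the (already instantiated) underlying type.
	std::vector<MonoType> arguments;
};

// Returns the number of stack slots, or std::nullopt for types that have
// no stack representation ("none" in the list above).
std::optional<std::size_t> stackSize(MonoType const& _type)
{
	switch (_type.kind)
	{
	case Kind::Unit:
		return std::size_t{0};
	case Kind::Word:
	case Kind::Bool:
	case Kind::Function:
		return std::size_t{1};
	case Kind::Void:
	case Kind::Integer:
		return std::nullopt;
	case Kind::Pair:
	{
		assert(_type.arguments.size() == 2);
		std::size_t total = 0;
		for (MonoType const& argument: _type.arguments)
		{
			std::optional<std::size_t> size = stackSize(argument);
			// Assumption: a pair with an unrepresentable component is unrepresentable.
			if (!size)
				return std::nullopt;
			total += *size;
		}
		return total;
	}
	case Kind::UserDefined:
		assert(_type.arguments.size() == 1);
		return stackSize(_type.arguments.front());
	}
	return std::nullopt;
}
```

Under these assumptions, a pair of word and (word, unit) occupies two slots, and a user-defined type wrapping that pair occupies two slots as well.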

On the codegen side we will need a mechanism similar to https://github.com/ethereum/solidity/blob/develop/libsolidity/codegen/ir/IRVariable.h, i.e. instead of generating code for expressions directly as single Yul variables, we'll want to abstract them into IRVariables that may extend over multiple stack slots. However, we won't need any complex notion of conversions on IRVariables: non-trivial conversions (i.e. conversions other than abs and rep for user-defined types, which will be no-ops for code generation) will be defined in-language, so code generation itself won't need to deal with them.

We will still need a simpler equivalent of IRGeneratorForStatements::declare and IRGeneratorForStatements::assign from https://github.com/ethereum/solidity/blob/develop/libsolidity/codegen/ir/IRGeneratorForStatements.cpp, without the conversion logic. The main complication will be turning assignments of multiple variables into multiple assignments, since Yul doesn't allow multi-assignments from a list of expressions, but if need be we can use to-be-inlined identity functions for that. I.e. let x,y := z, w is invalid in Yul, so we either need to split it into let x := z let y := w or turn it into let x,y := identity_2(z,w) with function identity_2(a,b) -> r,s { r := a s := b }.
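The two options for multi-slot assignments can be sketched as follows. This is only an illustration: MultiSlotVariable, declare, assign, identityFunction and assignViaIdentity are hypothetical names, not the existing IRVariable or IRGeneratorForStatements API, and the _1, _2 suffix scheme for slot names is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical, simplified analogue of IRVariable: a base name plus a slot count.
struct MultiSlotVariable
{
	std::string baseName;
	std::size_t slots;

	// Yul identifiers backing this variable, one per stack slot.
	// Assumption: single-slot variables keep the plain name, others get _1, _2, ...
	std::vector<std::string> names() const
	{
		if (slots == 1)
			return {baseName};
		std::vector<std::string> result;
		for (std::size_t i = 0; i < slots; ++i)
			result.push_back(baseName + "_" + std::to_string(i + 1));
		return result;
	}
};

std::string joinNames(std::vector<std::string> const& _names)
{
	std::string result;
	for (std::size_t i = 0; i < _names.size(); ++i)
		result += (i == 0 ? "" : ", ") + _names[i];
	return result;
}

// Multi-variable declaration; zero-slot variables produce no code at all.
std::string declare(MultiSlotVariable const& _variable)
{
	if (_variable.slots == 0)
		return "";
	return "let " + joinNames(_variable.names()) + "\n";
}

// Option 1: split into one Yul assignment (or declaration) per slot.
std::string assign(MultiSlotVariable const& _lhs, MultiSlotVariable const& _rhs, bool _declare)
{
	assert(_lhs.slots == _rhs.slots);
	std::vector<std::string> lhs = _lhs.names();
	std::vector<std::string> rhs = _rhs.names();
	std::string code;
	for (std::size_t i = 0; i < lhs.size(); ++i)
		code += std::string(_declare ? "let " : "") + lhs[i] + " := " + rhs[i] + "\n";
	return code;
}

// Option 2: route the values through an identity function that can be inlined later.
std::string identityFunction(std::size_t _slots)
{
	assert(_slots > 0);
	std::vector<std::string> arguments = MultiSlotVariable{"a", _slots}.names();
	std::vector<std::string> returns = MultiSlotVariable{"r", _slots}.names();
	std::string body;
	for (std::size_t i = 0; i < _slots; ++i)
		body += " " + returns[i] + " := " + arguments[i];
	return
		"function identity_" + std::to_string(_slots) + "(" + joinNames(arguments) + ") -> " +
		joinNames(returns) + " {" + body + " }\n";
}

std::string assignViaIdentity(MultiSlotVariable const& _lhs, MultiSlotVariable const& _rhs)
{
	assert(_lhs.slots == _rhs.slots);
	if (_lhs.slots == 0)
		return "";
	return
		"let " + joinNames(_lhs.names()) + " := identity_" + std::to_string(_lhs.slots) +
		"(" + joinNames(_rhs.names()) + ")\n";
}
```

One design consideration: the slot-wise split is only safe when the right-hand side does not refer to slots of the left-hand side (a swap such as (a, b) := (b, a) would read an already overwritten slot), whereas the identity-function form evaluates all arguments before assigning.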

So the main thing to do is to replace instances of declarations like

		m_code << "let " << IRNames::localVariable(_identifier) << ...;

with declarations of IRVariables of the proper type (which will resolve to a multi-variable declaration on the Yul level), and to treat references to, and assignments to, expressions of a given type in the same way.

rep and abs can still remain no-ops (but should assert equal stack sizes of argument and return type).

Once the above is done, a subsequent step is to build proper code generation for the pair.first and pair.second functions, and then proper pattern-matching destructuring on both the parsing/inference side and the code generation side (i.e. let (a,b) = (c, d);, etc.). That will go hand in hand with generally defining proper type constructors and algebraic data types in-language, so it is out of scope for this issue. The first step is abstracting IRVariables and making sure things work for single-stack-slot types; once that works, we can experiment with pair.first and pair.second on tuples.
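As a rough illustration of that later experiment, pair.first and pair.second could reduce to selecting a sub-range of the pair's stack slots. The sketch below uses hypothetical names (Slots, pairFirst, pairSecond) and assumes a layout in which the slots of the first component precede those of the second; the issue itself does not fix such a layout.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Names of the Yul variables backing a value, one per stack slot.
using Slots = std::vector<std::string>;

// Assumption: a pair's slots are the slots of its first component
// followed by the slots of its second component.
Slots pairFirst(Slots const& _pair, std::size_t _firstStackSize)
{
	assert(_firstStackSize <= _pair.size());
	auto split = _pair.begin() + static_cast<std::ptrdiff_t>(_firstStackSize);
	return Slots(_pair.begin(), split);
}

Slots pairSecond(Slots const& _pair, std::size_t _firstStackSize)
{
	assert(_firstStackSize <= _pair.size());
	auto split = _pair.begin() + static_cast<std::ptrdiff_t>(_firstStackSize);
	return Slots(split, _pair.end());
}
```

With this layout, projecting out of a pair generates no code by itself: the result simply refers to a subset of the existing variables, and a unit component yields an empty slot list.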

So to be clear, the first task here merely involves:

  • Determine the stack sizes of types (primitive and defined).
  • Build an IRVariable mechanism to seamlessly handle multi-variable declarations and multi-assignments in place of the current assumption that everything is one stack slot.

Metadata

Labels

  • experimental
  • high effort: A lot to implement but still doable by a single person. The task is large or difficult.
  • high impact: Changes are very prominent and affect users or the project in a major way.
  • must have eventually: Something we consider essential but not enough to prevent us from releasing Solidity 1.0 without it.
