Skip to content

Replace custom Yul semantics with EVMYulLean fork as compilation target #1722

@Th0rgal

Description

@Th0rgal

Goal

Replace Verity's custom Yul execution semantics with a fork of NethermindEth/EVMYulLean (hosted at lfglabs-dev/EVMYulLean), making EVMYulLean the authoritative semantic target for the compilation pipeline. This eliminates the "EVM/Yul Semantics" trusted component from TRUST_ASSUMPTIONS.md and gives Verity's proofs an externally auditable, community-maintained semantic foundation.

Background and Motivation

Today Verity defines its own Yul execution semantics across several files:

Component Verity (current) EVMYulLean
Yul AST Compiler/Yul/Ast.lean — 5 expr constructors, 11 stmt constructors, string-typed builtins EvmYul/Yul/Ast.lean — 3 expr constructors, 9 stmt constructors, typed PrimOp ⊕ YulFunctionName calls
State model ContractState in Verity/Core.lean — flat Lean structure with .storage, .events, .storageArray, .transientStorage, .sender SharedState .Yul + VarStore — full EVM account map, memory, returndata, call stack
Evaluation execYulFuel in Compiler/Proofs/YulGeneration/Semantics.lean — fuel-based, Verity-specific builtins exec/eval in EvmYul/Yul/Interpreter.lean — fuel-based, PrimOp dispatch to shared EVM operations
Builtins ~30 builtins handled by string matching in Builtins.lean All EVM opcodes as Operation .Yul algebraic type, shared with EVM layer
EVM bridge None — Yul is the bottom layer, solc is trusted below Full EVM bytecode interpreter sharing the same Operation and SharedState

The current adapter (EvmYulLeanAdapter.lean) already lowers all 11 Verity statement types and 5 expression types to EVMYulLean AST with 0 unsupported nodes, and 15 pure arithmetic/comparison/bitwise builtins have universal bridge lemmas in EvmYulLeanBridgeLemmas.lean. But the actual execution still runs through Verity's own execYulFuel, not EVMYulLean's interpreter.

What this buys us

  1. Trust reduction: The Yul/EVM semantic model becomes an independently maintained, formally specified artifact rather than a project-internal definition.
  2. EVM conformance: EVMYulLean runs against the official Ethereum test suite (EthereumTests/ submodule, lake test). Verity currently relies on differential testing against Foundry.
  3. Shared community effort: Semantic bugs, hardfork updates, and new opcode support benefit from upstream contributions.
  4. Proof composability: Other projects building on EVMYulLean can consume Verity's proofs, and vice versa.

Current Integration Status

  • AST adapter: Complete — all 11 Verity YulStmt types + 5 YulExpr types lower to EVMYulLean AST (0 gaps per artifacts/evmyullean_unsupported_nodes.json).
  • Pure builtin bridge: 15 universal symbolic lemmas proven in EvmYulLeanBridgeLemmas.lean (add, sub, mul, div, mod, lt, gt, eq, iszero, and, or, xor, not, shl, shr).
  • State-dependent builtins: sload, caller, calldataload, address, timestamp, callvalue, calldatasize, number, chainid, blobbasefee have PrimOp mappings in the adapter but no semantic bridge proofs yet.
  • Verity-specific helpers: mappingSlot has no EVMYulLean counterpart (it composes keccak256 + ABI encoding).
  • Function definitions: The adapter currently inlines function bodies as blocks; EVMYulLean's YulContract.functions integration is not wired up.
  • Memory model: Verity's ContractState has no linear memory; EVMYulLean's SharedState has byte-level memory. This is the biggest structural gap.

Proposed Plan

Phase 1: Fork and adapt EVMYulLean (1–2 weeks)

  1. Fork NethermindEth/EVMYulLean to lfglabs-dev/EVMYulLean.
  2. Pin to the same Lean toolchain as Verity (4.22.0) and ensure lake build succeeds.
  3. Audit the EVMYulLean AST for any missing Yul constructs that Verity needs:
    • Assign statement (EVMYulLean lacks it; current adapter re-encodes as Let) — add if needed for semantic fidelity, or prove equivalence.
    • For initializer block (EVMYulLean's For has no init; adapter pre-emits init statements) — add or prove semantic equivalence of the lowering.
    • Comment / no-op statements (Verity has .comment; EVMYulLean doesn't) — trivially map to Block [].
  4. Add any missing builtins needed by Verity-generated Yul (e.g., tload/tstore for transient storage, blobbasefee, returndatasize, returndatacopy, log0log4 if not already present).
  5. Ensure the fork passes the upstream Ethereum conformance test suite.

Phase 2: State bridge (2–4 weeks)

Build a bidirectional state-reconstruction bridge between Verity's ContractState and EVMYulLean's SharedState .Yul:

  1. ContractState → SharedState: Embed Verity's flat state into EVMYulLean's account-map/memory model. Define toSharedState : ContractState → SharedState .Yul that:
    • Maps storage (slot → value) to the account's storage trie
    • Serializes calldata from the Param model into byte-level memory
    • Maps sender, contractAddress, blockTimestamp, etc. to the ExecutionEnv fields
    • Initializes memory/returndata as empty
  2. SharedState → ContractState: Extract observable state back: fromSharedState : SharedState .Yul → ContractState recovering storage, events (from LOG entries), and the revert/return status.
  3. Prove round-trip lemmas:
    • fromSharedState (toSharedState s) = s for observable fields
    • Storage-write commutation: sstore in EVMYulLean = writeStorage in Verity after round-trip

Phase 3: Semantic bridge proofs (4–8 weeks)

For each statement/expression type, prove that executing via Verity's execYulFuel and via EVMYulLean's exec/eval (after state embedding) produce equivalent observable results:

  1. Expressions (extend existing bridge lemmas to state-dependent builtins):
    • sload: Verity storage lookup = EVMYulLean account storage lookup after state embedding
    • caller, address, callvalue, timestamp, number, chainid, blobbasefee, calldatasize, calldataload: environment field extraction after embedding
    • mload, keccak256: requires memory-model bridge (Phase 2)
  2. Statements (all 8 Yul types already have Layer 3 equivalence proofs; retarget to EVMYulLean):
    • Let, If, For, Switch, Block, ExprStmtCall, Break/Continue/Leave
    • sstore: storage-write commutation
    • mstore: memory-write commutation (requires memory bridge)
    • log0log4: event commutation
    • revert/return: halt-status commutation
  3. Function dispatch: Prove that YulContract dispatch through EVMYulLean's evalCall matches Verity's compiled dispatcher after embedding.
  4. Verity helpers (mappingSlot): Prove that mappingSlot(base, key) = keccak256(abi.encode(key, base)) when evaluated through EVMYulLean's KECCAK256 primop.

Phase 4: Retarget the theorem stack (2–4 weeks)

  1. Replace the Layer 3 compilation target: Preservation.lean's theorem should conclude with EVMYulLean execution rather than Verity's execYulFuel.
  2. Retarget the Layer 2 → Layer 3 bridge: the IR-to-Yul preservation theorem should compose with the new EVMYulLean semantic target.
  3. Update the end-to-end theorem in EndToEnd.lean to state: "EDSL execution is equivalent to EVMYulLean execution of the compiled Yul."
  4. Update TRUST_ASSUMPTIONS.md: remove "EVM/Yul Semantics" as a trusted component; document the new trust boundary as "EVMYulLean's execution model matches the EVM" (backed by conformance tests).

Phase 5: Cleanup and upstream (1–2 weeks)

  1. Remove Verity's custom execYulFuel and related Yul evaluation code (or keep as a reference/test oracle).
  2. Upstream any generally useful changes (new builtins, bug fixes) back to NethermindEth/EVMYulLean.
  3. Update CI: add EVMYulLean conformance tests as a CI job, cross-check EVMYulLean execution against Foundry differential tests.
  4. Update artifacts/evmyullean_capability_report.json and artifacts/evmyullean_adapter_report.json to reflect full semantic integration.

Known Challenges

Challenge Mitigation
Memory model gap: Verity has no byte-level memory; EVMYulLean does Verity only uses memory for ABI encode/decode and calldata; define a constrained memory embedding for this usage pattern
Assign statement: EVMYulLean lacks Assign; adapter uses Let Either add Assign to the fork or prove that re-Let in the same scope is semantically equivalent (Yul spec says it is)
For initializer: EVMYulLean For has no init block Prove that pre-emitting init statements before the loop is semantically equivalent (it is by Yul scoping rules)
Gas model: EVMYulLean uses fuel; Verity doesn't model gas Both use fuel for termination. Gas cost modeling is out of scope for both.
Lean version drift: EVMYulLean may target a different Lean version Pin the fork to Verity's toolchain; upstream toolchain bumps periodically
mappingSlot helper: No EVMYulLean counterpart Define as a Yul-level function (or inline expansion) using EVMYulLean's KECCAK256 + MSTORE

Success Criteria

  • lfglabs-dev/EVMYulLean fork exists and builds on Verity's Lean toolchain
  • Fork passes upstream Ethereum conformance test suite
  • State bridge (ContractState ↔ SharedState .Yul) defined with round-trip lemmas
  • All 15+ state-dependent builtins have universal bridge lemmas
  • End-to-end theorem targets EVMYulLean execution (not Verity's execYulFuel)
  • TRUST_ASSUMPTIONS.md updated to reflect reduced trust surface
  • No regression in make check or differential test pass rate
  • artifacts/evmyullean_capability_report.json shows "status": "full_semantic_integration"

Related Issues

Estimated Effort

10–20 weeks total (can be parallelized across phases 1+2 and phases 3+4).

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions