[State] Content address account code #1024
Description
With #1020 we will add EVM support for EXTCODEHASH, and #1000 should make it fairly straightforward to maintain a separate tree of code keyed by the hash of the bytecode. Where many accounts share identical contract code this should save a fair amount of storage, and it simplifies providing on-chain ABI storage (#858).
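As a rough illustration of where the storage saving comes from, here is a minimal in-memory sketch of code stored by hash. Everything in it is hypothetical (`CodeStore`, `CodeHash` and `Account` are not existing Burrow types), a plain map stands in for the separate tree, and SHA-256 is used only because it ships with the Go standard library, whereas EXTCODEHASH is defined over Keccak-256.

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// CodeHash is the content address of a piece of bytecode.
// Assumption: SHA-256 stands in for Keccak-256 to keep the sketch stdlib-only.
type CodeHash [sha256.Size]byte

// CodeStore is a hypothetical stand-in for the separate code tree.
type CodeStore struct {
	blobs map[CodeHash][]byte
}

func NewCodeStore() *CodeStore {
	return &CodeStore{blobs: make(map[CodeHash][]byte)}
}

// Put stores code under its hash and returns that hash.
// Putting identical code a second time does not store a second copy.
func (s *CodeStore) Put(code []byte) CodeHash {
	h := CodeHash(sha256.Sum256(code))
	if _, ok := s.blobs[h]; !ok {
		s.blobs[h] = append([]byte(nil), code...) // defensive copy
	}
	return h
}

// Get returns the code stored under h, if any.
func (s *CodeStore) Get(h CodeHash) ([]byte, bool) {
	code, ok := s.blobs[h]
	return code, ok
}

// Len reports how many distinct code blobs are stored.
func (s *CodeStore) Len() int {
	return len(s.blobs)
}

// Account holds a reference to its code rather than the code itself.
type Account struct {
	Name     string
	CodeHash CodeHash
}

func main() {
	store := NewCodeStore()
	code := []byte{0x60, 0x80, 0x60, 0x40, 0x52}

	// Two accounts deployed with the same bytecode share one stored blob.
	a := Account{Name: "a", CodeHash: store.Put(code)}
	b := Account{Name: "b", CodeHash: store.Put(code)}

	fmt.Println(a.CodeHash == b.CodeHash, store.Len()) // true 1
}
```

With this shape the value answering EXTCODEHASH is already sitting on the account, so no hashing is needed at query time.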
Some considerations:
- We will need to change `account.Code()` to store a reference. We could just store the hash as bytes, but at this stage we would do well to consider alternative bytecode formats (e.g. WASM bytecode). Even then we could still use a single consistent hash, but we may want a self-describing one. I have looked at https://github.com/multiformats/multihash in the past.
- Currently, deleting a contract deletes its code. (Strictly, it would if we didn't keep every previous version of our forest. Perhaps that in itself makes the point moot, though we have considered abbreviating our versioning, i.e. thinning to snapshots in the distant past.) Under this model we would have a garbage collection issue: knowing when the last reference has been dropped. It is a minor issue, but we should note our approach.
- With a view to establishing a link between code hash and EVM ABI, it would be useful to understand under which circumstances marginally different bytecode (e.g. Solidity writes a metadata footer to the EVM code) actually maps to the same ABI. I suppose we can just duplicate registration/storage of the ABI. What about reverse lookup? Do we care?
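The first two considerations above (a self-describing reference and knowing when the last reference has dropped) could be sketched together as below. This is one possible approach under stated assumptions, not a proposal for the final design: the leading format tag and its values are invented for illustration, all type and function names are hypothetical, and the reference is hand-rolled rather than built with the multihash library. The two bytes after the tag follow the multihash layout of hash function code then digest length, where `0x12` is the code for sha2-256.

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// Hypothetical format tags for the kind of bytecode a reference points at.
const (
	FormatEVM  byte = 0x01
	FormatWASM byte = 0x02
)

// hashSHA256 is the multihash function code for sha2-256.
const hashSHA256 byte = 0x12

// CodeRef is a self-describing reference laid out as:
// format tag | hash function code | digest length | digest.
// It is a string so that it can be used directly as a map key.
type CodeRef string

// NewCodeRef builds the reference for code in the given format.
func NewCodeRef(format byte, code []byte) CodeRef {
	digest := sha256.Sum256(code)
	ref := make([]byte, 0, 3+len(digest))
	ref = append(ref, format, hashSHA256, byte(len(digest)))
	ref = append(ref, digest[:]...)
	return CodeRef(ref)
}

type entry struct {
	code []byte
	refs int // number of accounts currently pointing at this code
}

// RefCountedStore is a hypothetical code store that collects code
// once no account references it any longer.
type RefCountedStore struct {
	entries map[CodeRef]*entry
}

func NewRefCountedStore() *RefCountedStore {
	return &RefCountedStore{entries: make(map[CodeRef]*entry)}
}

// Acquire stores code if it is not yet present and records one more reference.
func (s *RefCountedStore) Acquire(format byte, code []byte) CodeRef {
	ref := NewCodeRef(format, code)
	e, ok := s.entries[ref]
	if !ok {
		e = &entry{code: append([]byte(nil), code...)}
		s.entries[ref] = e
	}
	e.refs++
	return ref
}

// Release drops one reference and reports whether the code was collected.
func (s *RefCountedStore) Release(ref CodeRef) bool {
	e, ok := s.entries[ref]
	if !ok {
		return false
	}
	e.refs--
	if e.refs > 0 {
		return false
	}
	delete(s.entries, ref)
	return true
}

// Len reports how many distinct code blobs are still stored.
func (s *RefCountedStore) Len() int {
	return len(s.entries)
}

func main() {
	store := NewRefCountedStore()
	code := []byte{0x60, 0x00}

	a := store.Acquire(FormatEVM, code)
	b := store.Acquire(FormatEVM, code)
	w := store.Acquire(FormatWASM, code) // same bytes, different format tag

	fmt.Println(a == b, a == w, len(a))
	fmt.Println(store.Release(a), store.Release(b), store.Len())
}
```

A count like this only tracks the latest state. While every previous version of the forest is retained, older versions may still point at the code, so collection would only make sense alongside whatever thinning of old versions is adopted.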