internal/era: New EraE implementation by shazam8253 · Pull Request #32157 · ethereum/go-ethereum

shazam8253 · 2025-07-07T10:54:58Z

Here is a draft of the new EraE implementation. The code follows the spec listed at this link.

MariusVanDerWijden

Lint is failing; you can check on your machine with `make lint`.

MariusVanDerWijden · 2025-07-07T10:56:55Z

@@ -0,0 +1,429 @@
+package era2


lightclient

Left a bunch of comments here - we still need to figure out how to call this module since era2 isn't very elegant. In the meantime should rename builder2.go to just builder.go and era2.go to era.go.

lightclient · 2025-07-09T14:22:53Z

+	}
+}
+
+func (b *Builder) Add(header types.Header, body types.Body, receipts types.Receipts, blockhash common.Hash, blocknum uint64, td *big.Int, proof *Proof) error {


blockhash and blocknum are already available via header, so no need to duplicate them

lightclient · 2025-07-09T14:25:15Z

+	}
+}
+
+func (b *Builder) Add(header types.Header, body types.Body, receipts types.Receipts, blockhash common.Hash, blocknum uint64, td *big.Int, proof *Proof) error {


I would separate this into Add and AddRLP
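A minimal sketch of the suggested split, assuming a cut-down `Header` type and a placeholder `encode` helper in place of RLP (both are hypothetical and not part of the PR). `Add` takes the typed value and delegates to `AddRLP`, which takes bytes that are already encoded:

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// Header is a hypothetical stand-in for types.Header, cut down to one field.
type Header struct {
	Number uint64
}

// encode is a placeholder for RLP encoding; it is NOT the rlp package.
func encode(h Header) []byte {
	buf := make([]byte, 8)
	binary.BigEndian.PutUint64(buf, h.Number)
	return buf
}

// Builder collects already-encoded headers.
type Builder struct {
	headers [][]byte
}

// Add accepts a typed header, encodes it, and delegates to AddRLP.
func (b *Builder) Add(h Header) error {
	return b.AddRLP(encode(h))
}

// AddRLP accepts pre-encoded bytes, so callers that already hold the raw
// encoding avoid a decode/encode round trip.
func (b *Builder) AddRLP(header []byte) error {
	if len(header) == 0 {
		return errors.New("empty header encoding")
	}
	b.headers = append(b.headers, header)
	return nil
}

func main() {
	var b Builder
	if err := b.Add(Header{Number: 1}); err != nil {
		fmt.Println("add failed:", err)
		return
	}
	if err := b.AddRLP(encode(Header{Number: 2})); err != nil {
		fmt.Println("add failed:", err)
		return
	}
	fmt.Println("entries:", len(b.headers))
}
```

Keeping the validation in `AddRLP` means both entry points share one code path.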

lightclient · 2025-07-09T14:26:55Z

+	headersRLP  [][]byte
+	bodiesRLP   [][]byte
+	receiptsRLP [][]byte
+	proofsRLP   [][]byte


I would remove the RLP suffix, since the type [][]byte already makes it pretty self-explanatory

lightclient · 2025-07-09T15:35:42Z

+	"github.com/klauspost/compress/snappy"
+)
+
+type meta struct {


Suggested change

-type meta struct {
+type metadata struct {

lightclient · 2025-07-09T15:35:55Z

+type meta struct {
+	start     uint64 // start block number
+	count     uint64 // number of blocks in the era
+	compcount uint64 // number of properties


Suggested change

-	compcount uint64 // number of properties
+	components uint64 // number of properties

lightclient · 2025-07-09T15:36:04Z

+	start     uint64 // start block number
+	count     uint64 // number of blocks in the era
+	compcount uint64 // number of properties
+	filelen   int64  // length of the file in bytes


Suggested change

-	filelen int64 // length of the file in bytes
+	length int64 // length of the file in bytes

lightclient · 2025-07-09T15:36:36Z

+	mu                                                *sync.Mutex
+	headeroff, bodyoff, receiptsoff, tdoff, proofsoff []uint64 // offsets for each entry type
+	indstart                                          int64
+	rootheader                                        uint64 // offset of the root header in the file if present


what is the use of this?

When reading each era file, I load the index table into a cache to make lookups faster. indstart is the byte offset where the index table starts, and rootheader is the offset of the accumulator root, so the reader can seek there quickly since it is its own e2store object. The mutex should be removed though: I added it very early on, before I understood what the file was doing, because I thought I wouldn't want reads and writes to happen at the same time.

lightclient · 2025-07-09T15:36:56Z

+	m                                                 meta // metadata for the era2 file
+	mu                                                *sync.Mutex
+	headeroff, bodyoff, receiptsoff, tdoff, proofsoff []uint64 // offsets for each entry type
+	indstart                                          int64


what's this?

lightclient · 2025-07-09T15:39:52Z

Also, please write a description for your PRs and try to fix the CI errors so your code is all green.

MariusVanDerWijden · 2025-07-14T18:20:08Z

+
+func (*BlockProofHistoricalSummariesDeneb) Variant() proofvar { return proofDeneb }
+
+func proofVariantOf(p Proof) proofvar {


Suggested change

-func proofVariantOf(p Proof) proofvar {
+func variantOf(p Proof) proofvar {

MariusVanDerWijden · 2025-07-14T18:21:25Z

@@ -57,41 +57,33 @@ const (

 type proofvar uint16


Suggested change

-type proofvar uint16
+type variant uint16

We know that the variant is for proofs; that should be at most a comment, not part of the type name, otherwise you end up with types like the ones in mev-boost :D

gotcha will make the change :)

lightclient

Left a lot of style nits, but the main problems we'll need to work through are:

- utilize proof interface more, avoid handling variant values directly
- remove unneeded struct fields
- avoid building entire era file in memory, write incremental work to disk
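The last point, writing incrementally instead of buffering the whole file, can be sketched roughly as follows. The `entryWriter` and `sink` types are hypothetical and the entries are written unframed; the real builder goes through the e2store writer.

```go
package main

import (
	"fmt"
	"io"
)

// sink is a stand-in for an *os.File; it only counts the bytes it receives.
type sink struct{ n int }

func (s *sink) Write(p []byte) (int, error) {
	s.n += len(p)
	return len(p), nil
}

// entryWriter streams each entry to w as it arrives and keeps only the
// start offsets, which is all a trailing index table needs.
type entryWriter struct {
	w       io.Writer
	written uint64
	offsets []uint64
}

// add writes one entry and records where it started.
func (e *entryWriter) add(data []byte) error {
	start := e.written
	n, err := e.w.Write(data)
	e.written += uint64(n)
	if err != nil {
		return err
	}
	e.offsets = append(e.offsets, start)
	return nil
}

func main() {
	file := &sink{}
	ew := &entryWriter{w: file}
	for _, entry := range [][]byte{[]byte("header"), []byte("body"), []byte("receipts")} {
		if err := ew.add(entry); err != nil {
			fmt.Println("write failed:", err)
			return
		}
	}
	fmt.Println("offsets:", ew.offsets, "bytes written:", ew.written)
}
```

With this shape, memory use grows with the number of offsets rather than with the size of the block data.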

lightclient · 2025-07-17T18:31:36Z

+	buff buffer
+	off  offsets
+
+	prooftype    variant


This should be removed. If you store the proofs as the interface object, you will be able to determine the variant at any time. Alternatively, if it makes better sense to store the bytes, the bytes will already include the variant type, so it won't matter anymore.
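A small sketch of the first option, assuming two made-up proof types (`proofA`, `proofB`) and made-up variant values. Only the `Variant()` method mirrors what the diff shows; everything else is illustrative.

```go
package main

import "fmt"

type variant uint16

// Hypothetical variant values for the sketch.
const (
	variantA variant = iota + 1
	variantB
)

// Proof exposes its own variant, so no separate field has to track it.
type Proof interface {
	Variant() variant
}

type proofA struct{}
type proofB struct{}

func (proofA) Variant() variant { return variantA }
func (proofB) Variant() variant { return variantB }

// buffer stores proofs as interface values instead of raw bytes plus a
// separate prooftype field.
type buffer struct {
	proofs []Proof
}

// variantOf reports the variant of the buffered proofs, or false when no
// proof has been stored yet.
func (b *buffer) variantOf() (variant, bool) {
	if len(b.proofs) == 0 {
		return 0, false
	}
	return b.proofs[0].Variant(), true
}

func main() {
	b := &buffer{}
	b.proofs = append(b.proofs, proofB{})
	if v, ok := b.variantOf(); ok {
		fmt.Println("variant:", v)
	}
}
```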

lightclient · 2025-07-17T18:32:26Z

+	off  offsets
+
+	prooftype    variant
+	tdsint       []*big.Int


This should be removed and tds in buffer should be big.Ints. Serialize them all at once during finalize.
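One way to read this suggestion, as a sketch: keep the total difficulties as `*big.Int` while blocks are added and encode them in a single pass at the end. The `builder` type and `demoTD` helper are hypothetical, and the 32-byte big-endian layout is a placeholder chosen for the example, not the encoding the format specifies.

```go
package main

import (
	"fmt"
	"math/big"
)

type builder struct {
	tds []*big.Int
}

// add only buffers the value; nothing is serialized yet.
func (b *builder) add(td *big.Int) {
	b.tds = append(b.tds, td)
}

// finalize serializes every buffered value in one pass. The 32-byte
// big-endian layout is a placeholder, not the format's real encoding.
func (b *builder) finalize() ([][]byte, error) {
	out := make([][]byte, 0, len(b.tds))
	for i, td := range b.tds {
		if td == nil || td.Sign() < 0 || td.BitLen() > 256 {
			return nil, fmt.Errorf("td %d is missing or does not fit in 32 bytes", i)
		}
		out = append(out, td.FillBytes(make([]byte, 32)))
	}
	return out, nil
}

// demoTD is a small helper for the usage example.
func demoTD(v int64) *big.Int { return big.NewInt(v) }

func main() {
	b := &builder{}
	b.add(demoTD(1))
	b.add(demoTD(256))
	enc, err := b.finalize()
	if err != nil {
		fmt.Println("finalize failed:", err)
		return
	}
	fmt.Println("serialized", len(enc), "values")
}
```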

lightclient · 2025-07-17T18:32:59Z

+type Builder struct {
+	w   *e2store.Writer
+	buf *bytes.Buffer
+	sn  *snappy.Writer


Suggested change

-	sn *snappy.Writer
+	snappy *snappy.Writer

lightclient · 2025-07-17T18:57:41Z

+}
+
+// retrieves the raw body frame in bytes of a specific block
+func (e *Era) GetRawBodyFrameByNumber(blockNum uint64) ([]byte, error) {


what do you mean by "frame" here?

It is the raw bytes as written, without any snappy or RLP decoding

I think you should decode the snappy here (and other methods like this) and return the RLP bytes

lightclient · 2025-07-17T18:58:04Z

+	return io.ReadAll(r)
+}
+
+// retrieves the raw receipts frame in bytes of a specific block


Suggested change

-// retrieves the raw receipts frame in bytes of a specific block
+// GetRawReceiptsFrameByNumber retrieves the raw receipts frame in bytes of a specific block.

lightclient · 2025-07-17T18:58:12Z

+	return io.ReadAll(r)
+}
+
+// retrieves the raw proof frame in bytes of a specific block proof


Suggested change

-// retrieves the raw proof frame in bytes of a specific block proof
+// GetRawProofFrameByNumber retrieves the raw proof frame in bytes of a specific block proof.

lightclient · 2025-07-17T18:58:31Z

+	return io.ReadAll(r)
+}
+
+// loads in the index table containing all offsets and caches it


Suggested change

-// loads in the index table containing all offsets and caches it
+// loadIndex loads in the index table containing all offsets and caches it.

lightclient · 2025-07-17T19:00:50Z

+// Getter methods to calculate offset of a specific component in the file.
+func (e *Era) headerOff(num uint64) (uint64, error) { return e.indexOffset(num, compHeader) }
+func (e *Era) bodyOff(num uint64) (uint64, error)   { return e.indexOffset(num, compBody) }
+func (e *Era) rcptOff(num uint64) (uint64, error)   { return e.indexOffset(num, compReceipts) }


Suggested change

-func (e *Era) rcptOff(num uint64) (uint64, error) { return e.indexOffset(num, compReceipts) }
+func (e *Era) receiptOff(num uint64) (uint64, error) { return e.indexOffset(num, compReceipts) }

Really no reason to ever abbreviate by taking out vowels in golang.

lightclient · 2025-07-22T21:22:07Z

 // following the Era format.
-func ExportHistory(bc *core.BlockChain, dir string, first, last, step uint64) error {
-	log.Info("Exporting blockchain history", "dir", dir)
+func ExportHistory(bc *core.BlockChain, dir string, first, last, step uint64, f ExportFormat) error {


I would not use an enum to dictate the output format, this should be done via the method naming, e.g.

ExportHistoryEra1(..)
ExportHistoryEraE(..)
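A rough sketch of that naming approach, assuming a hypothetical shared helper `exportHistory` and made-up file names. The real functions take a `*core.BlockChain` and a directory; those are left out to keep the example self-contained.

```go
package main

import (
	"errors"
	"fmt"
)

// exportHistory is a hypothetical shared helper. It walks [first, last] in
// batches of step and asks filename to name the file for each batch.
func exportHistory(first, last, step uint64, filename func(epoch uint64) string) ([]string, error) {
	if step == 0 {
		return nil, errors.New("step must be non-zero")
	}
	if first > last {
		return nil, errors.New("first must not exceed last")
	}
	var files []string
	for batch := first; ; batch += step {
		files = append(files, filename(batch/step))
		if last-batch < step {
			break // the next batch would start past last
		}
	}
	return files, nil
}

// ExportHistoryEra1 selects the Era1 format by name, so no format enum has
// to be threaded through the call. The file name pattern is made up.
func ExportHistoryEra1(first, last, step uint64) ([]string, error) {
	return exportHistory(first, last, step, func(epoch uint64) string {
		return fmt.Sprintf("demo-%05d.era1", epoch)
	})
}

// ExportHistoryEraE does the same for EraE, again with a made-up file name.
func ExportHistoryEraE(first, last, step uint64) ([]string, error) {
	return exportHistory(first, last, step, func(epoch uint64) string {
		return fmt.Sprintf("demo-%05d.erae", epoch)
	})
}

func main() {
	files, err := ExportHistoryEraE(0, 16383, 8192)
	if err != nil {
		fmt.Println("export failed:", err)
		return
	}
	fmt.Println(files)
}
```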

lightclient · 2025-07-22T21:23:59Z

+	if f == Era1 {
+		filename = era.Filename
+		newBuilder = func(w io.Writer) any { return era.NewBuilder(w) }
+		add = func(b any, blk *types.Block, rcpt types.Receipts, td *big.Int) error {
+			return b.(*era.Builder).Add(blk, rcpt, td)
+		}
+	} else {
+		filename = era2.Filename
+		newBuilder = func(w io.Writer) any { return era2.NewBuilder(w) }
+		add = func(b any, blk *types.Block, rcpt types.Receipts, td *big.Int) error {
+			return b.(*era2.Builder).Add(*blk.Header(), *blk.Body(), rcpt, td, nil)
+		}
+	}


This is kind of impressive, but also is what an Interface is for :)

If you want different builders with the same methods (like Add) you can create the interface type and implement the method for both.
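For illustration, a minimal version of that idea with a cut-down `Block` type and two toy builders (all hypothetical; the real builders write e2store entries). The export loop depends only on the interface, so the `any` values, closures, and type assertions from the diff are no longer needed.

```go
package main

import "fmt"

// Block is a hypothetical stand-in for *types.Block.
type Block struct{ Number uint64 }

// Builder is the behaviour the export loop needs from either format.
type Builder interface {
	Add(b Block) error
}

// era1Builder and eraEBuilder are illustrative only; the real builders
// write e2store entries rather than appending to slices.
type era1Builder struct{ blocks []Block }

func (b *era1Builder) Add(blk Block) error {
	b.blocks = append(b.blocks, blk)
	return nil
}

type eraEBuilder struct{ numbers []uint64 }

func (b *eraEBuilder) Add(blk Block) error {
	b.numbers = append(b.numbers, blk.Number)
	return nil
}

// export works with any Builder, with no type assertions or closures.
func export(b Builder, blocks []Block) error {
	for _, blk := range blocks {
		if err := b.Add(blk); err != nil {
			return fmt.Errorf("add block %d: %w", blk.Number, err)
		}
	}
	return nil
}

func main() {
	blocks := []Block{{Number: 1}, {Number: 2}}
	for _, b := range []Builder{&era1Builder{}, &eraEBuilder{}} {
		if err := export(b, blocks); err != nil {
			fmt.Println("export failed:", err)
			return
		}
	}
	fmt.Println("exported with both builders")
}
```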

lightclient · 2025-07-22T21:24:26Z

-				receipts := bc.GetReceiptsByHash(block.Hash())
-				if receipts == nil {
-					return fmt.Errorf("export failed on #%d: receipts not found", n)
+				rcpt := bc.GetReceiptsByHash(blk.Hash())


Suggested change

-	rcpt := bc.GetReceiptsByHash(blk.Hash())
+	receipts := bc.GetReceiptsByHash(blk.Hash())

lightclient · 2025-07-22T21:24:32Z

+
+			for j := uint64(0); j < step && batch+j <= last; j++ {
+				n := batch + j
+				blk := bc.GetBlockByNumber(n)


Suggested change

-	blk := bc.GetBlockByNumber(n)
+	block := bc.GetBlockByNumber(n)

lightclient · 2025-07-22T21:24:38Z

-				)
-				if block == nil {
-					return fmt.Errorf("export failed on #%d: not found", n)
+			bldr := newBuilder(f)


Suggested change

-	bldr := newBuilder(f)
+	builder := newBuilder(f)

lightclient · 2025-07-22T21:28:56Z

+	tds      []*big.Int
+}
+
+// The offsets holds the offsets of the different block components in the e2store file. Eventually these offsets will be used to write the index table at the end of the file.


Suggested change

-// The offsets holds the offsets of the different block components in the e2store file. Eventually these offsets will be used to write the index table at the end of the file.
+// offsets holds the offsets of the different block components in the e2store file. Eventually these offsets will be used to write the index table at the end of the file.

lightclient · 2025-07-22T21:29:40Z

+	buf *bytes.Buffer
+
+	buff buffer


we still have buf and buff, I think we discussed removing buf since it isn't used until the end?

lightclient · 2025-07-22T21:32:36Z

+// Add writes a block entry, its receipts, and optionally its proofs as well into the e2store file.
+func (b *Builder) Add(header types.Header, body types.Body, receipts types.Receipts, td *big.Int, proof Proof) error {
+	if len(b.buff.headers) == 0 { // first block determines whether proofs are expected
+		b.expectsProofs = proof != nil


I don't think you need to track this explicitly: just check whether b.buff.proofs != nil while proof == nil, or vice versa. The only special case is the first block, where b.buff.proofs is allowed to be nil even if proof is non-nil.
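The check described here might look roughly like this. The `buffer` and `demoProof` types are hypothetical, and headers are plain byte slices to keep the sketch self-contained.

```go
package main

import (
	"errors"
	"fmt"
)

// Proof is a cut-down stand-in for the PR's proof interface.
type Proof interface {
	Variant() uint16
}

// demoProof is a hypothetical proof type used only for the example.
type demoProof struct{}

func (demoProof) Variant() uint16 { return 1 }

type buffer struct {
	headers [][]byte
	proofs  []Proof
}

// add derives "are proofs expected?" from the buffer itself instead of a
// separate expectsProofs flag. The first block is the only case where
// proofs may be nil while the incoming proof is non-nil.
func (b *buffer) add(header []byte, proof Proof) error {
	first := len(b.headers) == 0
	if !first && (b.proofs != nil) != (proof != nil) {
		return errors.New("inconsistent proofs: either every block has one or none does")
	}
	b.headers = append(b.headers, header)
	if proof != nil {
		b.proofs = append(b.proofs, proof)
	}
	return nil
}

func main() {
	b := &buffer{}
	fmt.Println(b.add([]byte{0x01}, demoProof{})) // first block, with proof
	fmt.Println(b.add([]byte{0x02}, nil))         // later block without proof is rejected
}
```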

lightclient · 2025-07-22T21:35:11Z

+		if err != nil {
+			return common.Hash{}, fmt.Errorf("compute accumulator: %w", err)
+		}
+		if n, err := b.w.Write(TypeAccumulatorRoot, accRoot[:]); err != nil {


looks like this still needs to be addressed

lightclient · 2025-07-22T21:36:33Z

+}
+
+// retrieves the raw body frame in bytes of a specific block
+func (e *Era) GetRawBodyFrameByNumber(blockNum uint64) ([]byte, error) {


I think you should decode the snappy here (and other methods like this) and return the RLP bytes

lightclient · 2025-07-24T19:11:35Z

+type Iterator interface {
+	Next() bool
+	Number() uint64
+	Block() (*types.Block, error)
+	Receipts() (types.Receipts, error)
+	Error() error
+}


This interface looks right, but it's in the wrong place. We should define it in the era package. I will take a stab at this.

lightclient · 2025-07-24T19:12:27Z

+func (era1Format) Filename(n string, e int, h common.Hash) string { return era.Filename(n, e, h) }
+func (era1Format) NewBuilder(w io.Writer) Builder                 { return &era1Builder{era.NewBuilder(w)} }
+func (era1Format) ReadDir(dir, net string) ([]string, error)      { return era.ReadDir(dir, net) }
+func (era1Format) NewIterator(f *os.File) (Iterator, error) {
+	e, err := era.From(f)
+	if err != nil {
+		return nil, err
+	}
+	return era.NewIterator(e)
+}


you shouldn't need to redefine these methods to satisfy the interface - if the type implements the interface methods then it can be accepted anywhere the interface is accepted

lightclient · 2025-07-24T19:53:11Z

 // starting from genesis. The assumption is held that the provided chain
 // segment in Era1 file should all be canonical and verified.
-func ImportHistory(chain *core.BlockChain, dir string, network string) error {
+func ImportHistory(chain *core.BlockChain, dir string, network string, format Format) error {


So on this method and ExportHistory we can avoid using this Format type by just passing in the functions we need. For example, here we need a way to read all the entries and a way to create an iterator from an open file. If we just pass those two functions in, we can create an if statement in the cli handling so that we can reuse the function.

The issue with format is it is kind of a superfluous type.
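A sketch of passing the two operations as functions instead of a `Format` value. The `Iterator` here is cut down to `Next`, files are identified by name rather than an open `*os.File`, and `sliceIterator` is a hypothetical in-memory stand-in; none of these match the real signatures.

```go
package main

import "fmt"

// Iterator is reduced to the one method this sketch needs.
type Iterator interface {
	Next() bool
}

// sliceIterator is a hypothetical in-memory iterator over a fixed number
// of blocks.
type sliceIterator struct{ remaining int }

func (it *sliceIterator) Next() bool {
	if it.remaining == 0 {
		return false
	}
	it.remaining--
	return true
}

// importHistory receives the two format-specific operations as plain
// functions, so it needs no Format type and can be reused for any format.
func importHistory(
	dir string,
	readDir func(dir string) ([]string, error),
	newIterator func(name string) (Iterator, error),
) (uint64, error) {
	names, err := readDir(dir)
	if err != nil {
		return 0, fmt.Errorf("read dir: %w", err)
	}
	var imported uint64
	for _, name := range names {
		it, err := newIterator(name)
		if err != nil {
			return imported, fmt.Errorf("open %s: %w", name, err)
		}
		for it.Next() {
			imported++
		}
	}
	return imported, nil
}

func main() {
	counts := map[string]int{"first": 2, "second": 3}
	readDir := func(string) ([]string, error) { return []string{"first", "second"}, nil }
	newIterator := func(name string) (Iterator, error) {
		return &sliceIterator{remaining: counts[name]}, nil
	}
	n, err := importHistory("demo-dir", readDir, newIterator)
	if err != nil {
		fmt.Println("import failed:", err)
		return
	}
	fmt.Println("imported blocks:", n)
}
```

The CLI handler would then pick the pair of functions in an if statement and call the same `importHistory` for either format.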


s1na · 2026-01-22T22:14:34Z

-	if err != nil {
+
+	var (
+		format     = ctx.String(utils.EraFormatFlag.Get(ctx))


Suggested change

-		format     = ctx.String(utils.EraFormatFlag.Get(ctx))
+		format     = ctx.String(utils.EraFormatFlag.Name)

s1na · 2026-01-22T22:15:16Z

Also seen this panic:

INFO [01-22|22:13:26.001] export progress                          exported=1,441,792 elapsed=3m9.464s
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x10 pc=0x13b779a]

goroutine 1 [running]:
math/big.(*Int).Set(...)
        math/big/int.go:97
github.com/ethereum/go-ethereum/internal/era/execdb.(*Builder).AddRLP(0xc00b68a8f0, {0xc007c2be00, 0x1fc, 0x1fc}, {0xc054b7e0f0, 0x3, 0x3}, {0xc054b7e0f3, 0x1, 0x1}, ...)
        github.com/ethereum/go-ethereum/internal/era/execdb/builder.go:158 +0x6fa
github.com/ethereum/go-ethereum/internal/era/execdb.(*Builder).Add(0xc00b68a8f0, 0xc054959cb0, {0x32ada80, 0x0, 0xc00089b808?}, 0x0)
        github.com/ethereum/go-ethereum/internal/era/execdb/builder.go:119 +0x6ed
github.com/ethereum/go-ethereum/cmd/utils.ExportHistory.func1({0xc053389ba0, 0x1f}, 0x20889d0, 0x162000, 0xc053b7cae8, 0xc00089b808, 0xc008d739c0, {0x7fff1bb876da, 0x3}, 0x20889d8, ...)
        github.com/ethereum/go-ethereum/cmd/utils/cmd.go:493 +0x7ce
github.com/ethereum/go-ethereum/cmd/utils.ExportHistory(0xc00089b808, {0x7fff1bb876da, 0x3}, 0x0, 0x4f25cf, 0x20889d0, 0x20889d8)
        github.com/ethereum/go-ethereum/cmd/utils/cmd.go:516 +0x890
main.exportHistory(0xc00047afc0)
        github.com/ethereum/go-ethereum/cmd/geth/chaincmd.go:583 +0x485
github.com/ethereum/go-ethereum/internal/flags.MigrateGlobalFlags.func2.1(0xc00047afc0)
        github.com/ethereum/go-ethereum/internal/flags/helpers.go:90 +0x34
github.com/urfave/cli/v2.(*Command).Run(0x3269ce0, 0xc00047afc0, {0xc0000c9270, 0x5, 0x5})
        github.com/urfave/cli/v2@v2.27.5/command.go:276 +0x7be
github.com/urfave/cli/v2.(*Command).Run(0xc00030c2c0, 0xc0001d9380, {0xc00003c080, 0x8, 0x8})
        github.com/urfave/cli/v2@v2.27.5/command.go:269 +0xa45
github.com/urfave/cli/v2.(*App).RunContext(0xc000316200, {0x22c4e10, 0x32ada80}, {0xc00003c080, 0x8, 0x8})
        github.com/urfave/cli/v2@v2.27.5/app.go:333 +0x5a5
github.com/urfave/cli/v2.(*App).Run(...)
        github.com/urfave/cli/v2@v2.27.5/app.go:307
main.main()
        github.com/ethereum/go-ethereum/cmd/geth/main.go:287 +0x45
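The top frames of the trace, `math/big.(*Int).Set` called from `AddRLP`, are consistent with `Set` being handed a nil `*big.Int`. The thread does not say which value was nil, so the following is only a sketch of the failure mode and of a guard, with hypothetical helper names (`copyTD`, `setPanics`):

```go
package main

import (
	"errors"
	"fmt"
	"math/big"
)

// copyTD copies a value, rejecting nil instead of letting big.Int.Set
// dereference a nil pointer.
func copyTD(td *big.Int) (*big.Int, error) {
	if td == nil {
		return nil, errors.New("total difficulty is nil")
	}
	return new(big.Int).Set(td), nil
}

// setPanics reports whether calling Set with the given argument panics.
func setPanics(td *big.Int) (panicked bool) {
	defer func() {
		if recover() != nil {
			panicked = true
		}
	}()
	new(big.Int).Set(td)
	return false
}

func main() {
	fmt.Println("Set(nil) panics:", setPanics(nil))
	if _, err := copyTD(nil); err != nil {
		fmt.Println("guarded copy:", err)
	}
}
```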

lightclient

LGTM!

s1na

LGTM

lightclient · 2026-02-09T15:29:47Z

Export on mainnet completed in ~22 hours, import in ~62 hours. Definitely would be good to look into speeding up import some, but overall this is ready to come into geth. Thanks @shazam8253 for your hard work on this last summer - sorry it took so long to finally get in! Thank you @s1na for pushing it over the edge. I've been treading water here for too long.

shazam8253 · 2026-02-09T18:09:37Z

@lightclient @s1na Thank you guys for pushing this through, super happy to see this! Had a great time this summer, and hopefully when I have more bandwidth I can make a PR again soon.

This PR allows users to prune their nodes up to the Prague fork. It indirectly depends on #32157 and can't really be merged before eraE files are widely available for download. The `--history.chain` flag becomes mandatory for `prune-history` command. Here I've listed all the edge cases that can happen and how we behave:

## prune-history Behavior

| From | To | Result |
|------|----|--------|
| full | postmerge | ✅ prunes |
| full | postprague | ✅ prunes |
| postmerge | postprague | ✅ prunes further |
| postprague | postmerge | ❌ can't unprune |
| any | all | ❌ use import-history |

## Node Startup Behavior

| DB State | Flag | Result |
|----------|------|--------|
| fresh | postprague | ✅ syncs from Prague |
| full | postprague | ❌ "run prune-history first" |
| postmerge | postprague | ❌ "run prune-history first" |
| postprague | postmerge | ❌ "can't unprune, use import-history or fix flag" |
| pruned | all | ✅ accepts known prune points |

MariusVanDerWijden reviewed Jul 7, 2025

lightclient reviewed Jul 9, 2025

lightclient self-assigned this Jul 10, 2025

MariusVanDerWijden reviewed Jul 14, 2025

lightclient reviewed Jul 17, 2025

lightclient reviewed Jul 22, 2025

lightclient reviewed Jul 24, 2025

lightclient requested a review from rjl493456442 as a code owner July 24, 2025 20:01

lightclient mentioned this pull request Aug 12, 2025

internal/era: refactor to use slices.Reverse #32399

Closed

lightclient force-pushed the era2implementation branch from 828377d to a85cbd5 Compare August 14, 2025 15:55

MariusVanDerWijden self-assigned this Aug 26, 2025

s1na force-pushed the era2implementation branch from 1c247de to b0dadc3 Compare January 19, 2026 21:52

s1na changed the title ~~Draft: New EraE implementation~~ internal/era: New EraE implementation Jan 20, 2026

shantichanal and others added 13 commits January 21, 2026 10:09

Working on implementation

ad36405

updated some things made a section writer

2609e64

finished builder

f5a274b

readers for single reads

bfb83f3

sequential access completed (without iterator)

730bbb7

simplified proof builder structure

4496825

adding testing and refining functions

fd11aa1

refactored and updated e2store framing for objects

65a0832

Updated with all comments.

02b439c

Fixed all issues including extra logic regarding proof types, modularizing some functions and refactoring code for correctness and readability.

Implemented the proof interface, refactored code, and implemented com…

ff0837a

…ments

formatting changes and lint

4acb3c4

internal/era2: add correct license headers

d3d99b0

added cmd

45f5b3e

cmd/utils: better variable naming

c967b10

s1na mentioned this pull request Jan 21, 2026

cmd/geth: add Prague pruning points #33657

Merged

Fix export history flag read

e6e4d7a

s1na reviewed Jan 22, 2026

s1na and others added 8 commits January 26, 2026 15:04

Fix difficulty check around transition

14f5a15

fix test

5cd8989

satisfy linter

438e4c5

fix era verify cmd

890745e

internal/era: add copyright text

efb2870

internal/era: remove extra comment in imports

2dafd06

cmd/utils: defer close in ImportHistory

9755543

internal/era: remove empty iterator file

223b5bc

lightclient previously approved these changes Jan 30, 2026

lightclient dismissed their stale review via 223b5bc January 30, 2026 19:52

lightclient previously approved these changes Jan 30, 2026

touch-ups

eda6bc4

s1na dismissed lightclient’s stale review via eda6bc4 February 2, 2026 12:09

s1na added 3 commits February 2, 2026 16:12

lint

adc2337

introduce slimReceipt type

ca967ad

fix export history for windows

a134b18

s1na approved these changes Feb 9, 2026

lightclient approved these changes Feb 9, 2026

lightclient merged commit c9b7ae4 into ethereum:master Feb 9, 2026
7 of 8 checks passed

lightclient added this to the 1.17.0 milestone Feb 9, 2026

s1na mentioned this pull request Feb 10, 2026

History expiry tracker #33809

Open

5 tasks

lightclient mentioned this pull request Feb 11, 2026

Add eraE file format eth-clients/e2store-format-specs#16

Open

