Add a new distributable metadata file for describing snapshots settings + define step size/frozen limit as first settings #18082

@wmitsuda

This is the same proposal that circulated internally some time ago. It is now being filed formally as an issue because we will need it if we are going to change the step size for ethmainnet in 3.4.


Proposal: erigondb.toml

Introduction

This document presents a proposal for a new metadata file: erigondb.toml.

Why?

The ongoing experiment of changing the step size gave us a tool for changing the DB geometry, but using it requires the step size and steps in frozen files parameters to be changed for the instrumented datadir.

To support that, we had to implement CLI parameters that allow those values to be overridden.

Problems with this approach:

  1. It requires the user to remember those parameters and add them to some script. Failing to pass them at startup may lead to DB corruption.
  2. Once the DB geometry of a datadir is changed, those parameters are permanent, so it makes more sense to persist them somewhere in the datadir.
  3. Tools that we are developing may rely on the DB geometry. If it is not persisted, the user has to supply it, which is error prone and may lead to DB corruption.

Proposal

Introduce a new metadata file called erigondb.toml at $DATADIR/snapshots. It is meant to be distributed by Ottersync, which is why it is stored under /snapshots and not at the datadir root.

This file is not meant to be modified by humans, but created by Erigon itself at datadir initialization and modified by tools that we develop to modify the DB geometry.

At first, there are 2 parameters that we want it to contain:

  1. Step size (number of txnums that compose 1 step)
  2. Steps in frozen files (number of steps that make up a frozen file; frozen files are not merged any further)

How does it work

If the datadir doesn't contain this file it means either:

  • New datadir
  • Existing legacy datadir

New datadir

The file is created with the default values of the binary that initializes the datadir. After that, Erigon binaries never assume defaults, because defaults can change across releases. It is safer to persist the binary's defaults as DB metadata at initialization and always read them back from the file.

Existing legacy datadir

For an existing datadir, not having that file present means it is an E3+ datadir created before the Erigon release that introduced this feature.

In this case the erigondb.toml file will also be created with the defaults. It is important to note that the plan is to roll out this feature ASAP, so that every Erigon user gets those settings persisted once some hardfork forces them to upgrade Erigon.

After 1 Erigon release + 1 Ethereum hardfork we can be sure every user has that file created in their datadir, and from then on we are safe to change the defaults if needed (e.g., if we realize smaller step sizes are better).

Questions

Why a file instead of DB?

A file is human readable (although this file should not be modified by humans). We can't store it in chaindata because chaindata is ephemeral.

Creating another MDBX DB just to store settings feels like overengineering, and its contents would be hard to inspect.

Possible expansions

Having such a file will make it easier to add finer-grained settings to our snapshots. For example, we may decide that only history should have a bigger merge limit, and we could allow that with a subsection in the .toml that specifies the finer settings.
