Fix and test fork revert logic #4198

Open
@michaelsproul

Description

Lighthouse's mechanism for recovering from missed hard forks is currently broken. It is implemented in the fork_revert module, but hasn't worked since Altair.

Part of the reason for this is that it is hard to test comprehensively. The ideal test depends on access to prior versions of Lighthouse which aren't available on CI (yet). The test looks something like this:

  1. Run a testnet with two types of nodes:
    • Canonical chain: latest Lighthouse version, and fork epoch set for the latest hard fork (e.g. Capella).
    • Stale chain: previous Lighthouse version and no fork epoch set.
  2. Wait until the testnet has advanced past the configured fork epoch. The canonical chain should continue (and finalize) with new blocks, while the stale chain also continues. The nodes will likely disconnect from each other on P2P.
  3. Shut down all the stale nodes and restart them with the latest version of Lighthouse and with the fork epoch configured. Ensure that they start without crashing and then sync back up to the canonical chain.
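
The condition that step 3 exercises can be sketched as a small model: a restarted node, now configured with the fork epoch, finds that its stored head was produced under a fork its new configuration does not expect at that slot. This is a minimal sketch, not Lighthouse code: `Fork`, `NodeConfig`, `Head` and `head_missed_fork` are hypothetical names introduced for illustration, and `SLOTS_PER_EPOCH` is assumed to be the mainnet value of 32.

```rust
/// Assumed mainnet value; a local testnet may use a smaller preset.
pub const SLOTS_PER_EPOCH: u64 = 32;

/// Hypothetical two-fork model (e.g. Previous = Bellatrix, Latest = Capella).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Fork {
    Previous,
    Latest,
}

/// Hypothetical node configuration; not a Lighthouse type.
#[derive(Debug, Clone, Copy)]
pub struct NodeConfig {
    /// Epoch at which the latest hard fork activates, or `None` if unscheduled.
    pub fork_epoch: Option<u64>,
}

impl NodeConfig {
    /// Fork that this configuration expects to be active at `slot`.
    pub fn fork_at_slot(&self, slot: u64) -> Fork {
        match self.fork_epoch {
            Some(epoch) if slot / SLOTS_PER_EPOCH >= epoch => Fork::Latest,
            _ => Fork::Previous,
        }
    }
}

/// Hypothetical summary of the head block stored on disk.
#[derive(Debug, Clone, Copy)]
pub struct Head {
    pub slot: u64,
    /// Fork under which the head block was produced.
    pub fork: Fork,
}

/// True if the stored head was built under a different fork than the
/// configuration expects at the head's slot, i.e. the node missed the fork.
pub fn head_missed_fork(config: &NodeConfig, head: &Head) -> bool {
    config.fork_at_slot(head.slot) != head.fork
}
```

In this model a stale node (`fork_epoch: None`) sees nothing wrong with its own head, while the same head checked against the upgraded configuration is flagged. That flagged case is the one in which the restarted node has to revert and resync.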

The test described here relies on a lot of local testnet infra which is currently not really up to scratch. We likely need changes like #3807 to land first so we can test these scenarios.

There's also a weaker guarantee that we can test on CI without access to prior versions: the current version of Lighthouse can revert a fork missed by that same version. We could likely test this as a beacon_chain test, and it would get us ~50% of the way towards a more robust fork_revert module.
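
As a rough illustration of what the single-version test would assert, the sketch below models a revert as truncating a chain to its last block before the fork boundary. It is a toy model under stated assumptions, not the fork_revert implementation: `Block`, `RevertError` and this `revert_to_fork_boundary` signature are hypothetical, and the chain is assumed to be ordered by increasing slot.

```rust
/// Toy block holding only its slot; not a Lighthouse type.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Block {
    pub slot: u64,
}

/// Hypothetical error for the toy model.
#[derive(Debug, PartialEq, Eq)]
pub enum RevertError {
    /// Every block in the chain is at or after the fork boundary.
    NoPreForkBlock,
}

/// Return the prefix of `chain` ending at the last block strictly before the
/// first slot of `fork_epoch`. Assumes `chain` is sorted by increasing slot;
/// skipped slots are allowed.
pub fn revert_to_fork_boundary(
    chain: &[Block],
    fork_epoch: u64,
    slots_per_epoch: u64,
) -> Result<&[Block], RevertError> {
    let fork_slot = fork_epoch * slots_per_epoch;
    // Count the leading blocks that were produced before the fork boundary.
    let keep = chain.iter().take_while(|b| b.slot < fork_slot).count();
    if keep == 0 {
        Err(RevertError::NoPreForkBlock)
    } else {
        Ok(&chain[..keep])
    }
}
```

A test in this style would build a chain that runs past the fork slot without the fork enabled, call the revert with the fork epoch set, and check that the new head is the last pre-fork block. A real beacon_chain test would also need to check that the node then syncs the canonical post-fork blocks, which this model does not cover.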

TODO

  • Fix fork_revert logic
  • Basic test (single version)
  • Comprehensive test (multiple versions)

Labels

  • enhancement: New feature or request
  • infra-ci
  • major-task: A significant amount of work or conceptual task.
