
Fix and test fork revert logic #5100


Draft · wants to merge 15 commits into base: unstable from fix_fork_revert

Conversation

@danielrachi1 (Contributor) commented Jan 20, 2024

Issue Addressed

#4198

Proposed Changes

wip

@dapplion dapplion added the work-in-progress PR is a work-in-progress label Jan 29, 2024
@danielrachi1 danielrachi1 force-pushed the fix_fork_revert branch 9 times, most recently from 0ccf80e to cdc8416 Compare February 15, 2024 00:18
@danielrachi1 danielrachi1 reopened this Mar 8, 2024
@danielrachi1 danielrachi1 force-pushed the fix_fork_revert branch 2 times, most recently from 4f1bcf1 to cfe51b1 Compare March 9, 2024 16:04
@danielrachi1 (Contributor, Author) commented Mar 13, 2024

I've made some progress on the local testnet infrastructure upgrades needed to solve this issue, but I've run into some problems and don't know where to go from here. With this comment I want to explain what I'm trying to do, in the hope that someone can point out the mistakes I'm making.

A tl;dr of the fork_revert.sh script:

  1. Start a local testnet with the default config (Deneb configured). This handles the initialization of the bootnodes and 4 validator nodes, and is done by calling start_local_testnet.sh.
  2. Download the last Lighthouse release that wasn't ready for Deneb: v4.6.0.
  3. Copy the testnet directory of the network started in step 1, and delete the lines that configure the Deneb hard fork from config.yaml. This will be the stale testnet directory.
  4. Start two nodes which:
    • run Lighthouse v4.6.0, and
    • have --testnet-dir set to the stale testnet directory.
  5. Wait for the network to reach the Deneb hard-fork slot (64).
  6. After some extra slots, shut the stale nodes down and restart them with:
    • the Lighthouse version being tested (the one installed with make, using the current source code), and
    • --testnet-dir set to the original testnet directory (the one where the Deneb lines were not deleted from config.yaml).
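Step 3 can be sketched roughly as below. The directory layout and the key names being removed (`DENEB_FORK_VERSION` / `DENEB_FORK_EPOCH`) follow the usual Lighthouse testnet `config.yaml` conventions but are assumptions here, so the sketch demonstrates against a throwaway directory rather than a real testnet:

```shell
set -euo pipefail

# Stand-in for the testnet directory created in step 1 (hypothetical values).
TESTNET_DIR="$(mktemp -d)"
cat > "$TESTNET_DIR/config.yaml" <<'EOF'
CAPELLA_FORK_VERSION: 0x03000001
CAPELLA_FORK_EPOCH: 1
DENEB_FORK_VERSION: 0x04000001
DENEB_FORK_EPOCH: 2
EOF

# Copy the directory, then drop every Deneb-related line so nodes pointed
# at the copy never learn about the fork. This is the "stale" testnet dir.
STALE_DIR="${TESTNET_DIR}-stale"
cp -r "$TESTNET_DIR" "$STALE_DIR"
sed -i '/^DENEB_/d' "$STALE_DIR/config.yaml"

cat "$STALE_DIR/config.yaml"
```

Only the Capella lines survive in the stale copy; the original directory keeps the Deneb configuration for step 6.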

This is pretty much the test that Michael described here: #4198 (comment). I think I'm close to getting it to work, but I'm getting some unexpected behavior.

After step 4, the non-stale nodes' logs will include:

Mar 13 21:35:31.501 INFO Ready for Capella                       info: ensure the execution endpoint is updated to the latest Capella/Shanghai release, service: slot_notifier
Mar 13 21:35:31.502 INFO Ready for Deneb                         info: ensure the execution endpoint is updated to the latest Deneb/Cancun release, service: slot_notifier

and the stale nodes will only include:

Mar 13 21:35:58.500 INFO Ready for Capella                       info: ensure the execution endpoint is updated to the latest Capella/Shanghai release, service: slot_notifier

With no mention of Deneb, as expected!

Once the network reaches slot 64, the non-stale nodes log:

Mar 13 21:37:45.002 INFO Transitioned to new fork                new_fork: Deneb, old_fork: Capella, service: network
Mar 13 21:37:45.002 INFO Updating the ENR fork version           next_fork_epoch: No other fork, next_fork_version: [4, 0, 0, 0], fork_digest: [199, 194, 191, 252], service: libp2p
Mar 13 21:37:45.002 INFO Network will unsubscribe from old fork gossip topics in a few epochs, remaining_epochs: 2, service: network
Mar 13 21:37:45.042 INFO New block received                      root: 0x0e033ee85db45e8bdd147aac51f9d58cfac351e3667f48e8967fc3cc4b50f7e4, slot: 64
Mar 13 21:37:46.501 INFO Synced                                  slot: 64, block: 0x0e03…f7e4, epoch: 2, finalized_epoch: 0, finalized_root: 0xd840…a551, exec_hash: 0xebde…e6bb (verified), peers: 4, service: slot_notifier

And the stale nodes will not log anything related to Deneb, but their updates will start looking like this:

Mar 13 21:37:43.501 WARN Syncing deposit contract block cache    est_blocks_remaining: 15966563, service: slot_notifier
Mar 13 21:37:46.500 INFO Synced                                  slot: 64, block:    …  empty, epoch: 2, finalized_epoch: 0, finalized_root: 0xd840…a551, exec_hash: 0xe1c7…f52c (verified), peers: 4, service: slot_notifier

I think this is the expected behavior: since they missed the fork, they keep advancing slots but don't get information about any new blocks.
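As an aside, the `fork_digest: [199, 194, 191, 252]` in the ENR update above is `0xc7c2bffc`, the same digest that prefixes the gossip topics the nodes subscribe to. Per the consensus spec, it is the first 4 bytes of `hash_tree_root(ForkData(current_version, genesis_validators_root))`. A minimal sketch of that derivation (the `genesis_validators_root` below is a placeholder, not this testnet's actual value, so the printed digest will differ):

```python
import hashlib

def compute_fork_digest(current_version: bytes, genesis_validators_root: bytes) -> bytes:
    """First 4 bytes of hash_tree_root(ForkData).

    For a two-field SSZ container of Bytes4 and Bytes32, the hash tree root
    reduces to sha256(pad32(current_version) || genesis_validators_root).
    """
    assert len(current_version) == 4 and len(genesis_validators_root) == 32
    chunk = current_version + b"\x00" * 28  # pad Bytes4 to a 32-byte chunk
    return hashlib.sha256(chunk + genesis_validators_root).digest()[:4]

# Deneb fork version from the log line: next_fork_version: [4, 0, 0, 0]
deneb_version = bytes([4, 0, 0, 0])
placeholder_root = b"\x00" * 32  # NOT the local testnet's genesis_validators_root
print(compute_fork_digest(deneb_version, placeholder_root).hex())
```

Because the digest commits to both the fork version and the genesis validators root, the stale nodes (which never learn the Deneb version) keep gossiping on the old Capella topics after the fork.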

After the timeout ends, they get updated and reconfigured. I was expecting to get something like this: #4188 (comment). But instead, they just... recover?

Mar 13 21:38:31.501 WARN Syncing deposit contract block cache    est_blocks_remaining: 15966563, service: slot_notifier
Mar 13 21:38:34.500 INFO Synced                                  slot: 80, block:    …  empty, epoch: 2, finalized_epoch: 0, finalized_root: 0xd840…a551, exec_hash: 0xe1c7…f52c (verified), peers: 4, service: slot_notifier
Mar 13 21:38:34.500 WARN Syncing deposit contract block cache    est_blocks_remaining: 15966563, service: slot_notifier
Mar 13 21:38:35.317 INFO Shutting down..                         reason: Success("Received SIGTERM")
Mar 13 21:38:35.319 INFO Saved DHT state                         service: network
Mar 13 21:38:35.319 INFO Network service shutdown                service: network
Mar 13 21:38:35.320 INFO Saved beacon chain to disk              service: beacon
tail: /home/daniel/.lighthouse/local-testnet/testnet/beacon_node_5.log: file truncated
./beacon_node.sh -d info /home/daniel/.lighthouse/local-testnet/node_5 9005 9105 8005 http://localhost:5005 /home/daniel/.lighthouse/local-testnet/geth_datadir5/geth/jwtsecret /home/daniel/.lighthouse/local-testnet/testnet >> /home/daniel/.lighthouse/local-testnet/testnet/beacon_node_5.log 2>&1
Mar 13 21:38:40.338 INFO Logging to file                         path: "/home/daniel/.lighthouse/local-testnet/node_5/beacon/logs/beacon.log"
Mar 13 21:38:40.349 INFO Lighthouse started                      version: Lighthouse/v5.1.1-d8ae9b2
Mar 13 21:38:40.349 INFO Configured for network                  name: custom (/home/daniel/.lighthouse/local-testnet/testnet)
Mar 13 21:38:40.349 INFO Data directory initialised              datadir: /home/daniel/.lighthouse/local-testnet/node_5
Mar 13 21:38:40.350 WARN Discv5 packet filter is disabled
Mar 13 21:38:40.350 WARN Running HTTP server on port 8005
Mar 13 21:38:40.354 INFO Deposit contract                        address: 0x4242424242424242424242424242424242424242, deploy_block: 0
Mar 13 21:38:40.557 INFO Blob DB initialized                     oldest_blob_slot: Some(Slot(64)), path: "/home/daniel/.lighthouse/local-testnet/node_5/beacon/blobs_db", service: freezer_db
Mar 13 21:38:40.564 INFO Starting beacon chain                   method: resume, service: beacon
Mar 13 21:38:40.578 INFO Block production enabled                method: json rpc via http, endpoint: Auth { endpoint: "http://localhost:5005/", jwt_path: "/home/daniel/.lighthouse/local-testnet/geth_datadir5/geth/jwtsecret", jwt_id: None, jwt_version: None }
Mar 13 21:38:40.653 INFO Beacon chain initialized                head_slot: 63, head_block: 0x8379…6e28, head_state: 0xb6ed…51ee, service: beacon
Mar 13 21:38:40.653 INFO Timer service started                   service: node_timer
Mar 13 21:38:40.653 INFO UPnP Attempting to initialise routes
Mar 13 21:38:40.654 INFO ENR Initialised                         quic6: None, quic4: Some(9105), udp6: None, tcp6: None, tcp4: Some(9005), udp4: Some(9005), ip4: Some(127.0.0.1), id: 0x485d..319b, seq: 6, enr: enr:-MS4QGVDtb3jBLz7JvfNHS8V7Ht0zm_iehTJKoIsTg1Jy3jxW4cPpoOqnP6sHDYVU3ZlZY8T5vE3Xe8ZiIoouuFF1wQGh2F0dG5ldHOIAAAAAAAAAACEZXRoMpDHwr_8BAAAAP__________gmlkgnY0gmlwhH8AAAGEcXVpY4IjkYlzZWNwMjU2azGhAjhD25Zn0y51D_OzyLkK-yUa7-rOLM3hZ0yN1rAkXjJfiHN5bmNuZXRzAIN0Y3CCIy2DdWRwgiMt, service: libp2p
Mar 13 21:38:40.655 INFO Libp2p Starting                         bandwidth_config: 3-Average, peer_id: 16Uiu2HAkyDJpny7vurf1tqaqACtnaiqxiwLL785HAb2J7MDp6UwY, service: libp2p
Mar 13 21:38:40.655 INFO Listening established                   address: /ip4/0.0.0.0/tcp/9005/p2p/16Uiu2HAkyDJpny7vurf1tqaqACtnaiqxiwLL785HAb2J7MDp6UwY, service: libp2p
Mar 13 21:38:40.655 INFO Listening established                   address: /ip4/0.0.0.0/udp/9105/quic-v1/p2p/16Uiu2HAkyDJpny7vurf1tqaqACtnaiqxiwLL785HAb2J7MDp6UwY, service: libp2p
Mar 13 21:38:40.655 INFO Deterministic long lived subnets enabled, subscription_duration_in_epochs: 256, subnets_per_node: 2, service: attestation_service
Mar 13 21:38:40.655 INFO Subscribing to long-lived subnets       subnets: [SubnetId(58), SubnetId(59)], service: attestation_service
Mar 13 21:38:40.657 INFO HTTP API started                        listen_address: 127.0.0.1:8005
Mar 13 21:38:40.658 INFO Execution engine online                 service: exec
Mar 13 21:38:40.658 INFO Issuing forkchoiceUpdated               forkchoice_state: ForkchoiceState { head_block_hash: 0xe1c77090c2c17822b8faa8cff2c0d6643a411b13f2516839e76a449b5e60f52c, safe_block_hash: 0x0000000000000000000000000000000000000000000000000000000000000000, finalized_block_hash: 0x0000000000000000000000000000000000000000000000000000000000000000 }, service: exec
Mar 13 21:38:40.686 INFO UPnP route established                  addr: /ip4/201.245.253.1/udp/9105/quic-v1, service: libp2p
Mar 13 21:38:43.501 WARN Low peer count                          peer_count: 0, service: slot_notifier
Mar 13 21:38:43.501 INFO Searching for peers                     current_slot: 83, head_slot: 63, finalized_epoch: 0, finalized_root: 0xd840…a551, peers: 0, service: slot_notifier
Mar 13 21:38:43.501 WARN Syncing deposit contract block cache    est_blocks_remaining: 15966563, service: slot_notifier
Mar 13 21:38:43.501 INFO Ready for Deneb                         info: ensure the execution endpoint is updated to the latest Deneb/Cancun release, service: slot_notifier
Mar 13 21:38:46.501 WARN Low peer count                          peer_count: 0, service: slot_notifier
Mar 13 21:38:46.501 INFO Searching for peers                     current_slot: 84, head_slot: 63, finalized_epoch: 0, finalized_root: 0xd840…a551, peers: 0, service: slot_notifier
Mar 13 21:38:46.501 WARN Syncing deposit contract block cache    est_blocks_remaining: 15966563, service: slot_notifier
Mar 13 21:38:46.501 INFO Ready for Deneb                         info: ensure the execution endpoint is updated to the latest Deneb/Cancun release, service: slot_notifier
Mar 13 21:38:49.500 WARN Low peer count                          peer_count: 0, service: slot_notifier
Mar 13 21:38:49.501 INFO Searching for peers                     current_slot: 85, head_slot: 63, finalized_epoch: 0, finalized_root: 0xd840…a551, peers: 0, service: slot_notifier
Mar 13 21:38:49.501 WARN Syncing deposit contract block cache    est_blocks_remaining: 15966563, service: slot_notifier
Mar 13 21:38:49.501 INFO Ready for Deneb                         info: ensure the execution endpoint is updated to the latest Deneb/Cancun release, service: slot_notifier
Mar 13 21:38:52.500 WARN Low peer count                          peer_count: 0, service: slot_notifier
Mar 13 21:38:52.500 INFO Searching for peers                     current_slot: 86, head_slot: 63, finalized_epoch: 0, finalized_root: 0xd840…a551, peers: 0, service: slot_notifier
Mar 13 21:38:52.501 WARN Syncing deposit contract block cache    est_blocks_remaining: 15966563, service: slot_notifier
Mar 13 21:38:52.501 INFO Ready for Deneb                         info: ensure the execution endpoint is updated to the latest Deneb/Cancun release, service: slot_notifier
Mar 13 21:38:55.501 WARN Low peer count                          peer_count: 0, service: slot_notifier
Mar 13 21:38:55.501 INFO Searching for peers                     current_slot: 87, head_slot: 63, finalized_epoch: 0, finalized_root: 0xd840…a551, peers: 0, service: slot_notifier
Mar 13 21:38:55.501 WARN Syncing deposit contract block cache    est_blocks_remaining: 15966563, service: slot_notifier
Mar 13 21:38:55.501 INFO Ready for Deneb                         info: ensure the execution endpoint is updated to the latest Deneb/Cancun release, service: slot_notifier
Mar 13 21:38:58.500 WARN Low peer count                          peer_count: 0, service: slot_notifier
Mar 13 21:38:58.501 INFO Searching for peers                     current_slot: 88, head_slot: 63, finalized_epoch: 0, finalized_root: 0xd840…a551, peers: 0, service: slot_notifier
Mar 13 21:38:58.501 WARN Syncing deposit contract block cache    est_blocks_remaining: 15966563, service: slot_notifier
Mar 13 21:38:58.501 INFO Ready for Deneb                         info: ensure the execution endpoint is updated to the latest Deneb/Cancun release, service: slot_notifier
Mar 13 21:39:01.501 WARN Low peer count                          peer_count: 0, service: slot_notifier
Mar 13 21:39:01.501 INFO Searching for peers                     current_slot: 89, head_slot: 63, finalized_epoch: 0, finalized_root: 0xd840…a551, peers: 0, service: slot_notifier
Mar 13 21:39:01.502 WARN Syncing deposit contract block cache    est_blocks_remaining: 15966563, service: slot_notifier
Mar 13 21:39:01.502 INFO Ready for Deneb                         info: ensure the execution endpoint is updated to the latest Deneb/Cancun release, service: slot_notifier
Mar 13 21:39:04.501 WARN Low peer count                          peer_count: 0, service: slot_notifier
Mar 13 21:39:04.501 INFO Searching for peers                     current_slot: 90, head_slot: 63, finalized_epoch: 0, finalized_root: 0xd840…a551, peers: 0, service: slot_notifier
Mar 13 21:39:04.501 WARN Syncing deposit contract block cache    est_blocks_remaining: 15966563, service: slot_notifier
Mar 13 21:39:04.501 INFO Ready for Deneb                         info: ensure the execution endpoint is updated to the latest Deneb/Cancun release, service: slot_notifier
Mar 13 21:39:07.501 WARN Low peer count                          peer_count: 0, service: slot_notifier
Mar 13 21:39:07.501 INFO Searching for peers                     current_slot: 91, head_slot: 63, finalized_epoch: 0, finalized_root: 0xd840…a551, peers: 0, service: slot_notifier
Mar 13 21:39:07.501 WARN Syncing deposit contract block cache    est_blocks_remaining: 15966563, service: slot_notifier
Mar 13 21:39:07.501 INFO Ready for Deneb                         info: ensure the execution endpoint is updated to the latest Deneb/Cancun release, service: slot_notifier
Mar 13 21:39:10.500 WARN Low peer count                          peer_count: 0, service: slot_notifier
Mar 13 21:39:10.500 INFO Searching for peers                     current_slot: 92, head_slot: 63, finalized_epoch: 0, finalized_root: 0xd840…a551, peers: 0, service: slot_notifier
Mar 13 21:39:10.501 WARN Syncing deposit contract block cache    est_blocks_remaining: 15966563, service: slot_notifier
Mar 13 21:39:10.501 INFO Ready for Deneb                         info: ensure the execution endpoint is updated to the latest Deneb/Cancun release, service: slot_notifier
Mar 13 21:39:10.656 INFO Sync state updated                      new_state: Synced, old_state: Stalled, service: sync
Mar 13 21:39:10.656 INFO Subscribed to topics                    topics: ["/eth2/c7c2bffc/blob_sidecar_0/ssz_snappy", "/eth2/c7c2bffc/blob_sidecar_1/ssz_snappy", "/eth2/c7c2bffc/blob_sidecar_2/ssz_snappy", "/eth2/c7c2bffc/blob_sidecar_3/ssz_snappy", "/eth2/c7c2bffc/blob_sidecar_4/ssz_snappy", "/eth2/c7c2bffc/blob_sidecar_5/ssz_snappy", "/eth2/c7c2bffc/bls_to_execution_change/ssz_snappy", "/eth2/c7c2bffc/sync_committee_contribution_and_proof/ssz_snappy", "/eth2/c7c2bffc/beacon_block/ssz_snappy", "/eth2/c7c2bffc/beacon_aggregate_and_proof/ssz_snappy", "/eth2/c7c2bffc/voluntary_exit/ssz_snappy", "/eth2/c7c2bffc/proposer_slashing/ssz_snappy", "/eth2/c7c2bffc/attester_slashing/ssz_snappy"], service: network
Mar 13 21:39:13.501 WARN Low peer count                          peer_count: 1, service: slot_notifier
Mar 13 21:39:13.501 INFO Synced                                  slot: 93, block:    …  empty, epoch: 2, finalized_epoch: 0, finalized_root: 0xd840…a551, exec_hash: 0xe1c7…f52c (verified), peers: 1, service: slot_notifier
Mar 13 21:39:13.501 WARN Syncing deposit contract block cache    est_blocks_remaining: 15966563, service: slot_notifier
Mar 13 21:39:13.501 INFO Ready for Deneb                         info: ensure the execution endpoint is updated to the latest Deneb/Cancun release, service: slot_notifier
Mar 13 21:39:16.500 WARN Low peer count                          peer_count: 1, service: slot_notifier
Mar 13 21:39:16.500 INFO Synced                                  slot: 94, block:    …  empty, epoch: 2, finalized_epoch: 0, finalized_root: 0xd840…a551, exec_hash: 0xe1c7…f52c (verified), peers: 1, service: slot_notifier
Mar 13 21:39:16.500 WARN Syncing deposit contract block cache    est_blocks_remaining: 15966563, service: slot_notifier
Mar 13 21:39:16.500 INFO Ready for Deneb                         info: ensure the execution endpoint is updated to the latest Deneb/Cancun release, service: slot_notifier
Mar 13 21:39:19.501 WARN Low peer count                          peer_count: 1, service: slot_notifier
Mar 13 21:39:19.501 INFO Synced                                  slot: 95, block:    …  empty, epoch: 2, finalized_epoch: 0, finalized_root: 0xd840…a551, exec_hash: 0xe1c7…f52c (verified), peers: 1, service: slot_notifier
Mar 13 21:39:19.501 WARN Syncing deposit contract block cache    est_blocks_remaining: 15966563, service: slot_notifier
Mar 13 21:39:19.501 INFO Ready for Deneb                         info: ensure the execution endpoint is updated to the latest Deneb/Cancun release, service: slot_notifier
Mar 13 21:39:22.501 WARN Low peer count                          peer_count: 1, service: slot_notifier
Mar 13 21:39:22.501 INFO Synced                                  slot: 96, block:    …  empty, epoch: 3, finalized_epoch: 0, finalized_root: 0xd840…a551, exec_hash: 0xe1c7…f52c (verified), peers: 1, service: slot_notifier
Mar 13 21:39:22.501 WARN Syncing deposit contract block cache    est_blocks_remaining: 15966563, service: slot_notifier
Mar 13 21:39:22.502 INFO Ready for Deneb                         info: ensure the execution endpoint is updated to the latest Deneb/Cancun release, service: slot_notifier
Mar 13 21:39:25.500 WARN Low peer count                          peer_count: 1, service: slot_notifier
Mar 13 21:39:25.501 INFO Synced                                  slot: 97, block:    …  empty, epoch: 3, finalized_epoch: 0, finalized_root: 0xd840…a551, exec_hash: 0xe1c7…f52c (verified), peers: 1, service: slot_notifier
Mar 13 21:39:25.501 WARN Syncing deposit contract block cache    est_blocks_remaining: 15966563, service: slot_notifier
Mar 13 21:39:25.501 INFO Ready for Deneb                         info: ensure the execution endpoint is updated to the latest Deneb/Cancun release, service: slot_notifier
Mar 13 21:39:28.500 WARN Low peer count                          peer_count: 1, service: slot_notifier
Mar 13 21:39:28.500 INFO Synced                                  slot: 98, block:    …  empty, epoch: 3, finalized_epoch: 0, finalized_root: 0xd840…a551, exec_hash: 0xe1c7…f52c (verified), peers: 1, service: slot_notifier
Mar 13 21:39:28.501 WARN Syncing deposit contract block cache    est_blocks_remaining: 15966563, service: slot_notifier
Mar 13 21:39:28.501 INFO Ready for Deneb                         info: ensure the execution endpoint is updated to the latest Deneb/Cancun release, service: slot_notifier
Mar 13 21:39:31.259 INFO Sync state updated                      new_state: Syncing Head Chain, old_state: Synced, service: sync
Mar 13 21:39:31.501 INFO Syncing                                 est_time: --, distance: 36 slots (1 min), peers: 2, service: slot_notifier
Mar 13 21:39:31.501 WARN Syncing deposit contract block cache    est_blocks_remaining: 15966563, service: slot_notifier
Mar 13 21:39:31.501 INFO Ready for Deneb                         info: ensure the execution endpoint is updated to the latest Deneb/Cancun release, service: slot_notifier
Mar 13 21:39:31.890 INFO Sync state updated                      new_state: Synced, old_state: Syncing Head Chain, service: sync
Mar 13 21:39:33.053 INFO New block received                      root: 0xf9da5ed2d34ce60f36d1fed9312700484412c69ca487d20b74c339a820920454, slot: 100
Mar 13 21:39:34.501 INFO Synced                                  slot: 100, block: 0xf9da…0454, epoch: 3, finalized_epoch: 0, finalized_root: 0xd840…a551, exec_hash: 0xe0a0…e757 (verified), peers: 2, service: slot_notifier

I guess I'm misunderstanding how the fork transition is defined, and deleting the Deneb lines from config.yaml is not enough. But I don't know how else I could do it.

Is there something obviously wrong with my approach?

@michaelsproul (Member) commented:

@danielramirezch I think the missing ingredient is a block on the stale chain for the wrong fork version. That's what will mess up the DB badly, because the head block will be at the wrong fork. If there are just empty slots on the stale chain then there's nothing in need of reversion.

Labels: work-in-progress (PR is a work-in-progress)

3 participants