network: Investigate sudden sync peer count drops #5236

@lexnv

Investigate sudden sync peer count drops.
[Image: Screenshot 2024-08-05 at 12 19 43]

Over the past few days, the libp2p node (yellow) has been more susceptible to peer count drops than the litep2p node (green).

This may be a side effect of litep2p submitting far more Kademlia random queries (1.4k vs 8 for libp2p) to keep a healthy view of the network:
[Image: Screenshot 2024-08-05 at 11 42 44]

Libp2p logs:

2024-08-03 03:06:51.000 ERROR tokio-runtime-worker beefy: 🥩 Error: ConsensusReset. Restarting voter.    
2024-08-03 03:06:51.033  INFO tokio-runtime-worker sub-libp2p: 🔍 Discovered new external address for our node: /ip4/62.210.158.17/tcp/56904/p2p/12D3KooWHWtiKtyve8W2DM9J2i5t6omLVurJugjKGijikon1SqeL    
2024-08-03 03:06:52.800  INFO tokio-runtime-worker substrate: 💤 Idle (51 peers), best: #24317220 (0x7649…c2c7), finalized #24317216 (0xb9e7…97f4), ⬇ 2.6MiB/s ⬆ 937.8kiB/s    
2024-08-03 03:06:53.670  INFO tokio-runtime-worker sub-libp2p: 🔍 Discovered new external address for our node: /ip4/62.210.158.17/tcp/51336/p2p/12D3KooWHWtiKtyve8W2DM9J2i5t6omLVurJugjKGijikon1SqeL    
2024-08-03 03:06:53.816  INFO tokio-runtime-worker sub-libp2p: 🔍 Discovered new external address for our node: /ip4/62.210.158.17/tcp/53266/p2p/12D3KooWHWtiKtyve8W2DM9J2i5t6omLVurJugjKGijikon1SqeL    
2024-08-03 03:06:54.395  INFO tokio-runtime-worker substrate: 🏆 Imported #24317221 (0x7649…c2c7 → 0x1792…fe76)    
2024-08-03 03:06:56.737  INFO tokio-runtime-worker sub-libp2p: 🔍 Discovered new external address for our node: /ip6/2001:bc8:701:700:3eec:efff:feff:183c/tcp/57534/p2p/12D3KooWHWtiKtyve8W2DM9J2i5t6omLVurJugjKGijikon1SqeL    
2024-08-03 03:06:57.456  INFO tokio-runtime-worker sub-libp2p: 🔍 Discovered new external address for our node: /ip4/62.210.158.17/tcp/43108/p2p/12D3KooWHWtiKtyve8W2DM9J2i5t6omLVurJugjKGijikon1SqeL    
2024-08-03 03:06:57.800  INFO tokio-runtime-worker substrate: 💤 Idle (51 peers), best: #24317221 (0x1792…fe76), finalized #24317217 (0xb2dd…2d65), ⬇ 3.8MiB/s ⬆ 1.1MiB/s    
2024-08-03 03:06:59.917  INFO tokio-runtime-worker sub-libp2p: 🔍 Discovered new external address for our node: /ip4/62.210.158.17/tcp/35934/p2p/12D3KooWHWtiKtyve8W2DM9J2i5t6omLVurJugjKGijikon1SqeL    
2024-08-03 03:07:00.486  INFO tokio-runtime-worker substrate: 🏆 Imported #24317222 (0x1792…fe76 → 0x9626…cfa5)    
2024-08-03 03:07:01.962  INFO tokio-runtime-worker sub-libp2p: 🔍 Discovered new external address for our node: /ip6/2001:bc8:701:700:3eec:efff:feff:183c/tcp/57784/p2p/12D3KooWHWtiKtyve8W2DM9J2i5t6omLVurJugjKGijikon1SqeL    
2024-08-03 03:07:02.136  INFO tokio-runtime-worker beefy: 🥩 BEEFY gadget waiting for BEEFY pallet to become available...    
2024-08-03 03:07:02.137  INFO tokio-runtime-worker beefy: 🥩 BEEFY pallet available: block 24316220 beefy genesis 21943872    
2024-08-03 03:07:02.798  INFO tokio-runtime-worker sub-libp2p: 🔍 Discovered new external address for our node: /ip4/62.210.158.17/tcp/39836/p2p/12D3KooWHWtiKtyve8W2DM9J2i5t6omLVurJugjKGijikon1SqeL    
2024-08-03 03:07:02.807  INFO tokio-runtime-worker substrate: 💤 Idle (27 peers), best: #24317222 (0x9626…cfa5), finalized #24317219 (0x20bc…fadc), ⬇ 4.6MiB/s ⬆ 3.0MiB/s    

No banned peers were reported during the peer count drops.
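
To locate drops like the 51 → 27 one in the logs above without reading them by hand, something along these lines could help. It is only a sketch: `peer_count` and `peer_drops` are hypothetical helper names, not part of the node, and it assumes the informant line keeps the `(N peers)` shape shown in the logs.

```rust
/// Extracts the peer count from an informant line such as
/// "... Idle (51 peers), best: ...". Returns None for any other line.
/// Assumption: the count is always printed as "(N peers)".
fn peer_count(line: &str) -> Option<u32> {
    let end = line.find(" peers)")?;
    let start = line[..end].rfind('(')? + 1;
    line[start..end].parse().ok()
}

/// Returns (previous, current) pairs where the peer count fell by at
/// least `min_drop` between two consecutive informant lines.
/// `min_drop` is a hypothetical threshold and should be greater than 0.
fn peer_drops(log: &str, min_drop: u32) -> Vec<(u32, u32)> {
    let counts: Vec<u32> = log.lines().filter_map(peer_count).collect();
    counts
        .windows(2)
        .filter(|w| w[0].saturating_sub(w[1]) >= min_drop)
        .map(|w| (w[0], w[1]))
        .collect()
}
```

Calling `peer_drops` on the full log text with a threshold such as 10 would report each pair of consecutive informant lines between which at least that many peers were lost.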

In one instance, I saw a large gap in the logs produced by libp2p (around 2 minutes, iirc).
This could be caused by a subsystem performing a blocking task inside an async function that is expected to be non-blocking.
Prometheus reports metrics at 15-second intervals by default, so we may drop all connected peers and then slowly reconnect to the network within a single interval without the metrics showing the full drop.
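
To check how often such log gaps occur, the timestamps can be compared line by line. The following is a rough sketch, not something the node provides: `millis_of_day` and `log_gaps` are hypothetical names, the threshold is arbitrary, and it assumes every line starts with a `YYYY-MM-DD HH:MM:SS.mmm` timestamp and that the log does not cross midnight.

```rust
/// Parses the leading "YYYY-MM-DD HH:MM:SS.mmm" timestamp of a log line
/// into milliseconds since midnight. The date part is ignored, so this
/// only works for logs that stay within a single day (assumption).
fn millis_of_day(line: &str) -> Option<u64> {
    let time = line.split_whitespace().nth(1)?; // e.g. "03:06:51.000"
    let (hms, millis) = time.split_once('.')?;
    let mut parts = hms.split(':').map(|p| p.parse::<u64>().ok());
    let h = parts.next()??;
    let m = parts.next()??;
    let s = parts.next()??;
    let ms: u64 = millis.parse().ok()?;
    Some(((h * 60 + m) * 60 + s) * 1000 + ms)
}

/// Returns (line_index, gap_ms) for every line that follows a silence of
/// at least `threshold_ms`. Lines without a parsable timestamp are skipped.
fn log_gaps(log: &str, threshold_ms: u64) -> Vec<(usize, u64)> {
    let mut gaps = Vec::new();
    let mut prev: Option<u64> = None;
    for (i, line) in log.lines().enumerate() {
        if let Some(now) = millis_of_day(line) {
            if let Some(before) = prev {
                let gap = now.saturating_sub(before);
                if gap >= threshold_ms {
                    gaps.push((i, gap));
                }
            }
            prev = Some(now);
        }
    }
    gaps
}
```

A gap reported here that lines up with a peer count drop would support the blocking-task theory, while drops without any gap would point elsewhere.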
