
Proposed attnets revamp #2749

Open

@djrtwo

Description

Attnets revamp

Since the launch of the beacon chain, the "backbone" of attestation subnets (attnets) has relied upon staking nodes connecting to a random set of subnets. The size of this random set is dictated by the quantity of validators attached to the node, up to the maximum of 64 (ATTESTATION_SUBNET_COUNT, which maps to the expected SHARD_COUNT). The general idea at genesis was that a node's validator requirements would scale linearly once sharding is released, so we could treat this linear subnet requirement for subnet backbones as an "honesty" assumption until sharding comes around.

An attestation subnet backbone is required so that at each epoch, validators can quickly find and publish to their assigned subnet. If there were no notion of persistence in these subnets, then there would be no subnet to "find" in the ~6 minute window, and thus no clear place to publish individual attestations before they are aggregated.

Backbone requirements:

  • Subnets are relatively stable, slot-to-slot and epoch-to-epoch, allowing for reliable dissemination of messages on demand
  • Subnet entry points can be found in a relatively short time (< 1 minute) and with low search overhead (within a few hops of the DHT)
  • (Nice to have) A method to discern whether a node is "honestly" performing the expected backbone duty

Problems

There are a few issues with the current structure:

  1. This likely creates overly populated subnets in practice, increasing the network's total bandwidth consumption with little to no gain
  2. This relies on an unenforceable "honesty" of validator nodes, when the rational behavior is to turn your attnets down to 1 or even 0.
  3. As non-staking (user) nodes come online in greater numbers, such 0-attnet nodes will crowd the DHT, making the task of finding peers on a particular subnet increasingly difficult. In the event that user nodes outpace staking nodes 10-to-1 (a situation that should be applauded!), finding attestation backbones would become 10x more difficult.

Proposed solution

In an effort to solve the above issues, we propose:

  • Remove random subnets per validator
  • Add a single deterministic subnet per node, as a function of the node_id and epoch

Rather than putting the backbone requirement on a brittle validator-honesty assumption, this puts the backbone requirement on the entire network of full nodes such that, on average, one out of every ATTESTATION_SUBNET_COUNT nodes will be on a particular subnet.

This means that the size of subnets becomes a function of the total number of nodes on the network, rather than of staking node count combined with validator-per-node density (e.g., a network of 10,000 nodes yields roughly 156 backbone nodes per subnet). We thus simultaneously reduce the overpopulation of attnets by a small number of staking nodes and ensure that even if the Ethereum network (and DHT) grows by orders of magnitude, attnets can still be found within a few hops.
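As a minimal sketch of what such a mapping could look like (assuming ATTESTATION_SUBNET_COUNT = 64, an illustrative rotation period of 225 epochs, and a hash-based mixing step; the constants and the hashing choice here are hypothetical, not final spec):

```python
from hashlib import sha256

ATTESTATION_SUBNET_COUNT = 64
EPOCHS_PER_SUBNET_ROTATION = 225  # ~1 day at 6.4-minute epochs (illustrative)


def compute_subnet_from_node_id(node_id: int, epoch: int) -> int:
    """Map a node to one deterministic subnet, rotating on the order of ~1 day.

    Hashing (node_id, rotation_period) together reshuffles the whole
    network each period, keeping subnet populations statistically even.
    """
    rotation_period = epoch // EPOCHS_PER_SUBNET_ROTATION
    digest = sha256(
        node_id.to_bytes(32, "big") + rotation_period.to_bytes(8, "big")
    ).digest()
    return int.from_bytes(digest[:8], "big") % ATTESTATION_SUBNET_COUNT
```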

Additionally, because the requisite subnet subscription is a function of a node's node_id, honesty with respect to attnet backbone duty can be deterministically assessed, allowing for downscoring and disconnection of dishonest peers.
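For example, a scoring hook might look like the following (a sketch building on the compute_subnet_from_node_id sketch above; the peer object and its fields are hypothetical stand-ins for whatever state a client tracks per connection):

```python
def performs_backbone_duty(peer, current_epoch: int) -> bool:
    """Check whether a peer is subscribed to its assigned backbone subnet."""
    expected_subnet = compute_subnet_from_node_id(peer.node_id, current_epoch)
    return expected_subnet in peer.subscribed_attnets


# A client MAY downscore (and eventually disconnect) peers failing the check:
# if not performs_backbone_duty(peer, current_epoch):
#     peer_scorer.penalize(peer, reason="not subscribed to assigned attnet")
```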

The downside to this approach is that it puts a minimum of one attnet of load on every node rather than just on staking nodes. In our estimation, though, this is not a very high burden with respect to home node resources, and the benefit in meeting the backbone requirements far outweighs the cost.

Concrete spec mods

Remove random subscriptions

Add node-id subscription

  • Create a function compute_subnet_from_node_id(node_id) -> uint64 that takes in a node_id and returns a value in [0, ATTESTATION_SUBNET_COUNT). Consider an epoch parameter that causes these subscriptions to slowly rotate, on the order of ~1 day (as sketched above)
  • Add "MAY downscore peers that do not actively subscribe/participate in their currently assigned subnet based on compute_subnet_from_node_id"
  • In the Lookahead section, replace the attnets DHT search with a compute_subnet_from_node_id DHT search (see the sketch below)
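For that Lookahead search, instead of reading an attnets bitfield from discovered ENRs, a client could filter discovery results through the deterministic mapping. A sketch, assuming a discv5 random walk that yields candidate node_ids (the iterator is hypothetical):

```python
def find_backbone_peers(target_subnet: int, epoch: int,
                        discovered_node_ids, max_peers: int = 5) -> list:
    """Filter DHT discovery results down to nodes assigned to target_subnet.

    On average, 1 in ATTESTATION_SUBNET_COUNT discovered nodes will match,
    so a modest random walk suffices to find backbone entry points.
    """
    matches = []
    for node_id in discovered_node_ids:
        if compute_subnet_from_node_id(node_id, epoch) == target_subnet:
            matches.append(node_id)
            if len(matches) >= max_peers:
                break
    return matches
```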

Strategy

We'd likely want to simulate and test this change in strategy in a controlled environment before pushing this to testnets and then mainnet.

Such a controlled environment to test gossipsub at scale seems critical to a number of the network optimization investigations underway.

Protocol Labs' Testground could be a good candidate. Alternatively, another simulation framework, or even spinning up 1k+ node distributed networks for ~1 day tests, could also be a viable path.

EDITED TO USE node_id instead of peer_id per discussions below
