P2P / Networking — Implementation of Perigee p2p Routing. #796

D-Stacks wants to merge 74 commits into kaspanet:master
This will probably be my last update until I get further reviews and input, and I'll also open it up for review.

What's Changed (perigee-dev → perigee):
- Breaking Change: Address DB Update
- User-Facing Changes (CLI/Args)
- Other Notable Changes

If you're upgrading, don't forget:
For the main reference, the corresponding paper can be found here: https://arxiv.org/abs/2006.14186

Please note the comment here: #796 (comment)
[open for review]
This branch is a "more" stable and tested version. I also maintain a dev branch here: https://github.com/D-Stacks/rusty-kaspa/tree/perigee-dev. As a general disclaimer, both are currently considered beta and work-in-progress; in the dev branch things may change and break more quickly, and I won't keep an updated change-log. (The dev branch has since been merged into this one.)

The implementation scores neighbors via joint subset scoring. Using the empirical 90th percentile (tie-breaking on the 95th and 97.5th), the top perigee peer is chosen. The individual delay scores on which that peer was the top performer are then removed from subsequent scoring, and the remaining peers are re-rated on the delay scores that remain. This process is rinsed and repeated until the specified number of peers to leverage has been chosen. In cases where the algorithm cannot find a new best peer for the subset, the loop restarts from scratch. This ensures that the leveraged perigee peers are chosen holistically, minimizing delay to different parts of the block-producing network, rather than being scored individually with overlap with respect to one another.
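As I read the subset-scoring loop described above, it can be sketched roughly as follows. All names are illustrative, the percentile is a simplified nearest-rank variant, and the restart-from-scratch case is replaced by a plain stop, since a deterministic toy would otherwise loop forever; the real implementation may differ.

```rust
use std::collections::HashSet;

/// Nearest-rank empirical percentile (lower delay = better).
fn percentile(delays: &[u64], p: f64) -> u64 {
    let mut v = delays.to_vec();
    v.sort_unstable();
    if v.is_empty() {
        return u64::MAX;
    }
    let idx = ((p / 100.0) * (v.len() as f64 - 1.0)).round() as usize;
    v[idx]
}

/// Joint subset scoring: `delays[peer][block]` holds per-block delays
/// (u64::MAX for missing timestamps). Returns the `k` leveraged peers.
fn select_subset(delays: &[Vec<u64>], k: usize) -> Vec<usize> {
    let n_blocks = delays.first().map_or(0, |d| d.len());
    let mut chosen: Vec<usize> = Vec::new();
    let mut excluded: HashSet<usize> = HashSet::new(); // blocks removed from scoring
    while chosen.len() < k.min(delays.len()) {
        // Score each unchosen peer on the 90th percentile of the
        // still-included blocks, tie-breaking on the 95th and 97.5th.
        let mut best: Option<(usize, (u64, u64, u64))> = None;
        for (peer, row) in delays.iter().enumerate() {
            if chosen.contains(&peer) {
                continue;
            }
            let remaining: Vec<u64> = row
                .iter()
                .enumerate()
                .filter(|(b, _)| !excluded.contains(b))
                .map(|(_, &d)| d)
                .collect();
            let score = (
                percentile(&remaining, 90.0),
                percentile(&remaining, 95.0),
                percentile(&remaining, 97.5),
            );
            if best.map_or(true, |(_, s)| score < s) {
                best = Some((peer, score));
            }
        }
        match best {
            Some((peer, _)) => {
                // The blocks where the chosen peer was the top performer are
                // removed from subsequent scoring of the remaining peers.
                for b in 0..n_blocks {
                    if (0..delays.len()).min_by_key(|&p| delays[p][b]) == Some(peer) {
                        excluded.insert(b);
                    }
                }
                chosen.push(peer);
            }
            // The real code restarts the loop from scratch here; in this
            // deterministic toy we simply stop to avoid looping forever.
            None => break,
        }
    }
    chosen
}
```

Note how the exclusion set is what makes the selection holistic: a peer that is merely second-fastest everywhere the first pick was fastest gains nothing from that overlap.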
Rounds are encapsulated within the connection manager's event loop, which gives them a granularity of 30 seconds; specifying `--blk-perigee-duration=x` will trigger rounds every x seconds (clamped between 30 and 300 and rounded to the nearest 30 seconds). Additionally, if for whatever reason the number of perigee peers overflows the perigee peer target, trimming logic is applied within the connection manager's handle_outbound_connections to ensure target consistency.

Delays are measured with respect to the node's first-seen timestamp. The paper also offers a method of using the block creation timestamp, which is not the method used here.
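The clamping and rounding described above could look roughly like this (the function name is illustrative, not the actual kaspad code):

```rust
/// Normalize a requested perigee round duration: round to the nearest
/// 30-second tick, then clamp to the allowed [30, 300] second range.
fn normalize_round_duration(requested_secs: u64) -> u64 {
    let rounded = ((requested_secs + 15) / 30) * 30; // nearest 30s tick
    rounded.clamp(30, 300)
}
```

Since both bounds are themselves multiples of 30, rounding before clamping cannot push the result out of range.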
Some delays might be calculated erroneously, specifically at perigee round boundaries: for example, block hash A might be registered from peer A in perigee round A, while peer B might send it in perigee round B. As a rule, perigee only evaluates blocks that were consensus-verified within the round; timestamps that land in the wrong round are ignored and, as with all missing timestamps, are evaluated as `u64::MAX`. In practice this can probably be dismissed as mere "noise", since such values are filtered out by the use of empirical percentiles as the scoring mechanism.

Parameters such as the number of perigee peers (`--blk-perigee-peers=<usize>`), the exploration rate (`--blk-perigee-exploration=<usize>`), and the leverage rate (`--blk-perigee-leverage=<usize>`) are all provided via the command line, not hard-coded. Specifying values which do not leave enough exploration or leverage to maintain perigee will cause a panic. The default values for leverage and exploration are 50% and 25% of the perigee peers, respectively, and are therefore guaranteed to work.

Current routing can be (even partially) preserved: specifying `--blk-perigee-peers=<usize>` < `--outpeers=<usize>` will leave the remaining peers under the current random-graph routing paradigm, using both routings side by side.

Statistics are exposed via `--blk-perigee-statistics`; this also automatically logs random-graph-routed peer timestamps and compares them to the perigee outbound peers, for the sake of fair statistical comparison. Ideally a node should be run in this scenario with `blk-perigee-peers == outpeers / 2`. In current field testing I am seeing around a 30-80% decrease in observed delay times of signaled blocks compared to random-graph routing. This might decrease significantly if perigee is more widely adopted (as more perigee nodes will compete for, and cluster around, well-connected and block-producing nodes), or, as in the case of public nodes with a lot of inbound peers, a larger sample size of first-seen timestamps can be used to calculate delays.
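To illustrate the delay bookkeeping described above: a block's delay for a peer is the peer's report timestamp minus the node's first-seen timestamp, a missing or wrong-round timestamp scores as `u64::MAX`, and the nearest-rank empirical percentile discards such entries as tail noise. The helper names below are my own, not the actual kaspad API.

```rust
/// A block's delay for a peer: the peer's report timestamp minus the
/// node's first-seen timestamp; missing or wrong-round reports score
/// as u64::MAX. (Illustrative names, not the actual kaspad API.)
fn block_delay(first_seen_ms: u64, peer_ts_ms: Option<u64>) -> u64 {
    match peer_ts_ms {
        Some(ts) => ts.saturating_sub(first_seen_ms),
        None => u64::MAX,
    }
}

/// Nearest-rank empirical percentile: at p = 90, up to roughly 10% of
/// u64::MAX entries (missing reports) are discarded as tail noise.
fn empirical_percentile(delays: &mut [u64], p: f64) -> u64 {
    if delays.is_empty() {
        return u64::MAX;
    }
    delays.sort_unstable();
    let idx = ((p / 100.0) * (delays.len() as f64 - 1.0)).round() as usize;
    delays[idx]
}
```

For example, one `u64::MAX` entry among eleven samples never reaches the 90th-percentile rank, which is exactly the desensitization to tail-end noise the scoring relies on.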
Some performance characteristics I gathered: on my machine with the start args
`./kaspad --loglevel=info,kaspa_perigeemanager=debug --outpeers=16 --blk-perigee-peers=8 --blk-perigee-stats --blk-perigee-duration=60`, a typical perigee round with about 600 timestamps (roughly 1 minute worth of data and 8 peers) executes in about 1-2 ms, with virtually all time spent on building the peer table. Compared to the other handling performed alongside this in the connection manager, such as establishing new connections, this pales in comparison.

As a side product of implementing perigee, I had to categorize outbound peers accordingly. This means that each outgoing connection to a new peer now holds an enum, in both the Router and the Peer struct, according to its expected connection and routing behavior:
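A hypothetical sketch of such a categorization (the variant names here are my own and may not match the PR's actual enum):

```rust
/// Hypothetical categorization of outbound peers; the actual variant
/// names in the PR may differ.
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
enum OutboundKind {
    /// Chosen by the existing random-graph routing.
    RandomGraph,
    /// Chosen and scored by the perigee manager.
    Perigee,
    /// Explicitly specified by the user; `is_permanent` controls
    /// persistence behavior across disconnects.
    UserSpecified { is_permanent: bool },
}

/// Per the handling described in this PR, user-specified peers do not
/// count towards the outbound limit (illustrative helper).
fn counts_towards_outbound_limit(kind: OutboundKind) -> bool {
    !matches!(kind, OutboundKind::UserSpecified { .. })
}
```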
One caveat here is that I needed to categorize user-specified peers. These do not technically fall under either the random-graph or perigee peer routing mechanism and, depending on whether they are specified as `is_permanent` or not, display different behavior regarding persistence. Currently, when these peers are added to the node they occupy slots beyond the outbound limit, but when another connection is lost they replace the lost connection from the outbound limit's point of view. As I was unsure how to handle these connections without altering or adding a bunch of new args, these peers now do not count towards the outbound limit. This may not be the desired handling, but it is a change in this PR currently.

I also did a rework of the connection handler, mostly due to a bunch of inconsistencies and performance considerations that only really take effect when specifying hundreds of outbound peers, as some mining nodes might. Handling now works in a congruent fashion: if handling takes longer than the tick timer, it is immediately re-initiated rather than delayed further. Adding peers also no longer triggers handling of outbound and inbound peers, which seemed erroneous. An extensive refactor was also necessary to incorporate the perigee manager.
Regarding storage: first-seen timestamps, as well as the set of verified blocks per perigee round, are kept in the perigee manager; the values of individual perigee participants are held within the router. These are stored in raw hash-maps/sets, not in any kind of cached store. If a malicious peer were to flood the node with erroneous BlockInvRelayMessages, these would be saved and then cleared after the next perigee round evaluates. I guess the security guarantee here rests on the fact that we expect erroneous BlockInvRelayMessages to quickly result in a protocol error, and expulsion, for that peer, with clean-up happening after the next round. In case we decide this is not enough, to strengthen security against such attacks we could also implement a max cache size for the timestamp hash-maps, or perform perigee clean-up right after a relevant protocol error is encountered. It is worth noting that erroneous timestamps cannot be falsely evaluated, as all evaluation is done only on blocks that have been consensus-verified within the round.
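A minimal sketch of that per-round bookkeeping, under the assumption of plain maps/sets cleared at round end (type and field names are illustrative, not the actual kaspad code):

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical block-hash type for this sketch.
type Hash = [u8; 32];

/// Per-round perigee bookkeeping: raw maps/sets, no cached stores.
#[derive(Default)]
struct PerigeeRound {
    /// Node-wide first-seen timestamp per block hash.
    first_seen: HashMap<Hash, u64>,
    /// Blocks consensus-verified within this round; only these are
    /// eligible for delay evaluation.
    verified: HashSet<Hash>,
}

impl PerigeeRound {
    /// Evaluation only considers verified blocks, so flooded or
    /// erroneous entries can never be falsely scored...
    fn eligible_delays(&self) -> Vec<(Hash, u64)> {
        self.first_seen
            .iter()
            .filter(|(h, _)| self.verified.contains(*h))
            .map(|(h, &ts)| (*h, ts))
            .collect()
    }

    /// ...and everything is dropped once the round evaluates, bounding
    /// the memory impact of a flood to a single round.
    fn clear(&mut self) {
        self.first_seen.clear();
        self.verified.clear();
    }
}
```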
Some thoughts on actual adoption: in essence, perigee routing only benefits those nodes that actually require low-latency block arrivals, which in the case of Kaspa I believe are predominantly miners. Although the network as a whole would benefit if everyone adopted perigee, it may be more advantageous to advertise this routing only to miners, to make it available only when `--enable-mainnet-mining` is set, or to cheat it in as the default with that flag. Alternatively, I believe running only a few perigee nodes in the wild could act as a poor man's fast relay network, serving as a relatively direct and implicit relay between miners and thereby benefiting the network as a whole. Also, having a lot of perigee nodes might put more pressure on the inbound limit of well-connected nodes, so it might be worthwhile revisiting inbound eviction logic / policy. #431 first.

Some tests were added, and best effort was given to remove all the bugs I could find while testing.

This comment is probably non-comprehensive.