Skip to content

Add ingress_filter #5934

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed

Conversation

nyetwurk
Copy link

@nyetwurk nyetwurk commented Apr 23, 2025

Problem

High volume of gossip traffic can lead to unnecessary processing of irrelevant messages (e.g., votes on RPC nodes), consuming resources.

Summary of Changes

Introduces a configurable ingress filter (ClusterInfo::set_ingress_filter) applied early within process_packets. Initially enabled only for PushMessage content, this filter drops unwanted CrdsData before address verification to reduce processing load. Adds ingress_filtered_count metric.

Motivation details here: https://github.com/Blockdaemon/agave-snapshot-gossip-client/blob/main/docs/light-gossip-mode-proposal.md

@mergify mergify bot requested a review from a team April 23, 2025 08:00
@nyetwurk
Copy link
Author

Do not merge yet. Totally untested.

@alexpyattaev
Copy link

Do you have benchmark data to demonstrate this actually helps with real mainnet traffic?

@alexpyattaev alexpyattaev requested a review from gregcusack April 23, 2025 08:13
@nyetwurk
Copy link
Author

Not yet, all in testing currently. Results are not conclusive.

@nyetwurk
Copy link
Author

Opened the PR here to spur discussion on implementation, and if folks feel it is likely to be helpful at all, and most importantly, if it has any potential to get accepted.

@alexpyattaev
Copy link

I suggest you first check the % of messages that can get discarded by this filter to get an idea of how beneficial this might be in terms of % reduction on sigverify load. There is good chance we adopt such a filter if it is actually effective at reducing the load.

@nyetwurk
Copy link
Author

Yes, part of this effort will be to include other metrics in our test environment (not necessarily in this PR) to provide conclusive evidence of actual benefit. As I said, thus far this effort has had inconclusive results. Once we have more data, we will provide it. If we don't see benefit, we'd love to hear theories as to why, and what can be done to actually make it do something useful :)

@alexpyattaev
Copy link

There are quite a few things we can work on, are you on Solana discord? This thread will get very long very fast =)

Copy link

@gregcusack gregcusack left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tbh i don't think the ingress filter is the best tool for solana. it's basically just a leech. It consumes but doesn't forward. if this is adopted widely then gossip propagation paths will collapse.

If you are suggesting to add ingress filtering for actual bad gossip traffic (e.g. traffic with the wrong shred version or malformed traffic), then I am all for it. But i currently do not support filtering out EpochSlots or Votes for example just because your node doesn't need them. Having to consume and forward all gossip traffic, even the stuff you don't want is definitely a flaw of the current gossip protocol no doubt. But until we change it to a subscription method of sorts (if we go that way), we kind of need nodes to receive and forward traffic even if they don't need all of it.

to clarify, the reason you need to participate in gossip is because you need to advertise snapshot hashes right? So validators are able to find the snapshot hashes that you host?

@nyetwurk
Copy link
Author

Closing as #5977 is significantly smaller and accomplishes the proximate goal.

@nyetwurk nyetwurk closed this Apr 24, 2025
@nyetwurk
Copy link
Author

nyetwurk commented Apr 24, 2025

Findings

  1. Selectively filtering ingress did not change the memory foot print
  2. Additional ingress filtering did not decrease CPU usage - most of the cycles were wasted on verifying packets that were going to be dropped anyway (even without the ingress filter)
  3. With a deferred verify, the bottleneck became the OS network layer itself - discarding additional ingress frames (before or after verify) had no effect on overall load.

@nyetwurk
Copy link
Author

to clarify, the reason you need to participate in gossip is because you need to advertise snapshot hashes right? So validators are able to find the snapshot hashes that you host?

Correct, that is the purpose of https://github.com/Blockdaemon/agave-snapshot-gossip-client.

But experimentally, additional filtering had no effect on performance or load. In short, there is no benefit, even though in theory it is ingesting packets it doesn't need if it doesn't filter them.

@nyetwurk nyetwurk deleted the gossip-optimization-master branch April 25, 2025 00:18
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants