-
Notifications
You must be signed in to change notification settings - Fork 644
Don't broadcast old messages to the feed #3526
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
During node synchronization and catch-up, broadcasting all historical messages can flood connected feed clients with redundant data. This change introduces a check before broadcasting that only messages that are newer than the network's head minus the last two batches for backlog maintenance. The new behavior is to only broadcast messages within the last 2 batches when not synced. New sequencer messages are always broadcast regardless of sync state (this is the sequencer feed, so messages produced by the sequencer must be in it). This feature is enabled by default so operators don't need to make any config changes to start using it. It has hot-reloadable config flag that can be used to disable it: ``` --node.transaction-streamer.disable-broadcast-during-sync ``` Summary of new broadcasting behavior: - New sequencer messages: always broadcast - When synced: broadcast everything normally - When syncing: only broadcast last 2 batches - Fail-safe: any error condition defaults to broadcasting
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## master #3526 +/- ##
=======================================
Coverage 22.71% 22.71%
=======================================
Files 387 388 +1
Lines 58652 58683 +31
=======================================
+ Hits 13322 13332 +10
- Misses 43296 43312 +16
- Partials 2034 2039 +5 🚀 New features to boost your workflow:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice 🙂
) (*TransactionStreamer, error) { | ||
transactionStreamerConfigFetcher := func() *TransactionStreamerConfig { return &configFetcher.Get().TransactionStreamer } | ||
txStreamer, err := NewTransactionStreamer(ctx, arbDb, l2Config, exec, broadcastServer, fatalErrChan, transactionStreamerConfigFetcher, &configFetcher.Get().SnapSyncTest) | ||
txStreamer, err := NewTransactionStreamer(ctx, arbDb, l2Config, exec, broadcastServer, syncMonitor, fatalErrChan, transactionStreamerConfigFetcher, &configFetcher.Get().SnapSyncTest) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
SyncMonitor depends on TransactionStreamer today, and this creates a circular dependency between them 😬.
This indicates to me that they could be better placed in a single component.
How about moving SyncMonitor logic to TransactionStreamer and then dropping SyncMonitor?
I created this linear task in order to move logic from SyncMonitor to ConsensusExecutionSyncer, but it is OK to move it TransactionStreamer 🙂.
ConsensusExecutionSyncer already depends on TransactionStreamer, and can call it for what it needs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I made a PR to remove the circular dependency: #3538. If you like the approach then I'll rebase this PR ono that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am re-reading your comment now and I think I got a bit carried away when I saw NIT-3649 and the work I did on #3538 isn't really related to the circular dependency you're talking about here, rather it's related to breaking the consensus<->execution circular dependency.
I've spent some time thinking about how to break the circular dependency between TransactionStreamer and arbnode.SyncMonitor, and TransactionStreamer and BroadcastSyncChecker and I can't see a good way to do it without moving all the logic into TransactionStreamer, which I don't really like because SyncMonitor already has a fairly clear responsibility. I am thinking about alternatives now...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I addressed this in 65c6435
Remove TransactionStreamer<->SyncMonitor circ dep
TransactionStreamer now pushes updates to the SyncMonitor of the
currentMessageCount and feedPendingMessageCount every time it updates
either of these in its own state. This means SyncMonitor doesn't need to
have a reference to TransactionStreamer.
The logic for checking if the TransactionStreamer should broadcast to
the feed has been moved into TransactionStreamer, with the core decision
logic split out for testing.
arbnode/transaction_streamer.go
Outdated
s.delayedBridge = delayedBridge | ||
|
||
if s.syncMonitor != nil && inboxReader != nil && inboxReader.tracker != nil { | ||
s.broadcastChecker = NewBroadcastSyncChecker(s.syncMonitor, inboxReader.tracker) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is also introducing a circular dependency:
TransactionStreamer -> BroadcastSyncChecker -> SyncMonitor -> TransactionStreamer
This also indicates to me that all of them are better placed in a single component.
How about moving BroadcastSyncChecker logic to TransactionStreamer too?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Addressed above.
arbnode/transaction_streamer.go
Outdated
s.inboxReader = inboxReader | ||
s.delayedBridge = delayedBridge | ||
|
||
if s.syncMonitor != nil && inboxReader != nil && inboxReader.tracker != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s.inboxReader and s.inboxReader.tracker nil checks here are redundant, they will never be nil here, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed the separate BroadcastSyncChecker.
arbnode/broadcast_sync_checker.go
Outdated
} | ||
|
||
// ShouldBroadcast determines if messages should be broadcast based on sync state and message position | ||
func (b *BroadcastSyncChecker) ShouldBroadcast(firstMsgIdx arbutil.MessageIndex, msgCount int) bool { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about adding more information, through comments, of what motivated ShouldBroadcast's strategy, e.g., writing a comment here based on the PR description?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added more comments.
{ | ||
name: "batch info error - always broadcast", | ||
synced: false, | ||
msgCount: 5, | ||
batchInfoError: true, | ||
expected: true, | ||
description: "When batch info provider has errors, fail open and broadcast", | ||
}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ShouldBroadcast could be returning true due to batchCount being equal to zero too.
How about using the same parameters used by "not synced - last message at threshold - broadcast" case, but setting batchInfoError to true instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added a test for this.
} | ||
} | ||
|
||
func TestBroadcastSyncCheckerEdgeCases(t *testing.T) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nitpick: why not defining those threshold tests in TestBroadcastSyncChecker test cases?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Moved them into the main test
Sorry for the assignee spam, was looking at the wrong PR |
47b32a4
to
9baa2ad
Compare
TransactionStreamer now pushes updates to the SyncMonitor of the currentMessageCount and feedPendingMessageCount every time it updates either of these in its own state. This means SyncMonitor doesn't need to have a reference to TransactionStreamer. The logic for checking if the TransactionStreamer should broadcast to the feed has been moved into TransactionStreamer, with the core decision logic split out for testing.
❌ 24 Tests Failed:
View the top 3 failed tests by shortest run time
📣 Thoughts on this report? Let Codecov know! | Powered by Codecov |
Fixes NIT-3702
During node synchronization and catch-up, broadcasting all historical messages can flood connected feed clients with redundant data. This change introduces a check before broadcasting that only messages that are newer than the network's head minus the last two batches for backlog maintenance.
The new behavior is to only broadcast messages within the last 2 batches when not synced. New sequencer messages are always broadcast regardless of sync state (this is the sequencer feed, so messages produced by the sequencer must be in it).
This feature is enabled by default so operators don't need to make any config changes to start using it. It has hot-reloadable config flag that can be used to disable it:
Summary of new broadcasting behavior: