Improve BFT resilience of BlockBuffer stores #8196
this should allow removing some type overhead in the BlockBuffer types
Dependency Review✅ No vulnerabilities or license issues or OpenSSF Scorecard issues found.Scanned FilesNone |
remove some ai-generated tests and adjust implementation and docs
it already exists in the compliance engine as well, but including in mempool provides stronger localized guarantee of behaviour
…8170-ln-perm-block-buffer
📝 Walkthrough
Replace the legacy pending-block cache with a generic, forest-backed PendingBlocks that enforces view-range and returns errors; make proposals generic; surface buffer and prune errors in compliance engines; move proposal validation earlier and add nuanced handling for BeyondActiveRange and buffer errors.
Sequence Diagram(s)
sequenceDiagram
participant Node
participant Core as ComplianceCore
participant Validator
participant Pending as PendingBlocks
participant Hot as HotStuff
participant VA as VoteAggregator
participant VN as ViolationNotifier
Node->>Core: OnBlockProposal(signedProposal)
Core->>Validator: ValidateProposal(signedProposal)
alt validation fails
Validator-->>Core: error
Core->>VN: notify violation
Core->>VA: notify invalid proposal
Core-->>Node: drop / ack (no buffer)
else validation succeeds
Validator-->>Core: valid
Core->>Pending: Add(proposal)
alt BeyondActiveRangeError
Pending-->>Core: BeyondActiveRangeError
Core->>Node: log & ignore (no buffer)
else buffer error
Pending-->>Core: error
Core-->>Node: return processing error
else added
Pending-->>Core: ok
Core->>Hot: forward proposal
Hot-->>VA: update aggregator
end
end
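The flow in the diagram can be approximated with a small, self-contained Go sketch. Every name here (`proposal`, `pendingBuffer`, `onBlockProposal`, the `outcome` constants) is hypothetical, and a plain sentinel error stands in for the `BeyondActiveRangeError` type the PR adds; the inclusive upper bound is also an assumption. It only illustrates the ordering: validate first, buffer second, forward last.

```go
package main

import (
	"errors"
	"fmt"
)

// errBeyondActiveRange is a stand-in sentinel (assumption); the PR uses a
// dedicated BeyondActiveRangeError type instead.
var errBeyondActiveRange = errors.New("view beyond active range")

// proposal is a hypothetical, stripped-down block proposal.
type proposal struct {
	view  uint64
	valid bool // stands in for the result of ValidateProposal
}

// outcome names the branches of the sequence diagram.
type outcome string

const (
	droppedInvalid outcome = "dropped-invalid"
	ignoredBeyond  outcome = "ignored-beyond-range"
	forwarded      outcome = "forwarded"
)

// pendingBuffer is a toy buffer accepting views up to finalized+rangeSize.
type pendingBuffer struct {
	finalized, rangeSize uint64
	stored               []proposal
}

func (b *pendingBuffer) add(p proposal) error {
	if p.view > b.finalized+b.rangeSize {
		return fmt.Errorf("view %d: %w", p.view, errBeyondActiveRange)
	}
	b.stored = append(b.stored, p)
	return nil
}

// onBlockProposal validates first, then buffers, then forwards.
func onBlockProposal(b *pendingBuffer, p proposal) (outcome, error) {
	if !p.valid {
		// Invalid proposals never reach the buffer.
		return droppedInvalid, nil
	}
	if err := b.add(p); err != nil {
		if errors.Is(err, errBeyondActiveRange) {
			// Expected for far-ahead views: ignore without failing.
			return ignoredBeyond, nil
		}
		// Any other buffer error is surfaced to the caller.
		return "", fmt.Errorf("could not buffer proposal: %w", err)
	}
	return forwarded, nil
}

func main() {
	buf := &pendingBuffer{finalized: 10, rangeSize: 5}
	for _, p := range []proposal{{12, false}, {99, true}, {12, true}} {
		out, err := onBlockProposal(buf, p)
		fmt.Println(p.view, out, err)
	}
}
```

Validating before buffering means an invalid proposal never occupies buffer space, which is the point of moving validation earlier.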
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes
🚥 Pre-merge checks | ✅ 2 | ❌ 1
❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
Actionable comments posted: 3
🧹 Nitpick comments (4)
module/mempool/errors.go (1)
67-71: Consider naming consistency with existing constructors.
The existing constructors in this file use the `f` suffix (`NewUnknownExecutionResultErrorf`, `NewBelowPrunedThresholdErrorf`) to indicate they accept format arguments. For consistency, consider renaming to `NewBeyondActiveRangeErrorf`.
🔎 Proposed fix
-func NewBeyondActiveRangeError(msg string, args ...interface{}) error {
+func NewBeyondActiveRangeErrorf(msg string, args ...interface{}) error {
 	return BeyondActiveRangeError{
 		err: fmt.Errorf(msg, args...),
 	}
 }
module/buffer/pending_blocks.go (1)
55-60: Incomplete TODO requires follow-up.
The documentation explains how `ActiveViewRangeSize` limits tree depth but notes that tree width (conflicting blocks from byzantine proposers) is not bounded. The TODO is left incomplete. Consider tracking this in an issue if it's outside the scope of this PR.
Would you like me to open an issue to track the implementation of width-limiting for the pending blocks buffer?
module/buffer/pending_blocks_test.go (2)
231-273: Consider making the pruning test more deterministic.
The test uses `rand.Intn` to select blocks and the prune view, which could theoretically lead to non-deterministic test behavior. While it's unlikely to cause flakiness since any random selection should exercise the pruning logic correctly, consider using a fixed seed or deterministic selection for reproducibility.
Additionally, lines 247-252 may skip loop iterations due to the `i%2 == 1` condition combined with the early `continue` at line 243, resulting in fewer than N blocks being added. This is fine for the test's purpose but worth noting.
358-381: Concurrent test only covers Add operations.
The test verifies concurrent `Add` operations are safe, which is valuable. However, consider also testing concurrent reads (`ByID`, `ByParentID`) mixed with writes to fully exercise the concurrency guarantees documented in the interface ("Safe for concurrent use").
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (23)
- cmd/consensus/main.go
- consensus/integration/nodes_test.go
- engine/collection/compliance/core.go
- engine/collection/compliance/core_test.go
- engine/collection/compliance/engine.go
- engine/collection/epochmgr/factories/compliance.go
- engine/consensus/compliance/core.go
- engine/consensus/compliance/core_test.go
- engine/consensus/compliance/engine.go
- model/cluster/block.go
- model/flow/block.go
- module/buffer.go
- module/buffer/backend.go
- module/buffer/backend_test.go
- module/buffer/pending_blocks.go
- module/buffer/pending_blocks_test.go
- module/buffer/pending_cluster_blocks.go
- module/mempool/errors.go
- module/metrics/example/verification/main.go
- module/mock/buffered_proposal.go
- module/mock/generic_pending_block_buffer.go
- module/mock/pending_block_buffer.go
- module/mock/pending_cluster_block_buffer.go
💤 Files with no reviewable changes (3)
- module/buffer/backend.go
- module/buffer/pending_cluster_blocks.go
- module/buffer/backend_test.go
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (37)
- GitHub Check: Lint (./insecure/)
- GitHub Check: Lint (./integration/)
- GitHub Check: Lint (./)
- GitHub Check: Unit Tests Insecure (insecure/integration/functional/test/gossipsub/rpc_inspector)
- GitHub Check: Unit Tests Insecure (insecure/integration/functional/test/gossipsub/scoring)
- GitHub Check: Unit Tests Insecure (insecure)
- GitHub Check: Unit Tests (network/p2p/scoring)
- GitHub Check: Unit Tests (others)
- GitHub Check: Unit Tests (cmd)
- GitHub Check: Unit Tests (engine)
- GitHub Check: Unit Tests (network/p2p/connection)
- GitHub Check: Unit Tests (network)
- GitHub Check: Unit Tests (network/test/cohort2)
- GitHub Check: Unit Tests (engine/verification)
- GitHub Check: Unit Tests (network/p2p)
- GitHub Check: Unit Tests (network/p2p/node)
- GitHub Check: Unit Tests (engine/execution/ingestion)
- GitHub Check: Unit Tests (engine/common)
- GitHub Check: Unit Tests (engine/consensus)
- GitHub Check: Unit Tests (network/test/cohort1)
- GitHub Check: Unit Tests (engine/collection)
- GitHub Check: Unit Tests (module)
- GitHub Check: Unit Tests (network/alsp)
- GitHub Check: Unit Tests (engine/execution)
- GitHub Check: Unit Tests (engine/access)
- GitHub Check: Unit Tests (fvm)
- GitHub Check: Unit Tests (ledger)
- GitHub Check: Unit Tests (module/dkg)
- GitHub Check: Unit Tests (engine/execution/computation)
- GitHub Check: Unit Tests (storage)
- GitHub Check: Unit Tests (consensus)
- GitHub Check: Unit Tests (admin)
- GitHub Check: Unit Tests (utils)
- GitHub Check: Unit Tests (state)
- GitHub Check: Integration Tests Others (integration)
- GitHub Check: Docker Build
- GitHub Check: analyze-code (go)
🔇 Additional comments (36)
module/mempool/errors.go (1)
59-85: LGTM!
The `BeyondActiveRangeError` implementation follows the established pattern in this file and correctly implements error chaining with `Unwrap()`. The predicate function uses `errors.As` appropriately for error type detection.
module/mock/buffered_proposal.go (1)
1-47: LGTM!
Auto-generated mock file following standard mockery conventions.
model/flow/block.go (2)
166-178: LGTM!
The introduction of `GenericProposal[T HashablePayload]` with `Proposal` as a type alias is a clean approach for consolidating proposal types while maintaining backward compatibility. The struct correctly contains the block and proposer signature data.
230-234: LGTM!
The `ProposalHeader` method correctly operates on the generic type, enabling it to work with any payload type that satisfies `HashablePayload`.
model/cluster/block.go (2)
82-85: LGTM!
The cluster `Proposal` type alias to `flow.GenericProposal[Payload]` correctly leverages the generic infrastructure while maintaining cluster-specific payload handling.
99-135: LGTM!
The constructors correctly validate cluster-specific blocks via `NewBlock`/`NewRootBlock` and enforce appropriate signature constraints (non-empty for proposals, empty for root proposals).
module/buffer/pending_blocks.go (4)
13-41: LGTM!
The `proposalVertex` adapter correctly implements the `forest.Vertex` interface, caching the block ID for efficient lookups and mapping view to level.
92-114: LGTM!
The `Add` method correctly validates the block view against the active range and returns an appropriate `BeyondActiveRangeError`. The check correctly handles the `activeViewRangeSize == 0` case as unlimited.
116-146: LGTM!
The `ByID` and `ByParentID` methods correctly retrieve proposals from the forest with proper locking. The type assertion from `forest.Vertex` to `proposalVertex[T]` is safe given the controlled insertion via `Add`.
148-170: LGTM!
The `PruneByView` method correctly adjusts for the exclusive nature of `PruneUpToLevel`. The `highestPrunedView` helper correctly inverts the offset applied during construction.
engine/consensus/compliance/engine.go (1)
183-186: LGTM: Error propagation properly added.
The change from fire-and-forget to error-aware finalized block processing is a critical reliability improvement. Errors that were previously silently ignored will now be surfaced and handled appropriately.
consensus/integration/nodes_test.go (1)
520-521: LGTM: Parameterized cache initialization is correct.
The pending blocks cache is now properly initialized with the finalized view and skip threshold, aligning with the new generic buffer API that enforces view-range bounds.
engine/collection/epochmgr/factories/compliance.go (1)
69-73: LGTM: Proper initialization with error handling.
The cache is correctly initialized with the finalized view and skip threshold from configuration. The error handling for the finalized header retrieval ensures failures are properly propagated.
engine/collection/compliance/engine.go (1)
169-171: LGTM: Error propagation properly added.
Consistent with the consensus compliance engine, this change ensures that finalized block processing errors are properly surfaced rather than silently ignored.
engine/collection/compliance/core_test.go (2)
127-127: LGTM: Mock expectations updated correctly.
The mock setup properly reflects the new API:
- `Add` no longer requires a boolean return value configuration (now returns error)
- `PruneByView` correctly returns nil error
Also applies to: 147-147
545-545: LGTM: Real buffer instantiation is correct.
The test correctly instantiates the real buffer with the finalized view from the test head and an appropriate active range size for testing.
cmd/consensus/main.go (1)
808-808: LGTM: Production buffer initialization is correct.
The pending blocks cache is properly initialized with:
- The last finalized header's view to establish the baseline
- The skip threshold from compliance configuration to bound the active view range
This ensures the buffer enforces view-range constraints during block proposal processing.
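To make the view-range constraint concrete, here is a minimal sketch of the admission decision. `classifyView` and the `admission` constants are hypothetical names. The exact boundary (whether `finalizedView + activeViewRangeSize` itself is accepted) and the treatment of views at or below the finalized view as a no-op are assumptions based on the review notes, not on the PR's code. A range size of zero is treated as "no limit", as the review describes.

```go
package main

import "fmt"

// admission is the (hypothetical) result of checking a block's view.
type admission int

const (
	admit                   admission = iota // store the block
	skipAtOrBelowFinalized                   // no-op: view is already finalized
	rejectBeyondActiveRange                  // caller would see BeyondActiveRangeError
)

// classifyView decides what to do with a block at blockView, given the latest
// finalized view and the configured active range size (0 means unlimited).
func classifyView(finalizedView, activeViewRangeSize, blockView uint64) admission {
	if blockView <= finalizedView {
		return skipAtOrBelowFinalized
	}
	// Subtracting is safe here (blockView > finalizedView) and avoids
	// overflow that finalizedView+activeViewRangeSize could cause.
	if activeViewRangeSize > 0 && blockView-finalizedView > activeViewRangeSize {
		return rejectBeyondActiveRange
	}
	return admit
}

func main() {
	const finalizedView, rangeSize = 100, 10
	for _, v := range []uint64{100, 101, 110, 111} {
		fmt.Println(v, classifyView(finalizedView, rangeSize, v))
	}
	// With a range size of zero there is no upper bound.
	fmt.Println(classifyView(finalizedView, 0, 1_000_000))
}
```

Bounding the accepted views this way caps how deep the pending tree can grow ahead of finalization.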
engine/consensus/compliance/core_test.go (2)
203-203: LGTM: Mock expectations updated correctly.
The mock setup properly reflects the new API, consistent with the collection compliance tests:
- `Add` no longer requires a boolean return value configuration
- `PruneByView` correctly returns nil error
Also applies to: 223-223
587-587: LGTM: Real buffer instantiation is correct.
The test correctly instantiates the real buffer with the test head's view and an appropriate active range size for testing.
engine/collection/compliance/core.go (4)
96-99: LGTM!
Good addition of error handling for `ProcessFinalizedBlock` during initialization. This ensures that if pruning fails, the error is properly propagated rather than silently ignored.
199-207: LGTM!
Appropriate handling of `BeyondActiveRangeError`: logging and returning nil is the correct behavior since blocks beyond the active range should be silently dropped without causing failures. The comment explaining the threshold flexibility is helpful.
221-229: LGTM!
Consistent error handling pattern with the previous `Add` call site. The `BeyondActiveRangeError` is correctly treated as a non-fatal condition.
382-394: LGTM!
The signature change from void to error-returning is appropriate. Propagating errors from `PruneByView` ensures that any unexpected errors during pruning are surfaced to callers rather than silently ignored.
engine/consensus/compliance/core.go (3)
103-106: LGTM!
Consistent error handling for `ProcessFinalizedBlock` during initialization, matching the collection compliance engine.
248-265: Improved logic for parent request handling.
The restructured logic now:
- Adds the block to pending buffer first
- Only requests the parent if it's not already in the pending buffer
This reduces duplicate parent requests since if the parent is pending, we've already requested it or will process it soon. The order change is correct.
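The ordering described above can be shown with a toy model. The `pending` map and `bufferThenRequestParent` are hypothetical stand-ins for the pending buffer and the engine logic; the sketch only demonstrates "buffer first, then request the parent only if it is not already buffered".

```go
package main

import "fmt"

// pending maps a buffered block ID to its parent ID (a hypothetical stand-in
// for the pending block buffer).
type pending map[string]string

// bufferThenRequestParent buffers the block first and then reports whether
// the parent still needs to be requested from the network.
func bufferThenRequestParent(p pending, blockID, parentID string) (requestParent bool) {
	p[blockID] = parentID
	_, parentBuffered := p[parentID]
	return !parentBuffered
}

func main() {
	p := pending{}
	fmt.Println(bufferThenRequestParent(p, "C", "B")) // true: parent B is unknown
	fmt.Println(bufferThenRequestParent(p, "D", "C")) // false: parent C is already buffered
}
```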
436-448: LGTM!
Consistent with the collection compliance engine changes - error propagation from `PruneByView` is now properly handled.
module/buffer.go (2)
8-36: Well-designed generic interface.
The generic `GenericPendingBlockBuffer[T flow.HashablePayload]` interface is cleanly designed with:
- Clear documentation for error conditions
- Proper use of type constraints
- Consistent method signatures
The error documentation for `Add` and `PruneByView` is helpful for callers.
38-42: LGTM!
The type aliases `PendingBlockBuffer` and `PendingClusterBlockBuffer` maintain backward compatibility while leveraging the new generic implementation.
module/buffer/pending_blocks_test.go (5)
1-13: LGTM!
Good test setup with appropriate imports. The test file follows standard Go testing patterns using testify/suite.
42-133: Good coverage of Add behavior and edge cases.
Tests comprehensively cover:
- Basic add and retrieval
- Duplicate handling
- Below finalized view (no-op)
- Active view range boundary enforcement
- Zero active range (no limit)
This aligns well with the interface contract.
156-229: Thorough ByID and ByParentID testing.
Good coverage of:
- Existing vs non-existent block lookup
- Multiple children per parent
- Direct children only (not grandchildren)
275-337: Good multi-step pruning tests.
The tests verify:
- Error handling for pruning below finalized view
- Sequential pruning behavior
- Parent-child relationship integrity after partial pruning
383-421: LGTM!
Good edge case coverage for empty buffer operations and add-after-prune behavior.
module/mock/generic_pending_block_buffer.go (1)
1-139: Auto-generated mock - no review needed.
This file is generated by mockery and correctly implements the `GenericPendingBlockBuffer[T]` interface. The generic type parameter support and standard mock patterns look correct.
module/mock/pending_block_buffer.go (1)
1-139: Auto-generated mock correctly reflects interface changes.
This file is auto-generated by mockery and correctly implements the updated `PendingBlockBuffer` interface with:
- Generic proposal types (`flow.GenericProposal[flow.Payload]`)
- Error-returning `Add` and `PruneByView` methods
- Removal of `DropForParent` (as noted in PR objectives)
The mock implementation properly handles error return values and type assertions for the new generic types.
module/mock/pending_cluster_block_buffer.go (1)
1-141: Auto-generated mock correctly reflects interface changes.
This file is auto-generated by mockery and correctly implements the updated `PendingClusterBlockBuffer` interface with:
- Generic proposal types (`flow.GenericProposal[cluster.Payload]`)
- Error-returning `Add` and `PruneByView` methods
- Removal of `DropForParent`
The implementation is consistent with the parallel mock in `pending_block_buffer.go`, using the appropriate `cluster.Payload` type parameter.
 // creates consensus cache for follower engine, and registers size method of backend for metrics
-pendingBlocks := buffer.NewPendingBlocks()
+pendingBlocks := buffer.NewPendingBlocks(0, 100_000)
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# First, check if the file exists and examine lines around 111 and 176
cat -n module/metrics/example/verification/main.go | sed -n '105,120p'
echo "---"
cat -n module/metrics/example/verification/main.go | sed -n '170,185p'
Repository: onflow/flow-go
Length of output: 1348
🏁 Script executed:
#!/bin/bash
# Search for NewPendingBlocks usage in the codebase to understand typical patterns
echo "=== NewPendingBlocks usage patterns ==="
rg 'NewPendingBlocks' --type=go -B 2 -A 2
echo ""
echo "=== Error handling for pendingBlocks.Add() ==="
rg '\.Add\(flow\.Slashable' --type=go -B 2 -A 2 | head -50
echo ""
echo "=== Check what buffer.PendingBlocks signature is ==="
fd 'pending.*\.go' | xargs grep -l 'NewPendingBlocks'
Repository: onflow/flow-go
Length of output: 5072
🏁 Script executed:
#!/bin/bash
# Examine the PendingBlocks implementation and Add() method
echo "=== PendingBlocks.Add() method signature and implementation ==="
rg -A 20 'func \(pb \*PendingBlocks\) Add' module/buffer/pending_blocks.go
echo ""
echo "=== Check tests to see how Add() errors are handled ==="
rg -B 5 -A 5 '\.Add\(' module/buffer/pending_blocks_test.go | head -60
echo ""
echo "=== Check all production usages of Add() ==="
rg '\.Add\(flow\.Slashable' --type=go -B 3 -A 3 --exclude-dir=test
Repository: onflow/flow-go
Length of output: 2519
🏁 Script executed:
#!/bin/bash
# Get the Add() method signature more carefully
echo "=== PendingBlocks.Add() method signature ==="
grep -n "func.*Add.*flow.Slashable" module/buffer/pending_blocks.go
echo ""
sed -n '100,200p' module/buffer/pending_blocks.go | head -60
echo ""
echo "=== All production usages of pendingBlocks.Add (not in test files) ==="
rg 'pendingBlocks\.Add|proposals\.Add' --type=go | grep -v test | grep -v example
echo ""
echo "=== Check how compliance engine uses pending blocks ==="
rg -A 10 'pending\.Add' engine/consensus/compliance/ --type=go | head -40
Repository: onflow/flow-go
Length of output: 3419
Handle error from Add() call.
Line 176: The error from pendingBlocks.Add() is currently ignored with _. While this is demo code, the Add() method can return BeyondActiveRangeError when a block exceeds the active view range. Even in probabilistic test scenarios, consider at least logging errors to surface potential issues.
Also applies to: 176-180
🤖 Prompt for AI Agents
module/metrics/example/verification/main.go around lines 111 and 176-180: the
return value from pendingBlocks.Add(...) is ignored — capture the error, check
if err != nil and log it with context (for example using log.Printf or similar
project logger) and optionally handle BeyondActiveRangeError specially if it is
expected (e.g., increment a counter or skip) otherwise surface or return the
error; ensure you do not discard the error silently.
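A sketch of what explicit handling could look like follows. The `BeyondActiveRangeError` type, its constructor and the `IsBeyondActiveRangeError` predicate mirror the shapes quoted elsewhere in this review (constructor wrapping `fmt.Errorf`, `Unwrap()`, detection via `errors.As`), but this is a standalone re-creation rather than the project's file. `handleAddError` and `errDisk` are hypothetical helpers introduced only for illustration.

```go
package main

import (
	"errors"
	"fmt"
)

// BeyondActiveRangeError re-creates the error shape described in the review.
type BeyondActiveRangeError struct{ err error }

func NewBeyondActiveRangeError(msg string, args ...interface{}) error {
	return BeyondActiveRangeError{err: fmt.Errorf(msg, args...)}
}

func (e BeyondActiveRangeError) Error() string { return e.err.Error() }
func (e BeyondActiveRangeError) Unwrap() error { return e.err }

// IsBeyondActiveRangeError reports whether err, or any error it wraps,
// is a BeyondActiveRangeError.
func IsBeyondActiveRangeError(err error) bool {
	var target BeyondActiveRangeError
	return errors.As(err, &target)
}

// errDisk is a hypothetical unexpected failure used for the demo.
var errDisk = errors.New("disk failure")

// handleAddError (hypothetical) classifies the result of an Add call instead
// of discarding it: expected range errors are skipped, others are surfaced.
func handleAddError(err error) (skipped bool, fatal error) {
	switch {
	case err == nil:
		return false, nil
	case IsBeyondActiveRangeError(err):
		return true, nil
	default:
		return false, fmt.Errorf("unexpected error adding block: %w", err)
	}
}

func main() {
	// The predicate still matches after additional wrapping with %w.
	wrapped := fmt.Errorf("adding block: %w", NewBeyondActiveRangeError("view %d exceeds range", 42))
	fmt.Println(handleAddError(wrapped))
	fmt.Println(handleAddError(errDisk))
}
```

Because the predicate uses `errors.As`, callers can wrap the error with extra context without breaking detection.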
Actionable comments posted: 1
Fix all issues with AI Agents 🤖
In @module/buffer/pending_blocks.go:
- Around line 43-67: The TODO in the GenericPendingBlocks documentation is
unfinished: clarify how the buffer limits unbounded memory growth from width
attacks or remove the placeholder; update the comment for GenericPendingBlocks
to state whether mitigation beyond activeViewRangeSize exists (e.g., per-parent
child caps, global entry limits, eviction policy, validation filtering, or
relying solely on LevelledForest pruning) and describe the implemented strategy
(referencing activeViewRangeSize and forest/LevelledForest) so readers know how
conflicting-block explosion is handled or that no extra width-limiting mechanism
is present.
🧹 Nitpick comments (2)
module/buffer/pending_blocks.go (2)
117-127: Consider safe type assertion.
Line 126 uses an unchecked type assertion `vertex.(proposalVertex[T])` that could panic if the vertex type is unexpected. While this should never occur in correct usage, defensive coding would use the comma-ok idiom to handle unexpected types gracefully.
🔎 Proposed defensive type assertion
 func (b *GenericPendingBlocks[T]) ByID(blockID flow.Identifier) (flow.Slashable[*flow.GenericProposal[T]], bool) {
 	b.lock.Lock()
 	defer b.lock.Unlock()
 	vertex, ok := b.forest.GetVertex(blockID)
 	if !ok {
 		return flow.Slashable[*flow.GenericProposal[T]]{}, false
 	}
-	return vertex.(proposalVertex[T]).proposal, true
+	pv, ok := vertex.(proposalVertex[T])
+	if !ok {
+		// This should never happen; indicates a serious bug
+		return flow.Slashable[*flow.GenericProposal[T]]{}, false
+	}
+	return pv.proposal, true
 }
129-147: Consider safe type assertion in iterator loop.
Line 143 uses an unchecked type assertion `vertex.(proposalVertex[T])` within the iterator loop. Similar to `ByID`, this could panic if an unexpected vertex type is encountered. Consider using the comma-ok idiom to handle this defensively, or at minimum, add a comment explaining why this assertion is safe.
🔎 Proposed defensive type assertion
 	children := make([]flow.Slashable[*flow.GenericProposal[T]], 0, n)
 	iterator := b.forest.GetChildren(parentID)
 	for iterator.HasNext() {
 		vertex := iterator.NextVertex()
-		children = append(children, vertex.(proposalVertex[T]).proposal)
+		pv, ok := vertex.(proposalVertex[T])
+		if !ok {
+			// This should never happen; indicates a serious bug
+			continue
+		}
+		children = append(children, pv.proposal)
 	}
 	return children, true
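For readers who want to run the idiom in isolation, here is a minimal standalone sketch of the comma-ok assertion. The `vertex` interface, `proposalVertex`, `foreignVertex` and `proposalOf` are simplified, non-generic stand-ins invented for this example and do not match the project's types.

```go
package main

import "fmt"

// vertex is a simplified stand-in for the forest's vertex interface.
type vertex interface{ VertexID() string }

// proposalVertex is the only type the buffer is expected to store.
type proposalVertex struct {
	id       string
	proposal string
}

func (v proposalVertex) VertexID() string { return v.id }

// foreignVertex simulates an unexpected type appearing in the forest.
type foreignVertex struct{}

func (foreignVertex) VertexID() string { return "foreign" }

// proposalOf uses the comma-ok form, so an unexpected type yields
// ("", false) instead of a runtime panic.
func proposalOf(v vertex) (string, bool) {
	pv, ok := v.(proposalVertex)
	if !ok {
		return "", false
	}
	return pv.proposal, true
}

func main() {
	fmt.Println(proposalOf(proposalVertex{id: "a", proposal: "block-a"}))
	fmt.Println(proposalOf(foreignVertex{}))
}
```

Whether a silent `false` or a loud failure is preferable for a "should never happen" case is a design decision for the PR authors; the sketch only shows the mechanics.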
📒 Files selected for processing (1)
module/buffer/pending_blocks.go
🧰 Additional context used
📓 Path-based instructions (2)
**/*.go
📄 CodeRabbit inference engine (.cursor/rules/coding_conventions.mdc)
Follow Go coding conventions as documented in @docs/agents/CodingConventions.md
Follow Go coding standards and conventions as documented in @docs/agents/GoDocs.md
**/*.go: Follow the existing module structure in `/module/`, `/engine/`, `/model/` and use dependency injection patterns for component composition
Implement proper interfaces before concrete types
Follow Go naming conventions and the project's coding style defined in /docs/CodingConventions.md
Use mock generators: run `make generate-mocks` after interface changes
All inputs must be considered potentially byzantine; error classification is context-dependent and no code path is safe unless explicitly proven and documented
Use comprehensive error wrapping for debugging; avoid `fmt.Errorf`, use `irrecoverable` package for exceptions
NEVER log and continue on best effort basis; ALWAYS explicitly handle errors
Uses golangci-lint with custom configurations (.golangci.yml) and custom linters for Flow-specific conventions (struct write checking)
Files:
module/buffer/pending_blocks.go
{module,engine,cmd}/**/*.go
📄 CodeRabbit inference engine (AGENTS.md)
All major processing components must implement the `Component` interface from `/module/component/component.go` to ensure consistent lifecycle management and graceful shutdown patterns
Files:
module/buffer/pending_blocks.go
🔇 Additional comments (7)
module/buffer/pending_blocks.go (7)
13-26: LGTM: Clean vertex adapter with cached ID.
The `proposalVertex` wrapper efficiently adapts generic proposals to the forest model, caching the block ID to avoid repeated computation.
69-73: LGTM: Type aliases and interface checks.
The type aliases provide a clean API, and the compile-time interface compliance checks ensure type safety.
75-91: LGTM: Constructors now consistent.
Both constructors correctly initialize the forest with `finalizedView + 1`, and the comments clearly explain that `LevelledForest.LowestLevel` is inclusive.
149-157: LGTM: Correct pruning with clear offset documentation.
The method correctly adjusts the view parameter (+1) when calling `PruneUpToLevel`, and the comment clearly explains the inclusive/exclusive boundary semantics.
159-164: LGTM: Size reporting.
The method correctly reports the buffer size with appropriate thread safety.
166-171: LGTM: Helper with clear locking requirement.
The method correctly calculates the finalized view, and the CAUTION comment appropriately warns callers about the locking requirement.
93-115: The duplicate handling in `LevelledForest.AddVertex` correctly aligns with the documented behavior in `Add()`. The forest's `AddVertex` method already checks for duplicate vertices at lines 194-195 (`if !f.isEmptyContainer(container) { return }`) and only increments the size counter for new vertices. The code is working as intended, and no changes are required.
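The `+1` offsets mentioned in the comments on the constructors and on `PruneByView` are easy to get wrong, so here is a toy model of the arithmetic. `levelStore` is a deliberately simplified stand-in for a levelled forest (one entry per level, no parent links), and all names, as well as the error returned when pruning below the current threshold, are assumptions for illustration only.

```go
package main

import (
	"errors"
	"fmt"
)

// errBelowPruned is a stand-in for a "below pruned threshold" error.
var errBelowPruned = errors.New("below pruned threshold")

// levelStore is a toy levelled store: lowestLevel is INCLUSIVE, i.e. the
// lowest level that may still hold entries.
type levelStore struct {
	lowestLevel uint64
	byLevel     map[uint64]string
}

// pruneUpToLevel removes every entry with level < limit (limit is EXCLUSIVE).
func (s *levelStore) pruneUpToLevel(limit uint64) error {
	if limit < s.lowestLevel {
		return errBelowPruned
	}
	for level := range s.byLevel {
		if level < limit {
			delete(s.byLevel, level)
		}
	}
	s.lowestLevel = limit
	return nil
}

type buffer struct{ store levelStore }

// newBuffer starts retaining at finalizedView+1: the finalized view itself
// is treated as already pruned.
func newBuffer(finalizedView uint64) *buffer {
	return &buffer{store: levelStore{lowestLevel: finalizedView + 1, byLevel: map[uint64]string{}}}
}

// pruneByView removes everything at or below view, hence the +1 when calling
// the exclusive pruneUpToLevel.
func (b *buffer) pruneByView(view uint64) error { return b.store.pruneUpToLevel(view + 1) }

// highestPrunedView inverts the offset applied in newBuffer and pruneByView.
func (b *buffer) highestPrunedView() uint64 { return b.store.lowestLevel - 1 }

func main() {
	b := newBuffer(10)
	for _, v := range []uint64{11, 12, 13} {
		b.store.byLevel[v] = fmt.Sprintf("block@%d", v)
	}
	fmt.Println(b.pruneByView(12), len(b.store.byLevel), b.highestPrunedView())
	fmt.Println(b.pruneByView(5))
}
```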
Actionable comments posted: 1
Fix all issues with AI Agents 🤖
In @engine/consensus/compliance/core_test.go:
- Line 203: The test mock for PendingBlockBuffer is out of date: Update the
cs.pending mock expectation to match the new PendingBlockBuffer.Add signature
(single argument `proposal`) and specify the returned error value; for example
replace the current cs.pending.On("Add", mock.Anything, mock.Anything) with a
single-arg expectation and chain a .Return(...) that returns an error or nil as
appropriate (e.g., .Return(nil)) so the test handles the Add error return
correctly and matches the interface.
♻️ Duplicate comments (1)
engine/consensus/compliance/core.go (1)
156-159: Incomplete documentation sentence.
The comment at lines 156-159 ends with an incomplete fragment: "This is particularly dangerous for" without completing the thought. This was previously flagged.
📒 Files selected for processing (2)
engine/consensus/compliance/core.go
engine/consensus/compliance/core_test.go
🧰 Additional context used
📓 Path-based instructions (5)
**/*.go
📄 CodeRabbit inference engine (.cursor/rules/coding_conventions.mdc)
Follow Go coding conventions as documented in @docs/agents/CodingConventions.md
Follow Go coding standards and conventions as documented in @docs/agents/GoDocs.md
**/*.go: Follow the existing module structure in `/module/`, `/engine/`, `/model/` and use dependency injection patterns for component composition
Implement proper interfaces before concrete types
Follow Go naming conventions and the project's coding style defined in /docs/CodingConventions.md
Use mock generators: run `make generate-mocks` after interface changes
All inputs must be considered potentially byzantine; error classification is context-dependent and no code path is safe unless explicitly proven and documented
Use comprehensive error wrapping for debugging; avoid `fmt.Errorf`, use `irrecoverable` package for exceptions
NEVER log and continue on best effort basis; ALWAYS explicitly handle errors
Uses golangci-lint with custom configurations (.golangci.yml) and custom linters for Flow-specific conventions (struct write checking)
Files:
engine/consensus/compliance/core.go
engine/consensus/compliance/core_test.go
{module,engine,cmd}/**/*.go
📄 CodeRabbit inference engine (AGENTS.md)
All major processing components must implement the `Component` interface from `/module/component/component.go` to ensure consistent lifecycle management and graceful shutdown patterns
Files:
engine/consensus/compliance/core.go
engine/consensus/compliance/core_test.go
{crypto,fvm,ledger,access,engine}/**/*.go
📄 CodeRabbit inference engine (AGENTS.md)
Cryptographic operations require careful handling; refer to crypto library documentation for proper implementation
Files:
engine/consensus/compliance/core.go
engine/consensus/compliance/core_test.go
{network,engine,consensus}/**/*.go
📄 CodeRabbit inference engine (AGENTS.md)
Network messages must be authenticated and validated
Files:
engine/consensus/compliance/core.go
engine/consensus/compliance/core_test.go
**/*_test.go
📄 CodeRabbit inference engine (AGENTS.md)
**/*_test.go: Unit tests should be co-located with the code they test
Follow the existing pattern of `*_test.go` files for test naming
Use fixtures for realistic test data as defined in `/utils/unittest/`
Files:
engine/consensus/compliance/core_test.go
🧠 Learnings (1)
📚 Learning: 2025-12-23T00:28:41.005Z
Learnt from: CR
Repo: onflow/flow-go PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-12-23T00:28:41.005Z
Learning: Applies to **/*.go : All inputs must be considered potentially byzantine; error classification is context-dependent and no code path is safe unless explicitly proven and documented
Applied to files:
engine/consensus/compliance/core.go
🧬 Code graph analysis (1)
engine/consensus/compliance/core_test.go (1)
module/buffer/pending_blocks.go (1)
NewPendingBlocks(78-85)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (37)
- GitHub Check: Lint (./)
- GitHub Check: Lint (./insecure/)
- GitHub Check: Lint (./integration/)
- GitHub Check: Unit Tests (engine/common)
- GitHub Check: Unit Tests (network/test/cohort2)
- GitHub Check: Unit Tests (network)
- GitHub Check: Unit Tests (network/test/cohort1)
- GitHub Check: Unit Tests (engine/verification)
- GitHub Check: Unit Tests (network/p2p)
- GitHub Check: Unit Tests (engine/access)
- GitHub Check: Unit Tests (others)
- GitHub Check: Unit Tests (utils)
- GitHub Check: Unit Tests (network/p2p/connection)
- GitHub Check: Unit Tests (network/alsp)
- GitHub Check: Unit Tests (admin)
- GitHub Check: Unit Tests (engine/consensus)
- GitHub Check: Unit Tests (network/p2p/node)
- GitHub Check: Unit Tests (engine/execution/computation)
- GitHub Check: Unit Tests (engine)
- GitHub Check: Unit Tests (module/dkg)
- GitHub Check: Unit Tests (storage)
- GitHub Check: Unit Tests (module)
- GitHub Check: Unit Tests (network/p2p/scoring)
- GitHub Check: Unit Tests (engine/execution/ingestion)
- GitHub Check: Unit Tests (fvm)
- GitHub Check: Unit Tests (engine/execution)
- GitHub Check: Unit Tests (consensus)
- GitHub Check: Unit Tests (cmd)
- GitHub Check: Unit Tests (state)
- GitHub Check: Unit Tests (engine/collection)
- GitHub Check: Unit Tests (ledger)
- GitHub Check: Unit Tests Insecure (insecure)
- GitHub Check: Unit Tests Insecure (insecure/integration/functional/test/gossipsub/scoring)
- GitHub Check: Unit Tests Insecure (insecure/integration/functional/test/gossipsub/rpc_inspector)
- GitHub Check: Integration Tests Others (integration)
- GitHub Check: Docker Build
- GitHub Check: analyze-code (go)
🔇 Additional comments (10)
engine/consensus/compliance/core_test.go (4)
223-223: LGTM! The `PruneByView` mock correctly returns `nil` to match the new error-returning signature.
586-586: LGTM! The real buffer instantiation correctly uses the new parameterized constructor with `finalizedView` and `activeViewRangeSize`, aligning with the updated `NewPendingBlocks` signature.
593-606: LGTM! The per-proposal validation expectations correctly reflect the behavioral change where proposal validation now occurs before caching. Each proposal has a properly scoped `.Once()` expectation set immediately before processing.
637-642: LGTM! The validation expectation for `missingProposal` is correctly added before processing, completing the validation setup for all proposals in the test scenario.
engine/consensus/compliance/core.go (6)
103-106: LGTM! Proper error handling for the newly error-returning `ProcessFinalizedBlock` during initialization. This ensures any pruning failures are surfaced early.
235-272: Good security improvement: validation before caching. Moving proposal validation before the caching step is a sound BFT resilience improvement. This prevents invalid proposals from consuming cache space and ensures only validated proposals enter the pending buffer.
The error handling is comprehensive:
- Invalid proposals trigger slashing notifications and VoteAggregator updates
- Unknown epoch errors are gracefully handled (assuming malicious far-future blocks)
- Unexpected errors are properly propagated
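The validate-then-cache ordering described above can be sketched in isolation. This is a minimal illustration, not the code in `core.go`: `Proposal`, `ErrInvalidProposal`, `validate`, and `Engine` are hypothetical names, and the `Valid` flag stands in for real signature and structure checks.

```go
package main

import (
	"errors"
	"fmt"
)

// Proposal is a hypothetical, minimal stand-in for a signed block proposal.
type Proposal struct {
	ID    string
	View  uint64
	Valid bool // stand-in for the outcome of real validity checks
}

// ErrInvalidProposal is a hypothetical sentinel for proposals that fail validation.
var ErrInvalidProposal = errors.New("invalid proposal")

// validate is a placeholder for the real proposal validator.
func validate(p Proposal) error {
	if !p.Valid {
		return fmt.Errorf("proposal %s at view %d: %w", p.ID, p.View, ErrInvalidProposal)
	}
	return nil
}

// Engine holds a trivial pending buffer and a record of reported violations.
type Engine struct {
	pending    map[string]Proposal
	violations []string
}

// NewEngine returns an Engine with an empty pending buffer.
func NewEngine() *Engine {
	return &Engine{pending: make(map[string]Proposal)}
}

// OnProposal validates first and buffers only proposals that passed validation.
func (e *Engine) OnProposal(p Proposal) error {
	if err := validate(p); err != nil {
		if errors.Is(err, ErrInvalidProposal) {
			// Expected byzantine input: record the violation, never cache the proposal.
			e.violations = append(e.violations, p.ID)
			return nil
		}
		// Anything else is unexpected and must be surfaced to the caller.
		return fmt.Errorf("unexpected validation failure: %w", err)
	}
	e.pending[p.ID] = p
	return nil
}
```

Feeding one valid and one invalid proposal to `OnProposal` leaves exactly one entry in `pending` and one entry in `violations`, so an invalid proposal never occupies buffer space.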
287-296: LGTM! The error handling for `pending.Add` correctly distinguishes between `BeyondActiveRangeError` (expected, logged at debug level) and unexpected errors (propagated with context). The defensive comment about potentially differing thresholds is helpful.
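The distinction between an expected range error and an unexpected buffer error can be illustrated as follows. This is a sketch under assumptions: `BeyondActiveRangeError` here is a locally defined stand-in for the mempool error of the same name, and `handleAddError` and `errStorage` are hypothetical helpers that do not appear in the PR.

```go
package main

import (
	"errors"
	"fmt"
)

// BeyondActiveRangeError is a local stand-in for the mempool error of the same name.
type BeyondActiveRangeError struct {
	View  uint64
	Limit uint64
}

func (e BeyondActiveRangeError) Error() string {
	return fmt.Sprintf("view %d is beyond active range limit %d", e.View, e.Limit)
}

// IsBeyondActiveRangeError reports whether err is or wraps a BeyondActiveRangeError.
func IsBeyondActiveRangeError(err error) bool {
	var target BeyondActiveRangeError
	return errors.As(err, &target)
}

// errStorage is a hypothetical unexpected failure used for illustration only.
var errStorage = errors.New("hypothetical storage failure")

// handleAddError classifies the error returned by a buffer's Add call.
// dropped is true when the proposal was intentionally ignored;
// fatal is non-nil when the error is unexpected and must be propagated.
func handleAddError(err error) (dropped bool, fatal error) {
	if err == nil {
		return false, nil
	}
	if IsBeyondActiveRangeError(err) {
		// Expected for far-future proposals: drop the proposal, do not fail.
		return true, nil
	}
	return false, fmt.Errorf("could not add proposal to pending buffer: %w", err)
}
```

A caller would log at debug level when `dropped` is true and return `fatal` unchanged otherwise, which keeps the expected case cheap and the unexpected case loud.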
424-426: LGTM! The `hotstuffProposal` is appropriately created upfront and used consistently for error wrapping in `NewInvalidProposalErrorf` and for `VoteAggregator.AddBlock`.
454-467: LGTM! The refactored `ProcessFinalizedBlock` correctly:
- Propagates errors from `PruneByView`
- Updates finalized counters only after successful pruning
- Always records mempool metrics
This aligns with the PR's goal of surfacing buffer errors in compliance engines.
299-303: LGTM! The parent request logic correctly checks if the parent is already cached before issuing a sync request, preventing redundant network requests.
  // set up pending module mock
  cs.pending = &module.PendingBlockBuffer{}
- cs.pending.On("Add", mock.Anything, mock.Anything).Return(true)
+ cs.pending.On("Add", mock.Anything, mock.Anything)
Missing mock return value for Add method.
The Add method now returns an error according to the new PendingBlockBuffer interface, but the mock expectation does not specify a return value. This could cause unexpected test behavior or panics if the code checks the error.
🔎 Proposed fix
- cs.pending.On("Add", mock.Anything, mock.Anything)
+ cs.pending.On("Add", mock.Anything).Return(nil)
Note: The method signature also appears to have changed to accept a single argument (proposal) rather than two arguments. Verify against the interface definition.
🤖 Prompt for AI Agents
In @engine/consensus/compliance/core_test.go around line 203, The test mock for
PendingBlockBuffer is out of date: Update the cs.pending mock expectation to
match the new PendingBlockBuffer.Add signature (single argument `proposal`) and
specify the returned error value; for example replace the current
cs.pending.On("Add", mock.Anything, mock.Anything) with a single-arg expectation
and chain a .Return(...) that returns an error or nil as appropriate (e.g.,
.Return(nil)) so the test handles the Add error return correctly and matches the
interface.
…8170-ln-perm-block-buffer
Actionable comments posted: 0
🧹 Nitpick comments (1)
module/buffer/pending_blocks_test.go (1)
38-48: Consider adding a safety check to the view selection loop. The loop at lines 40-45 increments the view until an unused one is found. While unlikely in practice, this could theoretically loop indefinitely if a large number of views are already taken. Consider adding a maximum iteration count as a safeguard.
🔎 Suggested enhancement
  func (suite *PendingBlocksSuite) blockWithParent(parent *flow.Header, usedViews map[uint64]struct{}) flow.Slashable[*flow.Proposal] {
  	block := unittest.BlockWithParentFixture(parent)
+ 	maxAttempts := 1000
  	for {
+ 		maxAttempts--
+ 		if maxAttempts == 0 {
+ 			suite.FailNow("unable to find unused view after 1000 attempts")
+ 		}
  		if _, hasForbiddenView := usedViews[block.View]; !hasForbiddenView {
  			break
  		}
  		block.View++
  	}
  	usedViews[block.View] = struct{}{} // add the block's view to `usedViews` to prevent future re-usage
  	return unittest.AsSlashable(unittest.ProposalFromBlock(block))
  }
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
module/buffer/pending_blocks.go
module/buffer/pending_blocks_test.go
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-12-23T00:28:41.005Z
Learnt from: CR
Repo: onflow/flow-go PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-12-23T00:28:41.005Z
Learning: Applies to **/*_test.go : Use fixtures for realistic test data as defined in `/utils/unittest/`
Applied to files:
module/buffer/pending_blocks_test.go
🧬 Code graph analysis (2)
module/buffer/pending_blocks.go (3)
model/flow/block.go (2)
HashablePayload(14-16)
GenericProposal(168-173)
module/forest/leveled_forest.go (2)
LevelledForest(24-29)
NewLevelledForest(51-57)
module/mempool/errors.go (1)
NewBeyondActiveRangeError(67-71)
module/buffer/pending_blocks_test.go (3)
module/buffer/pending_blocks.go (2)
GenericPendingBlocks(66-70)
NewPendingBlocks(78-85)
utils/unittest/fixtures.go (4)
AsSlashable(309-315)
ProposalFromBlock(282-287)
BlockHeaderFixture(580-600)
IdentifierFixture(1151-1155)
module/mempool/errors.go (2)
IsBeyondActiveRangeError(82-85)
IsBelowPrunedThresholdError(55-58)
🔇 Additional comments (12)
module/buffer/pending_blocks_test.go (5)
24-28: Excellent test setup using the appropriate fixtures. The test suite properly initializes with finalized view 0 and no range limit, allowing individual tests to configure their own constraints. Good use of `unittest` fixtures as per coding guidelines.
100-120: Well-designed boundary test for active view range enforcement. This test correctly verifies that blocks exceeding `finalizedView + activeViewRangeSize` are rejected with a `BeyondActiveRangeError`, and that the buffer remains empty. The use of `mempool.IsBeyondActiveRangeError` properly validates the error type.
241-284: Robust pruning test with randomized block structure. The test builds a realistic tree structure with 10% unrelated blocks and 90% children of existing blocks, then verifies correct pruning behavior. The randomized approach provides good coverage of different tree topologies.
369-392: Good concurrent access testing. The test verifies thread-safety by having 10 goroutines concurrently add blocks. The comment correctly notes that correctness depends on `block()` not returning duplicates (which it doesn't, thanks to `unittest.BlockFixture()` generating unique fixtures).
343-343: The `max` built-in function is fully supported. The project requires Go 1.25.0 (as specified in go.mod), which is well above Go 1.21, the version that introduced `max`. No action needed.
module/buffer/pending_blocks.go (7)
13-41: Clean implementation of forest.Vertex interface. The `proposalVertex[T]` wrapper correctly implements the `forest.Vertex` interface by providing `VertexID()`, `Level()`, and `Parent()` methods. The immutable struct directive and caching of the block ID in the constructor are good practices for performance.
43-70: Excellent documentation of memory exhaustion mitigation strategy. The documentation clearly explains how the buffer prevents unbounded memory growth:
- `activeViewRangeSize` limits depth (view range)
- One proposal per view limits width (conflicting blocks at same view)
- Reliance on sync engine for liveness if wrong proposal is cached
This addresses the security concerns around byzantine proposers mounting memory exhaustion attacks.
78-94: Constructors correctly handle LevelledForest's inclusive boundary. Both constructors now consistently use `finalizedView + 1` for the forest's `lowestLevel`, with comments explaining that `LowestLevel` is inclusive. This matches the inverse calculation in `highestPrunedView()` at line 176 (`LowestLevel - 1`). The symmetry confirms correct handling of the boundary semantics.
96-122: Add method correctly enforces view range and prevents width attacks. The implementation properly:
- Locks for thread safety
- Checks `activeViewRangeSize` boundary (when > 0) and returns appropriate error
- Enforces one-proposal-per-view by checking `GetNumberOfVerticesAtLevel(blockView) == 0`
- Silently ignores duplicates and below-finalized blocks (no-op behavior as documented)
The boundary check at line 109 correctly accepts blocks at `finalizedView + activeViewRangeSize` (inclusive) and rejects beyond.
135-153: ByParentID efficiently retrieves children using forest iterator. The method pre-allocates the slice with the correct capacity (`GetNumberOfChildren`) and uses the iterator pattern to collect children. This is efficient and correctly returns `(nil, false)` when no children exist.
155-163: PruneByView correctly handles forest's exclusive boundary. The comment at line 161 correctly notes that `PruneUpToLevel` is exclusive, so `view + 1` is passed. This ensures blocks at the specified view are pruned (inclusive pruning from caller's perspective). The error from the forest (likely `BelowPrunedThresholdError`) is properly propagated.
172-177: Helper method correctly calculates highest pruned view. The calculation `forest.LowestLevel - 1` is correct since `LowestLevel` represents the lowest unpruned view. The caution comment properly warns that callers must hold the lock, which all callers do.
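The inclusive and exclusive boundary arithmetic discussed for the constructors, `PruneByView`, and `highestPrunedView` can be checked with a small model that tracks only the lowest retained level. The `levels` type and its methods are hypothetical stand-ins for the forest. That pruning below the current threshold returns an error is an assumption taken from the review comment.

```go
package main

import "fmt"

// levels models only the boundary bookkeeping of a levelled forest:
// lowest is the lowest level still retained (inclusive).
type levels struct {
	lowest uint64
}

// newLevels starts just above the finalized view, because lowest is inclusive.
func newLevels(finalizedView uint64) *levels {
	return &levels{lowest: finalizedView + 1}
}

// pruneUpToLevel drops everything strictly below level (exclusive bound).
// Assumption: pruning below the current threshold is reported as an error.
func (l *levels) pruneUpToLevel(level uint64) error {
	if level < l.lowest {
		return fmt.Errorf("cannot prune up to level %d: lowest retained level is already %d", level, l.lowest)
	}
	l.lowest = level
	return nil
}

// pruneByView prunes the given view inclusively by passing view+1 to the exclusive bound.
func (l *levels) pruneByView(view uint64) error {
	return l.pruneUpToLevel(view + 1)
}

// highestPrunedView is the inverse of the constructor's finalizedView+1.
func (l *levels) highestPrunedView() uint64 {
	return l.lowest - 1
}
```

Starting from `newLevels(10)`, `highestPrunedView()` is 10. After `pruneByView(12)` it is 12, and a later `pruneByView(5)` fails, so the `+1` and `-1` adjustments cancel out as the review describes.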
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In @module/buffer/pending_blocks_test.go:
- Line 394: The comment has a typo referencing the wrong test suite name; update
the comment string "// NOTE: correctness here depends on
[PendingBlockSuite.block] not returning duplicates." to reference the correct
symbol "[PendingBlocksSuite.block]" so it reads that correctness depends on
[PendingBlocksSuite.block] not returning duplicates; leave the rest of the
comment unchanged.
🧹 Nitpick comments (1)
module/buffer/pending_blocks_test.go (1)
393-416: Consider expanding concurrent access test coverage. While the current test verifies thread-safe concurrent writes, consider testing additional concurrent scenarios such as:
- Concurrent reads (ByID, ByParentID) while adding blocks
- Concurrent Add and PruneByView operations
- Concurrent reads during pruning
These scenarios would provide more confidence in the buffer's thread-safety under realistic usage patterns.
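One way to exercise the mixed scenarios listed above is sketched here against a simplified buffer. `safeBuffer` and `exercise` are hypothetical and unrelated to the suite's fixtures. The sketch relies on `Add` and `PruneByView` checking the pruning threshold under the same lock, which makes the final size independent of goroutine interleaving.

```go
package main

import "sync"

// safeBuffer is a simplified, lock-protected stand-in for the pending-blocks buffer.
type safeBuffer struct {
	mu     sync.RWMutex
	lowest uint64            // lowest retained view (inclusive)
	byView map[uint64]string // view -> block ID
}

func newSafeBuffer() *safeBuffer {
	return &safeBuffer{byView: make(map[uint64]string)}
}

// Add stores id at view unless the view is pruned or already taken.
func (b *safeBuffer) Add(view uint64, id string) {
	b.mu.Lock()
	defer b.mu.Unlock()
	if view < b.lowest {
		return
	}
	if _, taken := b.byView[view]; taken {
		return
	}
	b.byView[view] = id
}

// ByView returns the block ID stored at view, if any.
func (b *safeBuffer) ByView(view uint64) (string, bool) {
	b.mu.RLock()
	defer b.mu.RUnlock()
	id, ok := b.byView[view]
	return id, ok
}

// PruneByView removes all entries at or below view.
func (b *safeBuffer) PruneByView(view uint64) {
	b.mu.Lock()
	defer b.mu.Unlock()
	if view+1 <= b.lowest {
		return
	}
	for v := range b.byView {
		if v <= view {
			delete(b.byView, v)
		}
	}
	b.lowest = view + 1
}

// Size returns the number of stored entries.
func (b *safeBuffer) Size() int {
	b.mu.RLock()
	defer b.mu.RUnlock()
	return len(b.byView)
}

// exercise runs one writer and one reader per view in 1..writers plus a
// concurrent pruner, then returns the final size of the buffer.
func exercise(writers int, pruneView uint64) int {
	b := newSafeBuffer()
	var wg sync.WaitGroup
	for w := 1; w <= writers; w++ {
		wg.Add(2)
		go func(view uint64) { defer wg.Done(); b.Add(view, "block") }(uint64(w))
		go func(view uint64) { defer wg.Done(); b.ByView(view) }(uint64(w))
	}
	wg.Add(1)
	go func() { defer wg.Done(); b.PruneByView(pruneView) }()
	wg.Wait()
	return b.Size()
}
```

Whether a given `Add` runs before or after the prune, only views above `pruneView` survive, so `exercise(10, 4)` leaves six entries. Running such a test with the race detector enabled (`go test -race`) would additionally surface unsynchronized access.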
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
module/buffer/pending_blocks_test.go
🔇 Additional comments (6)
module/buffer/pending_blocks_test.go (6)
30-48: Well-designed helper methods.The
30-48: Well-designed helper methods. The `block()` and `blockWithParent()` helpers are well-structured. The `blockWithParent()` method correctly ensures unique views by incrementing until an unused view is found, which is essential since the buffer stores one proposal per view.
51-238: Comprehensive coverage of basic buffer operations. The tests for Add, ByID, ByParentID, and related operations provide thorough coverage of the buffer's core functionality, including edge cases like duplicate blocks and same-view conflicts.
84-186: Thorough testing of active view range enforcement. The test coverage for view range limits is excellent, including edge cases (exactly at limit, no limit, below finalized view) and proper error type assertions.
266-372: Robust pruning test coverage. The pruning tests provide excellent coverage including random tree structures, multiple sequential prunes, and proper error handling for invalid prune operations.
419-457: Good coverage of edge cases. The tests for empty buffer operations and adding after pruning provide valuable coverage of edge cases and ensure the buffer behaves correctly in these scenarios.
367-367: The project uses Go 1.25.0, which fully supports the built-in `max()` function. No compatibility issues.
  }

  // TestConcurrentAccess tests that the buffer is safe for concurrent access.
  // NOTE: correctness here depends on [PendingBlockSuite.block] not returning duplicates.
Fix typo in comment.
The comment references [PendingBlockSuite.block] but should be [PendingBlocksSuite.block] (note the extra 's' in Blocks).
📝 Proposed fix
-// NOTE: correctness here depends on [PendingBlockSuite.block] not returning duplicates.
+// NOTE: correctness here depends on [PendingBlocksSuite.block] not returning duplicates.
Actionable comments posted: 0
🧹 Nitpick comments (2)
engine/consensus/compliance/core.go (2)
247-265: Consider extracting duplicate invalid proposal handling. The error handling for `InvalidProposalError` at lines 247-265 and 356-374 is nearly identical (notifying `proposalViolationNotifier` and `voteAggregator`). Consider extracting this into a helper method to improve maintainability.
♻️ Example refactoring
+// handleInvalidProposal handles an invalid proposal by notifying consumers and the vote aggregator.
+// Returns nil if the proposal was successfully handled, or an error if notifying the vote aggregator failed.
+func (c *Core) handleInvalidProposal(proposal flow.Slashable[model.InvalidProposalError], hotstuffProposal *model.SignedProposal, log zerolog.Logger) error {
+	log.Err(&proposal.Message).Msg("received invalid block from other node (potential slashing evidence?)")
+
+	// notify consumers about invalid block
+	c.proposalViolationNotifier.OnInvalidBlockDetected(proposal)
+
+	// notify VoteAggregator about the invalid block
+	err := c.voteAggregator.InvalidBlock(hotstuffProposal)
+	if err != nil {
+		if mempool.IsBelowPrunedThresholdError(err) {
+			log.Warn().Msg("received invalid block, but is below pruned threshold")
+			return nil
+		}
+		return fmt.Errorf("unexpected error notifying vote aggregator about invalid block: %w", err)
+	}
+	return nil
+}
Then use it at both call sites.
Also applies to: 356-374
466-469: Consider adding context to the pruning error. The error from `pending.PruneByView` is returned without additional context. For debugging, consider wrapping it with information about the finalized view being pruned.
♻️ Suggested improvement
  err := c.pending.PruneByView(finalized.View)
  if err != nil {
- 	return err
+ 	return fmt.Errorf("could not prune pending blocks by view %d: %w", finalized.View, err)
  }
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
engine/consensus/compliance/core.go
🔇 Additional comments (3)
engine/consensus/compliance/core.go (3)
103-106: LGTM: Proper error handling in initialization.The initialization now correctly handles potential errors from
103-106: LGTM: Proper error handling in initialization. The initialization now correctly handles potential errors from `ProcessFinalizedBlock`, preventing the Core from being constructed with inconsistent state. The error wrapping provides useful context for debugging.
297-304: Appropriate handling of active range constraint. The error handling for `mempool.IsBeyondActiveRangeError` properly implements the view-range enforcement mentioned in the PR objectives, providing an additional bound on memory usage beyond the existing `SkipNewProposalsThreshold`.
244-281: Early validation enhances BFT resilience. The early call to `validator.ValidateProposal` (before checking parent connectivity) aligns with the BFT resilience strategy. This allows the compliance engine to:
- Reject structurally invalid proposals before caching
- Collect slashing evidence for invalid blocks
- Reduce resource exhaustion from byzantine blocks
The tradeoff is that disconnected blocks are validated even though they might be dropped later, but this is appropriate given the goal of detecting protocol violations early. The error handling correctly uses the documented API: `model.AsInvalidProposalError()` for type-asserting invalid proposal errors and `errors.Is(err, model.ErrViewForUnknownEpoch)` for checking the epoch sentinel error.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In @engine/consensus/compliance/core.go:
- Around line 229-233: The view deduplication check in the compliance handler
uses len(blocksByView) > 1 which misses the case when one proposal already
exists; update the condition in the block processing (the code that calls
c.pending.ByView(block.View) where blocksByView is checked) to skip when any
proposal exists (use len(blocksByView) > 0 or >= 1) so that the proposal is not
reprocessed for a view that already has a stored vertex.
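To make the difference between the two conditions concrete, here is a minimal Go sketch that compares them for 0, 1, and 2 cached proposals. The function names `skipBuggy` and `skipFixed` are hypothetical and do not appear in the PR; only the comparison on `len(blocksByView)` is taken from the comment above.

```go
package main

import "fmt"

// skipBuggy mirrors the flagged condition: it only skips once MORE than one
// proposal is cached for the view, so a second proposal still gets through.
// (Hypothetical helper, for illustration only.)
func skipBuggy(cachedForView int) bool { return cachedForView > 1 }

// skipFixed mirrors the suggested condition: it skips as soon as ANY proposal
// is already cached for the view. (Hypothetical helper, for illustration only.)
func skipFixed(cachedForView int) bool { return cachedForView > 0 }

func main() {
	for cached := 0; cached <= 2; cached++ {
		fmt.Printf("cached=%d buggy=%t fixed=%t\n", cached, skipBuggy(cached), skipFixed(cached))
	}
}
```

The two conditions disagree only when exactly one proposal is already cached, which is the case the comment is concerned with.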
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
- engine/consensus/compliance/core.go
- module/buffer.go
- module/buffer/pending_blocks.go
🧰 Additional context used
📓 Path-based instructions (4)
**/*.go
📄 CodeRabbit inference engine (.cursor/rules/coding_conventions.mdc)
Follow Go coding conventions as documented in @docs/agents/CodingConventions.md
Follow Go coding standards and conventions as documented in @docs/agents/GoDocs.md
**/*.go: Follow the existing module structure in `/module/`, `/engine/`, `/model/` and use dependency injection patterns for component composition
Implement proper interfaces before concrete types
Follow Go naming conventions and the project's coding style defined in /docs/CodingConventions.md
Use mock generators: run `make generate-mocks` after interface changes
All inputs must be considered potentially byzantine; error classification is context-dependent and no code path is safe unless explicitly proven and documented
Use comprehensive error wrapping for debugging; avoid `fmt.Errorf`, use `irrecoverable` package for exceptions
NEVER log and continue on best effort basis; ALWAYS explicitly handle errors
Uses golangci-lint with custom configurations (.golangci.yml) and custom linters for Flow-specific conventions (struct write checking)
Files:
- engine/consensus/compliance/core.go
- module/buffer.go
- module/buffer/pending_blocks.go
{module,engine,cmd}/**/*.go
📄 CodeRabbit inference engine (AGENTS.md)
All major processing components must implement the `Component` interface from `/module/component/component.go` to ensure consistent lifecycle management and graceful shutdown patterns
Files:
- engine/consensus/compliance/core.go
- module/buffer.go
- module/buffer/pending_blocks.go
{crypto,fvm,ledger,access,engine}/**/*.go
📄 CodeRabbit inference engine (AGENTS.md)
Cryptographic operations require careful handling; refer to crypto library documentation for proper implementation
Files:
engine/consensus/compliance/core.go
{network,engine,consensus}/**/*.go
📄 CodeRabbit inference engine (AGENTS.md)
Network messages must be authenticated and validated
Files:
engine/consensus/compliance/core.go
🧠 Learnings (2)
📓 Common learnings
Learnt from: CR
Repo: onflow/flow-go PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-12-23T00:28:41.005Z
Learning: Include a high level overview of the problem and changes in the PR, being concise and avoiding unnecessary detail or boilerplate
📚 Learning: 2025-12-23T00:28:41.005Z
Learnt from: CR
Repo: onflow/flow-go PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-12-23T00:28:41.005Z
Learning: Applies to **/*.go : All inputs must be considered potentially byzantine; error classification is context-dependent and no code path is safe unless explicitly proven and documented
Applied to files:
engine/consensus/compliance/core.go
🧬 Code graph analysis (3)
engine/consensus/compliance/core.go (5)
- engine/enqueue.go (1): Message(12-15)
- consensus/hotstuff/model/errors.go (3): AsInvalidProposalError(200-207), InvalidProposalError(167-170), ErrViewForUnknownEpoch(17-17)
- module/mempool/errors.go (2): IsBelowPrunedThresholdError(55-58), IsBeyondActiveRangeError(82-85)
- module/metrics/labels.go (1): ResourceProposal(52-52)
- engine/collection/compliance/core.go (1): Core(34-53)
module/buffer.go (4)
- model/flow/block.go (2): HashablePayload(14-16), GenericProposal(168-173)
- utils/unittest/fixtures/identifier.go (1): Identifier(8-8)
- module/mock/generic_pending_block_buffer.go (1): GenericPendingBlockBuffer(11-13)
- utils/unittest/fixtures/payload.go (1): Payload(8-8)
module/buffer/pending_blocks.go (4)
- model/flow/block.go (2): HashablePayload(14-16), GenericProposal(168-173)
- module/forest/leveled_forest.go (2): LevelledForest(24-29), NewLevelledForest(51-57)
- module/buffer.go (1): PendingBlockBuffer(43-43)
- module/mempool/errors.go (1): NewBeyondActiveRangeError(67-71)
🔇 Additional comments (13)
module/buffer.go (1)
8-40: Well-designed generic interface. The GenericPendingBlockBuffer interface is cleanly designed with proper generic type constraints, comprehensive documentation, and appropriate error returns. The interface methods support both querying (ByID, ByView, ByParentID) and maintenance operations (Add, PruneByView) with clear semantics.
engine/consensus/compliance/core.go (5)
103-106: Proper error propagation in initialization. The error handling for ProcessFinalizedBlock during initialization is correct and provides clear context in the error message.
117-168: Excellent documentation on BFT resilience. This comprehensive comment block clearly explains the attack vectors and mitigation strategies. It provides valuable context about:
- Protocol-level attacks vs. resource exhaustion attacks
- The compliance engine's responsibilities
- Specific mitigation mechanisms (leader validation, view thresholds, per-view deduplication)
245-283: Excellent early validation approach. Moving proposal validation before caching is a strong design choice that prevents invalid blocks from being stored and protects against memory exhaustion attacks. The comprehensive error handling includes:
- Proper slashing evidence collection for invalid proposals
- Notification to vote aggregator for tracking
- Graceful handling of unknown epoch views
This aligns well with the BFT resilience goals outlined in the comment block above.
292-316: Correct handling of disconnected blocks. The logic properly handles two cases:
- Parent already cached (2.1): Block is cached and waits for eventual resolution
- Parent not cached (2.2): Block is cached and parent is requested
The error handling for `BeyondActiveRangeError` is appropriate, and the sync request is only sent when the parent is not already in the cache.
465-477: Proper error handling in finalization processing. The updated signature correctly returns errors from pruning operations, and the finalized view/height counters are updated only after successful pruning, which maintains consistency.
module/buffer/pending_blocks.go (7)
13-41: Clean forest.Vertex implementation. The proposalVertex correctly implements the forest.Vertex interface with proper ID caching and straightforward accessors for level (view) and parent. The immutability directive ensures thread-safety.
43-70: Comprehensive documentation and design. The documentation thoroughly explains both depth and width attack mitigations:
- Depth: Limited by activeViewRangeSize
- Width: Limited by storing one proposal per view
The struct design is clean with proper concurrency control via mutex and efficient forest-backed storage.
78-94: Consistent constructor implementations. Both constructors correctly initialize the LevelledForest with `finalizedView + 1` (since lowestLevel is inclusive) and include explanatory comments. The consistency between the two constructors is good.
96-122: Correct implementation of view range and width limiting. The Add method properly implements both safety mechanisms:
- Lines 109-114: Enforces depth limiting via activeViewRangeSize with appropriate error
- Lines 118-120: Enforces width limiting by only adding if no vertex exists at that view
The concurrency control via mutex is correct, and the no-op behavior for duplicate views aligns with the documented contract.
124-146: Correct retrieval methods. Both ByID and ByView are correctly implemented with proper locking and type conversions. ByView returns an empty slice (not nil) when no blocks are found, which is good defensive programming.
148-166: Efficient parent-to-children lookup. The ByParentID implementation efficiently preallocates the result slice and correctly returns (nil, false) when no children exist, matching the documented interface contract.
171-190: Correct pruning logic with proper offset handling. Both methods correctly handle the LevelledForest's semantics:
- PruneByView adds 1 because PruneUpToLevel is exclusive
- highestPrunedView subtracts 1 because LowestLevel is the lowest unpruned view
The arithmetic is correct and the comments clearly explain the rationale. The internal helper assumption (caller must hold lock) is appropriately documented.
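The off-by-one relationship between an inclusive "prune by view" and an exclusive "prune up to level" can be illustrated with a toy store. This is a sketch, not the `LevelledForest` implementation: `levelStore` and the free functions `pruneByView` and `highestPrunedView` are hypothetical, and the sketch assumes the lowest retained level is always at least 1 (as it would be when initialized with `finalizedView + 1`).

```go
package main

import "fmt"

// levelStore is a toy stand-in for a levelled forest (hypothetical type).
type levelStore struct {
	lowestLevel uint64              // lowest level still retained (inclusive)
	levels      map[uint64][]string // entries grouped by level
}

// PruneUpToLevel removes all levels strictly below `level`; the bound is
// exclusive, so `level` itself is retained.
func (s *levelStore) PruneUpToLevel(level uint64) error {
	if level < s.lowestLevel {
		return fmt.Errorf("cannot prune up to level %d: already pruned up to %d", level, s.lowestLevel)
	}
	for l := range s.levels {
		if l < level {
			delete(s.levels, l)
		}
	}
	s.lowestLevel = level
	return nil
}

// pruneByView removes everything at or below `view` (inclusive bound),
// hence the +1 when delegating to the exclusive PruneUpToLevel.
func pruneByView(s *levelStore, view uint64) error {
	return s.PruneUpToLevel(view + 1)
}

// highestPrunedView is the view just below the lowest retained level.
// Assumes lowestLevel >= 1, otherwise the subtraction would underflow.
func highestPrunedView(s *levelStore) uint64 {
	return s.lowestLevel - 1
}

func main() {
	s := &levelStore{
		lowestLevel: 11,
		levels:      map[uint64][]string{11: {"a"}, 12: {"b"}, 13: {"c"}},
	}
	if err := pruneByView(s, 12); err != nil {
		fmt.Println("prune failed:", err)
		return
	}
	fmt.Println(len(s.levels), highestPrunedView(s))
}
```

After pruning by view 12, only level 13 remains and the highest pruned view reads back as 12, so the two offsets cancel out as the comment describes.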
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
engine/consensus/compliance/core.go (1)
32-33: Fix comment: Method name is ProcessFinalizedBlock, not ProcessFinalizedView. The comment mentions `ProcessFinalizedView`, but the actual method is `ProcessFinalizedBlock` (line 465).
📝 Proposed fix
-// - The only exception is calls to `ProcessFinalizedView`, which is the only concurrency-safe
+// - The only exception is calls to `ProcessFinalizedBlock`, which is the only concurrency-safe
 // method of compliance.Core
🤖 Fix all issues with AI agents
In @engine/consensus/compliance/core.go:
- Around line 223-233: The current single-block-per-view cache (c.pending.ByView
/ blocksByView logic) can break liveness when a cached uncertified block must be
replaced by a later certified block; update the codebase and tests to ensure the
sync path handles replacement and that the view buffer is tuned: 1) audit and,
if needed, extend the sync engine code paths that fetch/replace blocks so that
when a certified block arrives for a view already present in c.pending it
triggers fetching and replaces the cached block (verify behavior in the sync
module and any methods that process certifications); 2) add unit/integration
tests that simulate caching B1 then certifying B2 for the same view and assert
B2 is fetched/applied and consensus proceeds; 3) expose or increase the view
buffering/window configuration used before pruning so the sync engine has time
to replace cached blocks, and document the trade-off in relevant comments (refer
to c.pending.ByView, blocksByView, and the sync engine replacement/fetch
routines).
🧹 Nitpick comments (1)
engine/consensus/compliance/core.go (1)
245-283: Good: Early validation prevents invalid blocks from consuming memory. Moving validation before caching is the correct approach to prevent memory exhaustion from invalid blocks. The error handling is comprehensive and covers all expected error types appropriately.
♻️ Minor optimization: Reuse hotstuffProposal variable
At line 266, `model.SignedProposalFromBlock(proposal.Message)` is called again, but `hotstuffProposal` was already created at line 253. Consider reusing it:
 // notify VoteAggregator about the invalid block
-err = c.voteAggregator.InvalidBlock(model.SignedProposalFromBlock(proposal.Message))
+err = c.voteAggregator.InvalidBlock(hotstuffProposal)
 if err != nil {
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
engine/consensus/compliance/core.go
🔇 Additional comments (6)
engine/consensus/compliance/core.go (6)
103-106: LGTM: Proper error handling for initialization. The error handling for `ProcessFinalizedBlock` is correct. Failing fast during initialization if the finalized block cannot be processed prevents the Core from starting in an inconsistent state.
117-168: Excellent BFT resilience documentation. This comprehensive documentation clearly explains the threat model, attack categories, and mitigation strategies. The distinction between protocol-level attacks (type I) and resource exhaustion attacks (type II) is well articulated, and the rationale for storing a single proposal per view is clearly explained.
311-314: LGTM: Parent request logic is correct. The logic correctly checks if the parent is already pending before requesting it, preventing duplicate sync requests. The parent height calculation is also correct.
434-434: LGTM: Extracting hotstuffProposal for reuse. Creating the `hotstuffProposal` variable once and reusing it at lines 440, 452, and 457 is more efficient than creating it multiple times.
464-477: LGTM: Proper error propagation from PruneByView. The function signature change to return errors is correct. Pruning failures should be escalated as they indicate potential corruption of the pending buffer state.
29-33: PendingBlockBuffer is properly protected for concurrent access. The concern about concurrent access between `OnBlockProposal` and `ProcessFinalizedBlock` is addressed by the `PendingBlockBuffer` implementation. Both the interface and concrete implementation are explicitly documented as "Safe for concurrent use" and use a mutex to synchronize all operations, including the pattern of pruning while adding and querying blocks.
    // 1. blocks already in the cache, that are disconnected: they will be processed later.
    // 2. blocks already in the cache, that were already processed: they will be eventually pruned by view.
    // 3. blocks already on disk: they were processed and await finalization

    // 1,2. To prevent memory exhaustion attacks we store single proposal per view, so we can ignore
    // all other proposals if we have already cached something.
    blocksByView := c.pending.ByView(block.View)
    if len(blocksByView) > 0 {
        log.Debug().Msg("skipping proposal since we have already processed one for given view")
        return nil
    }
Verify liveness guarantees with single-block-per-view caching.
The deduplication strategy rejects all subsequent proposals for a view once any proposal for that view is cached. While this effectively prevents memory exhaustion attacks by limiting the cache to one block per view, this creates a potential liveness concern:
- If we cache valid block B1 for view V, then block B2 for view V gets certified, we'll reject B2 and must rely on the sync engine to retrieve it.
The mitigation mentioned in the comments at lines 163-166 (syncing certified blocks) is reasonable, but verification of this is essential:
- Confirm the sync engine robustly handles the scenario where a cached but uncertified block must be replaced with a certified alternative
- Verify this trade-off (memory safety vs. protocol liveness) has been validated in tests
- Ensure the view buffering window provides sufficient margin for syncing to occur before the window closes
    if err := c.pending.Add(proposal); err != nil {
        if mempool.IsBeyondActiveRangeError(err) {
            // In general, we expect the block buffer to use SkipNewProposalsThreshold,
            // however since it is instantiated outside this component, we allow the thresholds to differ
            log.Debug().Err(err).Msg("dropping block beyond block buffer active range")
            return nil
        }
        return fmt.Errorf("could not add proposal to pending buffer: %w", err)
    }
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Find PendingBlockBuffer interface definition
echo "=== Searching for PendingBlockBuffer interface ==="
rg -n --type=go "type PendingBlockBuffer" -A 20
echo -e "\n=== Searching for mempool error types ==="
rg -n --type=go "(IsBeyondActiveRangeError|IsOutOfBoundsError|IsInvalidError)" -A 2 -B 2
echo -e "\n=== Checking PendingBlockBuffer.Add implementations ==="
rg -n --type=go "func.*\(.*\).*Add.*Slashable.*Proposal.*error" -A 15
echo -e "\n=== Searching for mempool package errors ==="
fd -e go "mempool" | head -20 | xargs grep -l "error\|Error" 2>/dev/null
Repository: onflow/flow-go
Length of output: 8667
🏁 Script executed:
#!/bin/bash
echo "=== Checking irrecoverable package usage in compliance components ==="
rg -n --type=go "irrecoverable" engine/consensus/compliance/ -B 2 -A 2
echo -e "\n=== Checking error handling patterns in core.go ==="
rg -n --type=go "return fmt.Errorf|return irrecoverable" engine/consensus/compliance/core.go -B 1 -A 1
echo -e "\n=== Checking how BeyondActiveRangeError should be handled ==="
rg -n --type=go "log.*BeyondActiveRange|return nil" module/buffer/pending_blocks_test.go -B 2 -A 2
echo -e "\n=== Check mempool errors package ==="
cat -n module/mempool/errors.go | head -100
Repository: onflow/flow-go
Length of output: 6612
Verify error handling aligns with coding guidelines.
All error types from PendingBlockBuffer.Add() are handled: only BeyondActiveRangeError is returned (when activeViewRangeSize > 0), and it is explicitly checked and caught. However, the current handling violates Flow-Go guidelines:
- Lines 300-305: Logging and returning `nil` violates the guideline "NEVER log and continue on best effort basis; ALWAYS explicitly handle errors"
- Line 306: Using `fmt.Errorf` violates the guideline "avoid `fmt.Errorf`, use `irrecoverable` package"
Per guidelines, either escalate the error through the irrecoverable package or use proper error wrapping. The comment documenting the decision is helpful but must align with the error handling standards.
This PR partially handles BFT processing of pending proposals. Read #8325 for extra details. Without actively working on the respective epic, we need to postpone merging this PR as it has a liveness issue.
This PR refactors the `PendingBlockBuffer` used by compliance engines to store disconnected blocks.
- `Proposal` types using generics (similar to Block)
- `PendingBlockBuffer` using generics, so we have one type-safe implementation
- `PendingBlockBuffer` to use `LevelledForest` instead of bespoke `backend`
- `DropForParentID` method, because `LevelledForest` has no parallel of this (it's a very minor memory optimization, and has the potential to drop unprocessed blocks in a concurrent environment)
- `ActiveViewRangeSize` to the `PendingBlockBuffer`, which puts an upper limit on what views we will accept when adding blocks.
  - (... `SkipNewProposalsThreshold`), however including it at the `PendingBlockBuffer` layer as well is inexpensive and provides a more isolated guarantee that the memory usage is bounded, so I was inclined to include it here as well.
See #8170 for detail.
NOTE: Some of these changes (primarily the test suite) were done with AI (Cursor CLI - Composer 1). I marked commits which were substantially authored by AI with an `ai:` prefix (just so you are aware -- I have reviewed them 😃).
Summary by CodeRabbit
Bug Fixes
New Features / Refactor
Tests
Chores