Skip to content

chore: merge main to develop post release v2.1.0 #1273

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 10 commits into from
Mar 24, 2025
Merged

chore: merge main to develop post release v2.1.0 #1273

merged 10 commits into from
Mar 24, 2025

Conversation

Yashk767
Copy link
Contributor

@Yashk767 Yashk767 commented Mar 20, 2025

User description

Description

Merging main to develop post releasing v2.1.0.


PR Type

Enhancement, Bug fix, Tests, CI/CD


Description

  • Enhanced block monitoring with RPC endpoint switching and stale block detection.

  • Fixed bugs in command dependency initialization and block fetching logic.

  • Added comprehensive tests for block monitoring and RPC manager functionality.

  • Updated CI workflows to use newer action versions and improved artifact handling.


Changes walkthrough 📝

Relevant files
Enhancement
28 files
block.go
Add RPC endpoint switching and stale block detection         
+20/-0   
addStake.go
Update command dependencies to include block monitor         
+1/-1     
claimBounty.go
Update command dependencies to include block monitor         
+1/-1     
claimCommission.go
Update command dependencies to include block monitor         
+1/-1     
cmd-utils.go
Modify InitializeCommandDependencies to return block monitor
+8/-8     
collectionList.go
Update command dependencies to include block monitor         
+1/-1     
contractAddresses.go
Update command dependencies to include block monitor         
+1/-1     
create.go
Simplify command dependencies for create command                 
+4/-2     
createCollection.go
Update command dependencies to include block monitor         
+1/-1     
createJob.go
Update command dependencies to include block monitor         
+1/-1     
delegate.go
Update command dependencies to include block monitor         
+1/-1     
import.go
Simplify command dependencies for import command                 
+4/-2     
initiateWithdraw.go
Update command dependencies to include block monitor         
+1/-1     
interface.go
Update interface to include block monitor in voting           
+2/-1     
jobList.go
Update command dependencies to include block monitor         
+1/-1     
modifyCollectionStatus.go
Update command dependencies to include block monitor         
+1/-1     
resetUnstakeLock.go
Update command dependencies to include block monitor         
+1/-1     
setConfig.go
Simplify command dependencies for setConfig command           
+2/-2     
setDelegation.go
Update command dependencies to include block monitor         
+1/-1     
stakerInfo.go
Update command dependencies to include block monitor         
+1/-1     
transfer.go
Update command dependencies to include block monitor         
+1/-1     
unlockWithdraw.go
Update command dependencies to include block monitor         
+1/-1     
unstake.go
Update command dependencies to include block monitor         
+1/-1     
updateCollection.go
Update command dependencies to include block monitor         
+1/-1     
updateCommission.go
Update command dependencies to include block monitor         
+1/-1     
updateJob.go
Update command dependencies to include block monitor         
+1/-1     
vote.go
Integrate block monitor into voting logic                               
+7/-6     
rpc.go
Improve RPC endpoint switching logic                                         
+7/-2     
Tests
5 files
block_test.go
Add tests for block monitoring logic                                         
+240/-0 
utils_cmd_interface.go
Update mocks to include block monitor in voting                   
+5/-4     
propose_test.go
Add test for proposed blocks count mismatch                           
+15/-0   
vote_test.go
Update tests for voting logic with block monitor                 
+4/-8     
rpc_test.go
Add tests for RPC manager functionality                                   
+320/-0 
Bug fix
1 files
propose.go
Add validation for proposed blocks count                                 
+6/-0     
Configuration changes
1 files
version.go
Update version to 2.1.0                                                                   
+1/-1     
Ci/cd
3 files
ci.yml
Update CI workflows to use newer action versions                 
+9/-9     
develop.yml
Update CI workflows to use newer action versions                 
+4/-4     
release.yml
Update CI workflows to use newer action versions                 
+4/-4     

Need help?
  • Type /help how to ... in the comments thread for any questions about Qodo Merge usage.
  • Check out the documentation for more information.
  • Yashk767 and others added 8 commits February 20, 2025 15:50
    * refactor: added tests for RPC module
    
    * Revert "refactor: added tests for RPC module"
    
    This reverts commit bac5b3b.
    
    * reafctor: added tests for rpc module using httpstest servers
    
    * ci: updated artifact version for binary downloads and uploads
    
    * ci: fixed ci by updateing action versions
    
    * reafctor: replaced ts.cancel() with  t.Cleanup(func() { ts1.Close() })
    …er (#1264)
    
    * fix: switch RPC if block number is less than last recorded block number
    
    * refactor: added tests for block module
    …ocess (#1269)
    
    * refactor: fetched block number from block monitor in voting
    
    * refactor: fixed tests
    
    * fix: added nil check for header from block monitor
    
    * refactor: replaced empty block monitor reference with nil in case of error
    …1271)
    
    * hotfix: removed getConfigData() call from setConfig command
    
    * chore: update version to v2.1.0
    
    * refactor: removed unwanted 'InitializeCommandDependencies()' from import and create command
    Copy link

    @astronaut-aldrin astronaut-aldrin bot left a comment

    Choose a reason for hiding this comment

    The reason will be displayed to describe this comment to others. Learn more.

    Error during code review: 'NoneType' object is not subscriptable

    Copy link

    @astronaut-aldrin astronaut-aldrin bot left a comment

    Choose a reason for hiding this comment

    The reason will be displayed to describe this comment to others. Learn more.

    Error during code review: 'NoneType' object is not subscriptable

    PR Reviewer Guide 🔍

    Here are some key observations to aid the review process:

    ⏱️ Estimated effort to review: 4 🔵🔵🔵🔵⚪
    🧪 PR contains tests
    🔒 No security concerns identified
    ⚡ Recommended focus areas for review

    Resource Leak

    The context cancellation function is not called in error paths when connecting to RPC endpoints, which could lead to resource leaks. While the code now properly calls cancel() in some error paths, it's still missing in the successful connection path before updating the best endpoint.

    client, err := ethclient.DialContext(ctx, nextEndpoint.URL)
    if err != nil {
    	cancel()
    	logrus.Errorf("Failed to connect to RPC endpoint %s: %v", nextEndpoint.URL, err)
    	continue
    }
    
    // Try fetching block number to validate the endpoint
    _, err = m.fetchBlockNumberWithTimeout(ctx, client)
    if err != nil {
    	cancel()
    	logrus.Errorf("Failed to fetch block number for endpoint %s: %v", nextEndpoint.URL, err)
    	continue
    }
    
    // Cancel the context after the operations complete.
    cancel()
    
    // Successfully connected and validated, update the best client and endpoint
    // (Make sure that the client is correctly associated with the endpoint.)
    nextEndpoint.Client = client
    m.BestEndpoint = nextEndpoint
    Array Bounds Check

    The new check for numOfProposedBlocks and sortedProposedBlocks length is good, but accessing sortedProposedBlocks[numOfProposedBlocks-1] could still panic if numOfProposedBlocks is 0, as the check allows numOfProposedBlocks to be 0.

    if numOfProposedBlocks <= 0 || len(sortedProposedBlocks) < int(numOfProposedBlocks) {
    	log.Errorf("Invalid numOfProposedBlocks (%d) or mismatch with sortedProposedBlocks length (%d)", numOfProposedBlocks, len(sortedProposedBlocks))
    	return errors.New("proposed blocks count mismatch")
    }
    
    lastBlockIndex := sortedProposedBlocks[numOfProposedBlocks-1]
    Concurrency Issue

    In the updateLatestBlock method, there's a potential race condition. If the RPC endpoint switch is successful, updateClient() is called while still holding the mutex lock, which could lead to deadlock if updateClient also tries to acquire the same mutex.

    } else if switched {
    	logrus.Info("Switched to the next best RPC endpoint.")
    	bm.updateClient()
    } else {

    Copy link

    qodo-merge-pro-for-open-source bot commented Mar 20, 2025

    PR Code Suggestions ✨

    Explore these optional code suggestions:

    CategorySuggestion                                                                                                                                    Impact
    Possible issue
    Fix premature context cancellation

    The cancel() function is called immediately after creating the context, which
    cancels it before the subsequent operations can use it. This will cause the
    context to be already canceled when attempting to establish the connection,
    likely resulting in immediate connection failures. Move the cancel() call after
    all operations using the context are complete.

    rpc/rpc.go [203-210]

     // Check if we can connect to the next endpoint
     ctx, cancel := context.WithTimeout(context.Background(), core.EndpointsContextTimeout*time.Second)
    -cancel()
     
     client, err := ethclient.DialContext(ctx, nextEndpoint.URL)
    +if err != nil {
    +    cancel()
    +    logrus.Errorf("Failed to connect to RPC endpoint %s: %v", nextEndpoint.URL, err)
    +    continue
    +}

    [To ensure code accuracy, apply this suggestion manually]

    Suggestion importance[1-10]: 10

    __

    Why: This suggestion addresses a critical issue where the context is canceled immediately after creation, causing all subsequent operations to fail. Moving the cancel() call after operations are complete ensures proper connection establishment.

    High
    Fix non-existent Go version

    The Go version 1.23 specified in the workflow doesn't exist yet. The latest
    stable Go version as of March 2025 would be around 1.22.x. Using a non-existent
    version will cause the workflow to fail.

    .github/workflows/release.yml [19-22]

     - name: Setup Go
       uses: actions/setup-go@v2
       with:
    -    go-version: 1.23
    +    go-version: 1.22
    • Apply this suggestion
    Suggestion importance[1-10]: 10

    __

    Why: The suggestion correctly identifies a critical issue where the workflow specifies Go version 1.23, which doesn't exist yet. This would cause the entire workflow to fail, making this a high-priority fix.

    High
    Fix client assignment

    The code is incorrectly assigning the client to the old endpoint
    (m.BestEndpoint.Client = client) before overwriting the entire m.BestEndpoint
    reference with nextEndpoint. This means the client connection is lost. Instead,
    assign the client to the nextEndpoint before setting it as the best endpoint.

    rpc/rpc.go [224-232]

     // Try fetching block number to validate the endpoint
     _, err = m.fetchBlockNumberWithTimeout(ctx, client)
     if err != nil {
         logrus.Errorf("Failed to fetch block number for endpoint %s: %v", nextEndpoint.URL, err)
         continue
     }
     
    +// Cancel the context after the operations complete.
    +cancel()
    +
     // Successfully connected and validated, update the best client and endpoint
    -m.BestEndpoint.Client = client
    +// (Make sure that the client is correctly associated with the endpoint.)
    +nextEndpoint.Client = client
     m.BestEndpoint = nextEndpoint

    [To ensure code accuracy, apply this suggestion manually]

    Suggestion importance[1-10]: 9

    __

    Why: This suggestion fixes a critical bug where the client connection would be lost due to incorrect assignment order. The fix properly associates the client with the new endpoint before setting it as the best endpoint, preventing connection issues.

    High
    Prevent index out of range

    The code attempts to access an element in the sortedProposedBlocks array without
    validating that numOfProposedBlocks is within the array's bounds. If
    numOfProposedBlocks is larger than the length of sortedProposedBlocks, this will
    cause an index out of range panic. Add validation to ensure numOfProposedBlocks
    is valid and within the array's bounds.

    cmd/propose.go [115-123]

     log.Debug("Propose: Sorted proposed blocks: ", sortedProposedBlocks)
    +
    +if numOfProposedBlocks <= 0 || len(sortedProposedBlocks) < int(numOfProposedBlocks) {
    +    log.Errorf("Invalid numOfProposedBlocks (%d) or mismatch with sortedProposedBlocks length (%d)", numOfProposedBlocks, len(sortedProposedBlocks))
    +    return errors.New("proposed blocks count mismatch")
    +}
    +
     lastBlockIndex := sortedProposedBlocks[numOfProposedBlocks-1]
     log.Debug("Propose: Last block index: ", lastBlockIndex)
    • Apply this suggestion
    Suggestion importance[1-10]: 9

    __

    Why: This suggestion prevents a potential panic by adding validation to ensure numOfProposedBlocks is within the bounds of the sortedProposedBlocks array. This is a critical safety check that prevents runtime crashes.

    High
    • Update

    Copy link

    @astronaut-aldrin astronaut-aldrin bot left a comment

    Choose a reason for hiding this comment

    The reason will be displayed to describe this comment to others. Learn more.

    Error during code review: 'NoneType' object is not subscriptable

    Copy link

    @astronaut-aldrin astronaut-aldrin bot left a comment

    Choose a reason for hiding this comment

    The reason will be displayed to describe this comment to others. Learn more.

    🚀 Aldrin's Code Review 🌕


    Overall Summary

    This PR introduces RPC endpoint switching, stale block detection, and comprehensive testing. Key enhancements include improved fault tolerance in block monitoring and updated CI workflows. The changes align with the goals of boosting reliability and test coverage. Let's dive into the details!


    Issues Found

    🚨 1. Potential Race Condition in RPCManager Switch (rpc.go)

    Code Snippet:

    func (m *RPCManager) SwitchToNextBestRPCClient() (bool, error) {
        // ... 
        for i, endpoint := range m.Endpoints {
            if endpoint.URL == m.BestEndpoint.URL {
                startIdx = i + 1
                break
            }
        }
        // Iterate remaining endpoints to find the next best
    }

    Commentary:
    ⚠️ Warning (2/5)
    The SwitchToNextBestRPCClient method isn’t thread-safe. Concurrent calls could lead to inconsistent BestEndpoint state. In production, this might cause nodes to switch to suboptimal endpoints under high load.

    Fix Suggestion:
    Wrap the method body with a mutex to ensure atomicity:

    var switchMutex sync.Mutex
    
    func (m *RPCManager) SwitchToNextBestRPCClient() (bool, error) {
        switchMutex.Lock()
        defer switchMutex.Unlock()
        // ... existing logic ...
    }
    🔥 2. Missing Context Cancellation in RPC Client Setup (rpc.go)

    Code Snippet:

    ctx, cancel := context.WithTimeout(...)
    client, err := ethclient.DialContext(ctx, nextEndpoint.URL)
    if err != nil {
        cancel()
        // ...
    }
    // ... block fetch ...
    cancel()

    Commentary:
    🔥 Critical (3/5)
    Calling cancel() only on error risks leaving the context active if the block fetch succeeds but later operations hang. This could lead to resource leaks.

    Fix Suggestion:
    Use defer cancel() immediately after context creation:

    ctx, cancel := context.WithTimeout(...)
    defer cancel()
    // ... rest of the code ...
    ⚠️ 3. Unvalidated Block Header in Vote Logic (vote.go)

    Code Snippet:

    latestHeader := blockMonitor.GetLatestBlock()
    if latestHeader == nil {
        log.Error("Block monitor returned nil header")
        continue
    }

    Commentary:
    ⚠️ Warning (2/5)
    While nil checks are present, a nil header could stall voting indefinitely. Add a retry-with-timeout mechanism to recover from transient monitor failures.

    Fix Suggestion:

    retries := 3
    for i := 0; i < retries; i++ {
        latestHeader := blockMonitor.GetLatestBlock()
        if latestHeader != nil {
            break
        }
        time.Sleep(2 * time.Second)
    }
    if latestHeader == nil {
        return errors.New("failed to fetch block after retries")
    }
    ⚠️ 4. Hardcoded Context Timeout in RPC Manager (rpc.go)

    Code Snippet:

    ctx, cancel := context.WithTimeout(context.Background(), core.EndpointsContextTimeout*time.Second)

    Commentary:
    ⚠️ Warning (1/5)
    The timeout is hardcoded via core.EndpointsContextTimeout. If this value isn’t configurable, operators can’t adjust it for varying network conditions.

    Fix Suggestion:
    Make the timeout configurable via environment variables or CLI parameters.


    Reviewed Files

    • block.go, rpc.go, propose.go, vote.go, cmd-utils.go
    • CI files: ci.yml, develop.yml, release.yml
    • Tests: block_test.go, rpc_test.go, propose_test.go

    Positive Feedback

    🌈 Well Done!

    • Tests: The block_test.go and rpc_test.go are stellar! Mock servers and atomic counters for RPC switching? Brilliant.
    • Stale Handling: The block monitor’s RPC failover logic is clean and production-grade.
    • Safety Checks: Adding validation in propose.go for proposed blocks count is a proactive move against index errors 👏.

    Keep those tests coming—they’re the astronaut’s safety harness! 👨🚀🔗


    Review generated by: perplexity/r1-1776

    @Yashk767 Yashk767 merged commit 068d798 into develop Mar 24, 2025
    10 checks passed
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Projects
    None yet
    Development

    Successfully merging this pull request may close these issues.

    4 participants