Skip to content

Conversation

@izquierdo-datachain
Copy link
Contributor

@izquierdo-datachain izquierdo-datachain commented Jun 27, 2025

Visiting every block when searching for the hard fork boundary (when the boundary is specified by timestamp instead of by height) can cause the relayer to hang for a long time without processing packets.

This PR implements a faster version of the boundary search.

@izquierdo-datachain izquierdo-datachain force-pushed the faster-get-boundary-height branch 2 times, most recently from 0824161 to 6cecbb9 Compare June 27, 2025 07:46
@izquierdo-datachain izquierdo-datachain marked this pull request as ready for review June 27, 2025 07:58
@izquierdo-datachain izquierdo-datachain force-pushed the faster-get-boundary-height branch from 6cecbb9 to b174867 Compare June 30, 2025 01:44

avgBlocksPerMs := blockCountPrevCur / float64(timeDiffPrevCur)

if avgBlocksPerMs <= 0 {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What scenario is this condition indented for? From what I can tell in the code, this condition will never be met because both blockCountPrevCur and timeDiffPrevCur appears to always be positive.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll answer separately for the two sub-scenarios: avgBlocksPerMs == 0 and avgBlocksPerMs < 0:

  • avgBlocksPerMs == 0

With float64 division, this number can become 0 for if blockCountPrevCur is very small and timeDiffPrevCur is very big. In normal scenarios this shouldn't happen, but I left it as a prevention measure.

For reference, there's one scenario I found where avgBlocksPerMs would be very small which occurs when running a new test node locally, because block 0 has a timestamp from year 2020, and blocks 1 and further have a timestamp of e.g. 2025, so avgBlocksPerMs ends up being something like "1 block per every 5 years". While this particular scenario is not enough to make the average rate be 0, I preferred to be defensive and catch that potential scenario.

  • avgBlocksPerMs < 0

I agree that from this code, this condition can never be met. However, if we're going to be checking for == 0 I thought we might as well also check for <. Ideally what I would prefer here is an assertion statement though.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the explanation!
I tried to reproduce the scenario you described but I couldn't: https://go.dev/play/p/6CHHW7OZqC9

Would the following change make sense instead?

diff --git a/module/fork_spec.go b/module/fork_spec.go
index 46d3584..d7716cd 100644
--- a/module/fork_spec.go
+++ b/module/fork_spec.go
@@ -346,6 +346,13 @@ func estimateDistance(previousHeader, currentHeader *types.Header, targetTs uint
        blockCountPrevCurBig = blockCountPrevCurBig.Abs(blockCountPrevCurBig)
        blockCountPrevCur, _ := blockCountPrevCurBig.Float64()

+       if blockCountPrevCur == 0 {
+               // Blocks are being produced so slowly that the current block is still expected
+               // to be the latest block at any future timestamp. Return 1 to avoid getting stuck
+               // in the current block.
+               return 1
+       }
+
        if currentTs > previousTs {
                timeDiffPrevCur = currentTs - previousTs
        } else {
@@ -365,13 +372,5 @@ func estimateDistance(previousHeader, currentHeader *types.Header, targetTs uint
        }

        avgBlocksPerMs := blockCountPrevCur / float64(timeDiffPrevCur)
-
-       if avgBlocksPerMs <= 0 {
-               // Blocks are being produced so slowly that the current block is still expected
-               // to be the latest block at any future timestamp. Return 1 to avoid getting stuck
-               // in the current block.
-               return 1
-       }
-
        return uint64(math.Ceil(avgBlocksPerMs * float64(timeDiffTargetCur)))
 }

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can see from the code that it cannot be lower than 0, but I think the reason for the discussion is that avgBlocksPerMs <= 0 is explicitly stated.
The following reverse branching might be better.

if avgBlocksPerMs > 0 {
      return uint64(math.Ceil(avgBlocksPerMs * float64(timeDiffTargetCur)))
}
return 1

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would the following change make sense instead?

While the early check makes sense, I would still like to check the result of avgBlocksPerMs. To clarify more:

https://go.dev/play/p/UwgHE3fMCUf

outputs

a > 0? true
b > 0? true
(a / b) == 0? true

So with float64, a > 0 && b > 0 does not imply a / b > 0 (even though in general math the implication does hold).

I tried to reproduce the scenario you described but I couldn't

Sorry for the confusion. In my local testing, avgBlocksPerMs did not become equal to 0 either. I only used that as an example of how the number can have an unexpectedly low value even though usually blocks are produced at a sane rate, which makes me worried about an unknown scenario where it becomes 0 causing the relayer to become stuck.

}, nil
}

func searchBoundaryHeight(ctx context.Context, currentHeight uint64, targetTs uint64, headerFn getHeaderFn) (uint64, error) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a really efficient and clever approach 👍 👍 👍

izquierdo-datachain and others added 2 commits July 1, 2025 08:52
Co-authored-by: Takeshi Arabiki <takeshi.arabiki@gmail.com>
Co-authored-by: Takeshi Arabiki <takeshi.arabiki@gmail.com>
Copy link
Contributor

@yoshidan yoshidan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it is a very nice and efficient logic.
I would suggest adding test code to searchBoundaryHeight and estimateDistance to prevent degreasing when modifying.


avgBlocksPerMs := blockCountPrevCur / float64(timeDiffPrevCur)

if avgBlocksPerMs <= 0 {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can see from the code that it cannot be lower than 0, but I think the reason for the discussion is that avgBlocksPerMs <= 0 is explicitly stated.
The following reverse branching might be better.

if avgBlocksPerMs > 0 {
      return uint64(math.Ceil(avgBlocksPerMs * float64(timeDiffTargetCur)))
}
return 1

Copy link
Contributor

@abicky abicky left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

@abicky abicky requested a review from yoshidan July 2, 2025 04:04
Copy link
Contributor

@yoshidan yoshidan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add the test code for searchBoundaryHeight and estimateDistance.

izquierdo-datachain and others added 2 commits July 3, 2025 15:09
Co-authored-by: Takeshi Arabiki <takeshi.arabiki@gmail.com>
Co-authored-by: Takeshi Arabiki <takeshi.arabiki@gmail.com>
@izquierdo-datachain
Copy link
Contributor Author

Thanks for catching the mixed up test names!

@izquierdo-datachain izquierdo-datachain merged commit 9c5cd37 into main Jul 3, 2025
2 checks passed
@izquierdo-datachain izquierdo-datachain deleted the faster-get-boundary-height branch July 3, 2025 06:36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants