-
Notifications
You must be signed in to change notification settings - Fork 2.2k
rpcserver: avoid false negatives in dispatcher during the initial sync #10280
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
rpcserver: avoid false negatives in dispatcher during the initial sync #10280
Conversation
Summary of ChangesHello @mohamedawnallah, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This PR optimizes neutrino rescans by integrating the wallet's birthday into the rescan logic, significantly speeding up blockchain synchronization for newly created wallets. It modifies Highlights
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request introduces an optimization for neutrino rescan by using the wallet's birthday. The changes are well-contained and the logic is sound. I've identified a few areas for improvement, mainly related to code formatting and documentation, to align with the repository's style guide.
0f26e5b
to
dae9086
Compare
Can we get a brief recap in the OP re: what happened, was it a regression, why this fixes it? Another follow up: if the regression was introduced in How can we develop a regression test for this issue? |
Hi @shocknet-justin, Would it be possible to try this potential patch if that resolves your issue (#10180)? I can try to write a reproducible test (admittedly not the easiest thing to do) but having +1/-1 that this changeset works/not works can be helpful |
Thank you for looking into this one @mohamedawnallah. I believe I'm experiencing the same issue as @shocknet-justin. I tried running your patches, but it seems like there's still weird behavior on first run.
The issue can be seen with:
Let me know if I can help you or perhaps test something. |
I've got some time sensitive materials I need to finish today but will test in the next day or two. It looks like Hampus is still reproducing it however, the --noseedbackup flag I believe should be triggering the wallet create as quickly as my deployment scripts would. |
dae9086
to
16e24f1
Compare
Benchmarking ReportWith the changeset in this PRWe bring LND v0.20.0 latest release candidate to normal initial sync times that was previously taking days to complete under ideal environmental conditions now takes just over a minute. Benchmarks in Raw Formatdev@dev:~/lnd$ sudo ./compare_lnd_versions.sh
==========================================
LND Version Comparison Benchmark
==========================================
Comparing 4 versions:
- v0.18.3-beta
- v0.18.4-beta
- v0.18.5-beta
- v0.20.0-beta.rc1-after
Each version will run in isolated namespaces
for fair comparison.
>>> Running v0.18.3-beta (1/4)...
==========================================
Benchmarking LND v0.18.3-beta
Using temporary storage: /tmp/lnd-bench-v0.18.3-beta-tQ9SWS
Resource Limits:
- CPU: 4 cores
- Memory: 8G
==========================================
Starting LND in isolated namespace with tmpfs...
Started LND (PID: 851327)
Monitoring sync status...
To manually check status, run:
sudo nsenter --target 851327 --pid --mount --uts --ipc \
/home/dev/lnd/lncli-debug-v0.18.3-beta --lnddir=/tmp/lnd_data_ns/lnd getinfo
[Sample 0] Block: unknown | synced: false | Memory: 348.76 MB | CPU: 0%
[Sample 1] Block: unknown | synced: false | Memory: 437.65 MB | CPU: 223.00%
[Sample 2] Block: 150000 | synced: false | Memory: 119.67 MB | CPU: 190.00%
[Sample 3] Block: 243000 | synced: false | Memory: 131.54 MB | CPU: 158.00%
[Sample 4] Block: 337000 | synced: false | Memory: 150.61 MB | CPU: 171.00%
[Sample 5] Block: 412000 | synced: false | Memory: 160.77 MB | CPU: 151.00%
[Sample 6] Block: 464000 | synced: false | Memory: 174.42 MB | CPU: 143.00%
[Sample 7] Block: 536000 | synced: false | Memory: 193.35 MB | CPU: 157.00%
[Sample 8] Block: 635000 | synced: false | Memory: 207.39 MB | CPU: 167.00%
[Sample 9] Block: 700000 | synced: false | Memory: 217.50 MB | CPU: 153.00%
[Sample 10] Block: 764000 | synced: false | Memory: 242.05 MB | CPU: 154.00%
[Sample 11] Block: 822000 | synced: false | Memory: 253.53 MB | CPU: 152.00%
[Sample 12] Block: 914000 | synced: false | Memory: 278.78 MB | CPU: 144.00%
synced_to_chain
✓ synced_to_chain: true (block height: 918649)
Stopping LND...
========================================
BENCHMARK RESULTS - v0.18.3-beta
========================================
Execution Time: 70.434696628 seconds
Peak Memory Usage: 437.65 MB
Average Memory Usage: 231.76 MB
Average CPU Usage: 157.76%
Samples Collected: 14
Results saved to: benchmark_v0.18.3-beta.json
Temporary directory will be cleaned up automatically
Cleaning up temporary directory: /tmp/lnd-bench-v0.18.3-beta-tQ9SWS
Waiting 10 seconds before next benchmark...
>>> Running v0.18.4-beta (2/4)...
==========================================
Benchmarking LND v0.18.4-beta
Using temporary storage: /tmp/lnd-bench-v0.18.4-beta-2Z0bCV
Resource Limits:
- CPU: 4 cores
- Memory: 8G
==========================================
Starting LND in isolated namespace with tmpfs...
Started LND (PID: 852086)
Monitoring sync status...
To manually check status, run:
sudo nsenter --target 852086 --pid --mount --uts --ipc \
/home/dev/lnd/lncli-debug-v0.18.4-beta --lnddir=/tmp/lnd_data_ns/lnd getinfo
[Sample 0] Block: unknown | synced: false | Memory: 363.35 MB | CPU: 0%
[Sample 1] Block: unknown | synced: false | Memory: 454.38 MB | CPU: 240.00%
[Sample 2] Block: 197000 | synced: false | Memory: 118.01 MB | CPU: 183.00%
[Sample 3] Block: 275000 | synced: false | Memory: 132.96 MB | CPU: 158.00%
[Sample 4] Block: 351000 | synced: false | Memory: 150.52 MB | CPU: 153.00%
[Sample 5] Block: 432000 | synced: false | Memory: 163.67 MB | CPU: 158.00%
[Sample 6] Block: 508000 | synced: false | Memory: 178.56 MB | CPU: 164.00%
[Sample 7] Block: 583000 | synced: false | Memory: 190.03 MB | CPU: 156.00%
[Sample 8] Block: 635000 | synced: false | Memory: 209.26 MB | CPU: 148.00%
[Sample 9] Block: 705000 | synced: false | Memory: 226.96 MB | CPU: 151.00%
[Sample 10] Block: 791000 | synced: false | Memory: 248.11 MB | CPU: 161.00%
[Sample 11] Block: 816000 | synced: false | Memory: 255.58 MB | CPU: 132.00%
[Sample 12] Block: 918000 | synced: false | Memory: 251.60 MB | CPU: 132.00%
synced_to_chain
✓ synced_to_chain: true (block height: 918649)
Stopping LND...
========================================
BENCHMARK RESULTS - v0.18.4-beta
========================================
Execution Time: 70.358434022 seconds
Peak Memory Usage: 454.38 MB
Average Memory Usage: 234.59 MB
Average CPU Usage: 157.61%
Samples Collected: 14
Results saved to: benchmark_v0.18.4-beta.json
Temporary directory will be cleaned up automatically
Cleaning up temporary directory: /tmp/lnd-bench-v0.18.4-beta-2Z0bCV
Waiting 10 seconds before next benchmark...
>>> Running v0.18.5-beta (3/4)...
==========================================
Benchmarking LND v0.18.5-beta
Using temporary storage: /tmp/lnd-bench-v0.18.5-beta-XHFt2i
Resource Limits:
- CPU: 4 cores
- Memory: 8G
==========================================
Starting LND in isolated namespace with tmpfs...
Started LND (PID: 852853)
Monitoring sync status...
To manually check status, run:
sudo nsenter --target 852853 --pid --mount --uts --ipc \
/home/dev/lnd/lncli-debug-v0.18.5-beta --lnddir=/tmp/lnd_data_ns/lnd getinfo
[Sample 0] Block: unknown | synced: false | Memory: 352.79 MB | CPU: 0%
[Sample 1] Block: unknown | synced: false | Memory: 475.08 MB | CPU: 244.00%
[Sample 2] Block: 207000 | synced: false | Memory: 116.33 MB | CPU: 180.00%
[Sample 3] Block: 256000 | synced: false | Memory: 138.87 MB | CPU: 144.00%
[Sample 4] Block: 347000 | synced: false | Memory: 156.42 MB | CPU: 163.00%
[Sample 5] Block: 434000 | synced: false | Memory: 166.09 MB | CPU: 163.00%
[Sample 6] Block: 514000 | synced: false | Memory: 179.01 MB | CPU: 161.00%
[Sample 7] Block: 571000 | synced: false | Memory: 190.89 MB | CPU: 150.00%
[Sample 8] Block: 635000 | synced: false | Memory: 208.35 MB | CPU: 143.00%
[Sample 9] Block: 721000 | synced: false | Memory: 221.60 MB | CPU: 163.00%
[Sample 10] Block: 795000 | synced: false | Memory: 237.88 MB | CPU: 162.00%
[Sample 11] Block: 854000 | synced: false | Memory: 262.40 MB | CPU: 156.00%
synced_to_chain
✓ synced_to_chain: true (block height: 918649)
Stopping LND...
========================================
BENCHMARK RESULTS - v0.18.5-beta
========================================
Execution Time: 65.119188935 seconds
Peak Memory Usage: 475.08 MB
Average Memory Usage: 230.24 MB
Average CPU Usage: 162.00%
Samples Collected: 13
Results saved to: benchmark_v0.18.5-beta.json
Temporary directory will be cleaned up automatically
Cleaning up temporary directory: /tmp/lnd-bench-v0.18.5-beta-XHFt2i
Waiting 10 seconds before next benchmark...
>>> Running v0.20.0-beta.rc1-after (4/4)...
==========================================
Benchmarking LND v0.20.0-beta.rc1-after
Using temporary storage: /tmp/lnd-bench-v0.20.0-beta.rc1-after-LeHuC4
Resource Limits:
- CPU: 4 cores
- Memory: 8G
==========================================
Starting LND in isolated namespace with tmpfs...
Started LND (PID: 854003)
Monitoring sync status...
To manually check status, run:
sudo nsenter --target 854003 --pid --mount --uts --ipc \
/home/dev/lnd/lncli-debug-v0.20.0-beta.rc1-after --lnddir=/tmp/lnd_data_ns/lnd getinfo
[Sample 0] Block: unknown | synced: false | Memory: 356.35 MB | CPU: 0%
[Sample 1] Block: unknown | synced: false | Memory: 489.78 MB | CPU: 249.00%
[Sample 2] Block: 207000 | synced: false | Memory: 144.97 MB | CPU: 222.00%
[Sample 3] Block: 278000 | synced: false | Memory: 172.72 MB | CPU: 203.00%
[Sample 4] Block: 355000 | synced: false | Memory: 207.43 MB | CPU: 221.00%
[Sample 5] Block: 436000 | synced: false | Memory: 242.13 MB | CPU: 224.00%
[Sample 6] Block: 510000 | synced: false | Memory: 269.44 MB | CPU: 216.00%
[Sample 7] Block: 577000 | synced: false | Memory: 303.05 MB | CPU: 222.00%
[Sample 8] Block: 657000 | synced: false | Memory: 324.72 MB | CPU: 228.00%
[Sample 9] Block: 721000 | synced: false | Memory: 354.95 MB | CPU: 214.00%
[Sample 10] Block: 771000 | synced: false | Memory: 378.89 MB | CPU: 208.00%
[Sample 11] Block: 795000 | synced: false | Memory: 405.87 MB | CPU: 183.00%
[Sample 12] Block: 872000 | synced: false | Memory: 392.42 MB | CPU: 227.00%
synced_to_chain
✓ synced_to_chain: true (block height: 918649)
Stopping LND...
========================================
BENCHMARK RESULTS - v0.20.0-beta.rc1-after
========================================
Execution Time: 70.416534861 seconds
Peak Memory Usage: 489.78 MB
Average Memory Usage: 308.40 MB
Average CPU Usage: 208.15%
Samples Collected: 14
Results saved to: benchmark_v0.20.0-beta.rc1-after.json
Temporary directory will be cleaned up automatically
Cleaning up temporary directory: /tmp/lnd-bench-v0.20.0-beta.rc1-after-LeHuC4
==========================================
COMPARISON RESULTS
==========================================
Metric | v0.18.3-beta | v0.18.4-beta | v0.18.5-beta | v0.20.0-beta.rc1-after
------------------------- | ----------------- | ----------------- | ----------------- | -----------------
Execution Time (s) | 70.434696628 | 70.358434022 | 65.119188935 | 70.416534861
Peak Memory (MB) | 437.65 | 454.38 | 475.08 | 489.78
Average Memory (MB) | 231.76 | 234.59 | 230.24 | 308.4
Average CPU (%%) | 157.76 | 157.61 | 162 | 208.15
Summary:
--------
✓ Fastest: v0.18.5-beta (65.119188935s)
✓ Lowest Peak Memory: v0.18.3-beta (437.65 MB)
✓ Lowest Average CPU: v0.18.4-beta (157.61%)
Detailed logs available in:
lnd_v0.18.3-beta.log
lnd_v0.18.4-beta.log
lnd_v0.18.5-beta.log
lnd_v0.20.0-beta.rc1-after.log ConclusionsExecution time for v0.20.0-beta.rc1 with the changeset in this PR remains competitive at 70.42 seconds, matching the non-regression performance of v0.18.3-beta and v0.18.4-beta (70.43s and 70.36s respectively) while being only 8.1% slower than v0.18.5-beta (65.12s). Most importantly, this represents a dramatic recovery from the multi-day sync times that plagued v0.19.0-beta, v0.19.1-beta, v0.19.2-beta, v0.19.3-beta, and v0.20.0 current release candidates - what previously took days now completes in just over a minute, making LND v0.20.0 fully operational again. Constraints
Reproducibility
Resources
|
cc @hsjoberg, @shocknet-justin Pushed a new changeset and it will solve the wallet sync issues you experienced and this one as well #9081 |
@mohamedawnallah I think this partially solved the issue, in that lnd reports syncd_to_chain=true after headers complete, which should unblock our deployments, but what I understand to be the rescan is still taking place after headers sync: ![]() |
It took me two hours to identify the underlying root cause. The changeset I submitted was addressing the symptom rather than the core issue. I'll submit a new changeset tomorrow that fully resolves the root cause. Thanks for being responsive during the resolution process! |
// Sync state transitions as dispatcher catches up: | ||
// Gap >10 blocks: synced (ignore dispatcher lag during catchup) | ||
// Gap 1-10 blocks: not synced (strict: must be at exact tip) | ||
// Gap 0 blocks: synced (dispatcher caught up) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
blockbeatDispatcher
is receiving blocks and dispatching them to blockbeat
subscribers, I don't think it's safe to assume we've synced. It's likely that one of the subsystems is still busy processing the block, causing the blockbeat
to stall. We should instead identify what's causing the stall fix properly fix it, other than giving user the false impression that lnd
is synced.
Okay today's experiments haven't come to a meaningful conclusion fix. I guess it is more of race condition/not up-to-date height value after the chain notifier moved from its place to there. I am 100% confident that is the main changeset that introduced this regression is this one: 16a8b623b Building ANY LND before that commit will NOT TRIGGER any premature wallet rescanning. Will try tomorrow with different experiments :) |
Change Description
The original regression caused by this changeset 16a8b623b.
Closes #10180.
Closes #9081.
Steps to Test
Steps for reviewers to follow to test the change.
Pull Request Checklist
Testing
Code Style and Documentation
[skip ci]
in the commit message for small changes.📝 Please see our Contribution Guidelines for further guidance.