Daemon incorrectly says it's synced to the network immediately after bootstrap #18194

@cjjdespres

Description

After the daemon finishes bootstrapping from genesis:

{"timestamp":"2025-12-03 00:59:23.218172Z","level":"Info","source":{"module":"Bootstrap_controller","location":"File \"src/lib/bootstrap_controller/bootstrap_controller.ml\", line 689, characters 2-13"},"message":"Bootstrap completed in $time_elapsed: $bootstrap_stats","metadata":{"bootstrap_stats":[{"cycle_result":"success","sync_ledger_time":46.8419623374939,"staged_ledger_data_download_time":12.495678663253784,"staged_ledger_construction_time":36.01002883911133,"local_state_sync_required":true,"local_state_sync_time":88.06876564025879}],"host":"<me>","peer_id":"<me>","port":8302,"time_elapsed":"214.918941 seconds"}}
{"timestamp":"2025-12-03 00:59:23.218221Z","level":"Info","source":{"module":"Transition_router","location":"File \"src/lib/transition_router/transition_router.ml\", line 111, characters 2-17"},"message":"Starting transition frontier controller phase","metadata":{"host":"<me>","peer_id":"<me>","port":8302},"event_id":"21ccae8c619bc2666474085272d5fe1d"}

it will immediately report that it's synced:

{"timestamp":"2025-12-03 00:59:23.218332Z","level":"Info","source":{"module":"Mina_lib","location":"File \"src/lib/mina_lib/mina_lib.ml\", line 559, characters 20-35"},"message":"Mina daemon is synced","metadata":{"host":"<me>","peer_id":"<me>","port":8302},"event_id":"cb082b7eafe2a7ca2c6c1acba6664251"}

The code that does this is currently here:

| Some (_, catchup_jobs) ->
    let logger = Logger.create () in
    if catchup_jobs > 0 then (
      [%str_log info] Ledger_catchup ;
      `Catchup )
    else (
      [%str_log info] Synced ;
      `Synced ) ) )

in the sync status observer. This is incorrect - the daemon still has a lot of catchup work to do at this point, and it ought to know that it does, because the bootstrap process will have dumped a whole lot of catchup work into the catchup scheduler. As the daemon continues to run, it will in fact do all of this catchup work.

This happens because:

  1. The Catchup_jobs module is supposed to track the current number of catchup jobs that have been scheduled, via the global reader and writer broadcast pipe pair that it contains.
  2. The sync status observer reacts to changes in the Catchup_jobs.reader end of the pipe, so it will recompute the sync status whenever that number changes. It will also use the value that it read to determine if the daemon is `Synced or still in `Catchup.
  3. The methods Catchup_jobs.incr, Catchup_jobs.decr, and Catchup_jobs.update are completely unused in this code base. Thus the value in Catchup_jobs.reader will start at zero and never change.

The end result is that the sync status observer will always think that there are zero catchup jobs, and so the sync status observer's value will never be `Catchup.
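The mechanism can be reproduced in a small toy model. This is only a sketch: `Catchup_jobs_model`, `observe_status` and `schedule_catchup` are hypothetical names, and a plain `ref` stands in for the broadcast pipe. It shows that when nothing calls `incr`, the observer reads zero and reports synced no matter how much work was scheduled.

```ocaml
(* Toy model of the bug. All names here are hypothetical stand-ins, not the
   real Mina modules: a plain ref replaces the broadcast pipe. *)
module Catchup_jobs_model = struct
  let count = ref 0 (* starts at zero, like the real reader *)

  let incr () = count := !count + 1

  let decr () = count := !count - 1

  let read () = !count
end

type sync_status = Synced | Catchup

(* Mirrors the branch quoted earlier: only the counter decides the status. *)
let observe_status () =
  if Catchup_jobs_model.read () > 0 then Catchup else Synced

(* Stand-in for the catchup scheduler receiving work after bootstrap.
   With [track = false] it behaves like the current code base and never
   touches the counter. *)
let schedule_catchup ~track jobs =
  List.iter (fun _job -> if track then Catchup_jobs_model.incr ()) jobs

(* Three jobs are scheduled, but the counter is not updated. *)
let status_without_tracking =
  schedule_catchup ~track:false [ "block_a"; "block_b"; "block_c" ] ;
  observe_status ()

(* Same jobs, this time with the counter updated on scheduling. *)
let status_with_tracking =
  schedule_catchup ~track:true [ "block_a"; "block_b"; "block_c" ] ;
  observe_status ()
```

In this model `status_without_tracking` is `Synced` and `status_with_tracking` is `Catchup`, which is the difference between the current behaviour and what the old normal catchup implementation would have produced.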

This code was not always dead - the original "normal" catchup implementation did call incr and decr. That implementation was removed in 3ebc296 because super catchup had been the default catchup method for a long time at that point. You can see the incr/decr calls in the removed normal_catchup.ml file.

This bug also affects the GraphQL newSyncUpdate subscription endpoint, because it uses changes in the value of the sync status observer to determine when to emit sync status updates.

Interestingly, it only partially affects mina client status. This is because of this code:

| `Synced | `Catchup ->
    if
      (Mina_lib.config t).demo_mode
      || abs (!max_block_height - blockchain_length) < sync_lag
    then `Active `Synced
    else `Active `Catchup

You can see that if the daemon's best tip has a height that's >= 5 away from the greatest height it's seen so far (more or less) then it'll override what the sync status observer says and just say it's in `Catchup regardless. This discrepancy is visible in the following mina client status excerpt:
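A simplified, standalone version of that branch makes the override easier to see. This is a sketch under assumptions: `displayed_status` is a hypothetical name, the `` `Active `` wrapper is dropped, and `sync_lag` is passed as 5 only because of the "more or less 5" description above.

```ocaml
type observer_status = [ `Synced | `Catchup ]

(* Hypothetical stand-in for the quoted branch of the status computation.
   [sync_lag] is an explicit argument here; the value 5 used below is an
   assumption taken from the prose, not read from the code base. *)
let displayed_status ~demo_mode ~sync_lag ~max_block_height ~blockchain_length
    (_observed : observer_status) : observer_status =
  (* The observer's value is ignored: both variants take the same path. *)
  if demo_mode || abs (max_block_height - blockchain_length) < sync_lag then
    `Synced
  else `Catchup

(* Observer says synced, but the best tip is 100 blocks behind. *)
let far_behind =
  displayed_status ~demo_mode:false ~sync_lag:5 ~max_block_height:1000
    ~blockchain_length:900 `Synced

(* Observer says synced and the best tip is only 2 blocks behind. *)
let nearly_caught_up =
  displayed_status ~demo_mode:false ~sync_lag:5 ~max_block_height:1000
    ~blockchain_length:998 `Synced
```

Here `far_behind` comes out as `` `Catchup `` and `nearly_caught_up` as `` `Synced ``, so the height check only masks the observer bug while the node is far from the tip.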

Sync status:                                   Synced
Catchup status:                                
	To build breadcrumb:           1
	To initial validate:           0
	Finished:                      294
	To download:                   0
	Waiting for parent to finish:  0
	To verify:                     0

Block producers running:                       0
Coinbase receiver:                             Block producer
Best tip consensus time:                       epoch=40, slot=3324
Best tip global slot (across all hard-forks):  734784
Consensus time now:                            epoch=40, slot=3329

I'm pretty sure the status should have said "Catchup" here (in an ideal world) because we still needed to build one breadcrumb in the catchup table.
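To make that reasoning concrete, here is a sketch of how a status could be derived from the catchup table itself. The record and both functions are hypothetical illustrations with field names copied from the table rows; nothing like this is claimed to exist in the code base, and it is not a proposed patch.

```ocaml
(* Field names follow the rows of the "Catchup status" table. The record
   itself is a hypothetical illustration, not a type from the code base. *)
type catchup_table =
  { to_build_breadcrumb : int
  ; to_initial_validate : int
  ; finished : int
  ; to_download : int
  ; waiting_for_parent_to_finish : int
  ; to_verify : int
  }

(* Every state except [finished] represents outstanding catchup work. *)
let pending_jobs t =
  t.to_build_breadcrumb + t.to_initial_validate + t.to_download
  + t.waiting_for_parent_to_finish + t.to_verify

let status_of_table t = if pending_jobs t > 0 then `Catchup else `Synced

(* Values taken from the status excerpt above. *)
let excerpt =
  { to_build_breadcrumb = 1
  ; to_initial_validate = 0
  ; finished = 294
  ; to_download = 0
  ; waiting_for_parent_to_finish = 0
  ; to_verify = 0
  }
```

With the excerpt's numbers, `pending_jobs excerpt` is 1, so `status_of_table excerpt` evaluates to `` `Catchup `` rather than the `Synced` that was actually displayed.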

Steps to Reproduce

  1. Start a daemon with a fresh config directory, and have it connect to devnet (for instance - mainnet works as well)
  2. Wait for it to finish bootstrapping and initializing the frontier
  3. Observe that it immediately says "Mina daemon is synced"
  4. Observe that there is no message "Mina daemon is doing ledger catchup" in the logs
  5. Observe all of the catchup/super catchup messages in the logs after this point, and also observe that in mina client status the Catchup status: table has a lot of non-zero entries everywhere.

Expected Result

The daemon shouldn't report itself as synced immediately after bootstrap when it knows it still has catchup work to do.

Actual Result

The daemon does in fact report itself as synced immediately after bootstrap, even though it knows it still has catchup work to do.

Daemon version

Commit 42ec1eb on compatible, but I'm pretty sure this has been broken for quite a long time

How frequently do you see this issue?

Frequently

What is the impact of this issue on your ability to run a node?

Low

Status

.

Additional information

No response
