Failed to receive a message from Overseer: Signal channel is terminated and empty. #2540

@ioannist

Moonbeam-skylake 0.33, running as a full node.

I am running a full node and querying it heavily over localhost. After approximately 70 blocks, the Moonbeam node crashes with:

Oct 28 05:27:55 stakebaby-chalandri moonbeam[2508374]: 2023-10-28 05:27:55 [Relaychain] ✨ Imported #17915340 (0x8d94…96f5)
Oct 28 05:27:55 stakebaby-chalandri moonbeam[2508374]: 2023-10-28 05:27:55 [Relaychain] 💤 Idle (6 peers), best: #17915340 (0x8d94…96f5), finalized #17915337 (0x2545…00b5), ⬇ 30.2kiB/s ⬆ 7.0kiB/s
Oct 28 05:27:55 stakebaby-chalandri moonbeam[2508374]: 2023-10-28 05:27:55 [🌗] ⚙️  Preparing  0.0 bps, target=#4743523 (11 peers), best: #4743508 (0x95c1…3d43), finalized #4743505 (0x7bff…d019), ⬇ 4.9kiB/s ⬆ 89 B/s
Oct 28 05:27:56 stakebaby-chalandri moonbeam[2508374]: 2023-10-28 05:27:56 [Relaychain] cannot query the runtime API version: Api called for an unknown Block: State already discarded for 0x5adec8fe76ac16a0e2ff5bb1333dab8d683b67ab6fbda537c577511b3d8c511b
Oct 28 05:27:56 stakebaby-chalandri moonbeam[2508374]: 2023-10-28 05:27:56 [Relaychain] Failed to fetch runtime API data for job err=NotSupported { runtime_api_name: "validator_groups" }
Oct 28 05:27:56 stakebaby-chalandri moonbeam[2508374]: 2023-10-28 05:27:56 [Relaychain] cannot query the runtime API version: Api called for an unknown Block: State already discarded for 0x5adec8fe76ac16a0e2ff5bb1333dab8d683b67ab6fbda537c577511b3d8c511b
Oct 28 05:27:58 stakebaby-chalandri moonbeam[2508374]: 2023-10-28 05:27:58 [Relaychain] Failed to receive a message from Overseer, exiting err=Generated(Context("Signal channel is terminated and empty."))
Oct 28 05:27:58 stakebaby-chalandri moonbeam[2508374]: 2023-10-28 05:27:58 [Relaychain] err=Subsystem(Generated(Context("Signal channel is terminated and empty.")))
Oct 28 05:27:58 stakebaby-chalandri moonbeam[2508374]: 2023-10-28 05:27:58 [Relaychain] error receiving message from subsystem context: Generated(Context("Signal channel is terminated and empty.")) err=Generated(Context("Signal channel is terminated and empty."))
Oct 28 05:27:58 stakebaby-chalandri moonbeam[2508374]: 2023-10-28 05:27:58 [Relaychain] subsystem exited with error subsystem="statement-distribution-subsystem" err=FromOrigin { origin: "statement-distribution", source: SubsystemReceive(Generated(Context("Signal channel is terminated and empty."))) }
Oct 28 05:27:58 stakebaby-chalandri moonbeam[2508374]: 2023-10-28 05:27:58 [Relaychain] subsystem exited with error subsystem="network-bridge-rx-subsystem" err=FromOrigin { origin: "network-bridge", source: SubsystemError(Generated(Context("Signal channel is terminated and empty."))) }
Oct 28 05:27:58 stakebaby-chalandri moonbeam[2508374]: 2023-10-28 05:27:58 [Relaychain] subsystem exited with error subsystem="dispute-distribution-subsystem" err=FromOrigin { origin: "dispute-distribution", source: SubsystemReceive(Generated(Context("Signal channel is terminated and empty."))) }
Oct 28 05:27:58 stakebaby-chalandri moonbeam[2508374]: 2023-10-28 05:27:58 [Relaychain] subsystem exited with error subsystem="availability-recovery-subsystem" err=FromOrigin { origin: "availability-recovery", source: Generated(Context("Signal channel is terminated and empty.")) }
Oct 28 05:27:58 stakebaby-chalandri moonbeam[2508374]: 2023-10-28 05:27:58 [Relaychain] subsystem exited with error subsystem="bitfield-signing-subsystem" err=FromOrigin { origin: "bitfield-signing", source: Generated(Context("Signal channel is terminated and empty.")) }
Oct 28 05:27:58 stakebaby-chalandri moonbeam[2508374]: 2023-10-28 05:27:58 [Relaychain] subsystem exited with error subsystem="candidate-validation-subsystem" err=FromOrigin { origin: "candidate-validation", source: Generated(Context("Signal channel is terminated and empty.")) }
Oct 28 05:27:58 stakebaby-chalandri moonbeam[2508374]: 2023-10-28 05:27:58 [Relaychain] subsystem exited with error subsystem="provisioner-subsystem" err=FromOrigin { origin: "provisioner", source: OverseerExited(Generated(Context("Signal channel is terminated and empty."))) }
Oct 28 05:27:58 stakebaby-chalandri moonbeam[2508374]: 2023-10-28 05:27:58 [Relaychain] subsystem exited with error subsystem="network-bridge-tx-subsystem" err=FromOrigin { origin: "network-bridge", source: SubsystemError(Generated(Context("Signal channel is terminated and empty."))) }
Oct 28 05:27:58 stakebaby-chalandri moonbeam[2508374]: 2023-10-28 05:27:58 [Relaychain] subsystem exited with error subsystem="chain-api-subsystem" err=FromOrigin { origin: "chain-api", source: Generated(Context("Signal channel is terminated and empty.")) }
Oct 28 05:27:58 stakebaby-chalandri moonbeam[2508374]: 2023-10-28 05:27:58 [Relaychain] Overseer exited with error err=Generated(SubsystemStalled("approval-distribution-subsystem"))
Oct 28 05:27:58 stakebaby-chalandri moonbeam[2508374]: 2023-10-28 05:27:58 [Relaychain] Essential task `overseer` failed. Shutting down service.

This appears to be related to heavy load or concurrency, because the node does not crash when I lower the query rate. For context, the node is queried by 7-12 NodeJS processes, each of which can execute up to 300 queries concurrently. The moonbeam process averages 200%-450% CPU (roughly 2 to 4.5 logical cores). I have tried different block ranges and the error persists, so I don't think it is related to database corruption.
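For anyone who wants to reproduce the "ease down on the query rate" behaviour, here is a minimal sketch of how each NodeJS process could cap its in-flight queries. It is only an illustration under assumptions: `ConcurrencyLimiter` is a hypothetical helper written for this example, not part of any Moonbeam or Polkadot API, and `queryNode` in the usage note stands in for whatever RPC call the client actually makes.

```typescript
// Minimal promise-based semaphore: at most `limit` tasks run at the same time.
// Hypothetical helper for illustration; not part of any Moonbeam/Polkadot API.
class ConcurrencyLimiter {
  private readonly limit: number;
  private active = 0;
  private readonly waiting: Array<() => void> = [];

  constructor(limit: number) {
    if (!Number.isInteger(limit) || limit < 1) {
      throw new RangeError("limit must be a positive integer");
    }
    this.limit = limit;
  }

  // Runs `task` once a slot is free and resolves with the task's result.
  async run<T>(task: () => Promise<T>): Promise<T> {
    if (this.active >= this.limit) {
      // No free slot: wait until a finishing task hands its slot over.
      await new Promise<void>((resolve) => this.waiting.push(resolve));
    } else {
      this.active++;
    }
    try {
      return await task();
    } finally {
      const next = this.waiting.shift();
      if (next) {
        next(); // pass the slot directly to the next waiter; `active` is unchanged
      } else {
        this.active--; // nobody is waiting, so free the slot
      }
    }
  }
}
```

Usage would look something like `const limiter = new ConcurrencyLimiter(50);` followed by `await Promise.all(blocks.map((b) => limiter.run(() => queryNode(b))));`, where both `queryNode` and the value 50 are placeholders. The limit applies per process, so with 7-12 processes the total load on the node is the per-process limit multiplied by the process count.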

This looks like a Polkadot issue, but I am not sure whether it has been resolved or ignored:
paritytech/polkadot#6624
