nodeos randomly becomes unresponsive, consumes CPU #2230

@centromere

Hello. I am running nodeos 5.0.0, compiled with GCC 11 and LLVM 11. At random times throughout the day, the node stops responding to certain HTTP requests (but not others) and to most Unix signals, while still consuming CPU:

$ curl -q http://[...]:8888
{"code":404,"message":"Not Found","error":{"code":0,"name":"exception","what":"unspecified","details":[{"message":"Unknown Endpoint","file":"beast_http_session.hpp","line_number":185,"method":"handle_request"}]}}
# curl -v http://[...]:8888/v1/chain/get_info
*   Trying ...:8888...
* Connected to ... (...) port 8888 (#0)
> GET /v1/chain/get_info HTTP/1.1
> Host: [...]:8888
> User-Agent: curl/7.81.0
> Accept: */*
>
(Timeout occurs)
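
To pin down when the hang begins, the two requests above could be repeated on a timer with a hard timeout. The following is a minimal sketch, not part of the original report: `probe`, `watch`, the base URL, and the interval are hypothetical names and values chosen for illustration, and the host is a placeholder because the real address is elided above.

```python
import http.client
import time
import urllib.error
import urllib.request


def probe(url: str, timeout_s: float = 5.0) -> tuple[str, float]:
    """Issue one GET and return (outcome, elapsed_seconds)."""
    start = time.monotonic()
    try:
        with urllib.request.urlopen(url, timeout=timeout_s) as resp:
            outcome = f"http {resp.status}"
    except urllib.error.HTTPError as exc:
        # 4xx/5xx still means the HTTP thread answered (e.g. the 404 on "/").
        outcome = f"http {exc.code}"
    except (OSError, http.client.HTTPException) as exc:
        # Covers refused connections, resets, and socket timeouts.
        outcome = f"error: {exc}"
    return outcome, time.monotonic() - start


def watch(base: str, paths: list[str], rounds: int, interval_s: float) -> None:
    """Probe each path `rounds` times, printing one timestamped line per request."""
    for _ in range(rounds):
        for path in paths:
            outcome, elapsed = probe(base + path)
            stamp = time.strftime("%Y-%m-%dT%H:%M:%S")
            print(f"{stamp} {path} {outcome} {elapsed:.3f}s")
        time.sleep(interval_s)
```

A call such as `watch("http://127.0.0.1:8888", ["/", "/v1/chain/get_info"], rounds=60, interval_s=10)` (address assumed) would log both endpoints side by side, so the first line where `/v1/chain/get_info` times out while `/` still returns 404 can be matched against the nodeos log.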

Here are some recent logs from stdout:

info  2024-02-08T20:40:17.556 nodeos    producer_plugin.cpp:743       on_incoming_block    ] Received block 2020d6baa648f43f... #46431000 @ 2019-03-07T22:35:57.000 signed by eosbeijingbp [trxs: 13, lib: 46430669, confirmed: 0, net: 2776, cpu: 10797, elapsed: 4714, time: 6961, latency: 155426660555 ms]
info  2024-02-08T20:40:17.556 nodeos    producer_plugin.cpp:743       on_incoming_block    ] Received block 2020d6baa648f43f... #46431000 @ 2019-03-07T22:35:57.000 signed by eosbeijingbp [trxs: 13, lib: 46430669, confirmed: 0, net: 2776, cpu: 10797, elapsed: 4714, time: 6961, latency: 155426660555 ms]
info  2024-02-08T20:40:29.275 nodeos    producer_plugin.cpp:743       on_incoming_block    ] Received block 9ef53246f7cf0af9... #46432000 @ 2019-03-07T22:44:17.000 signed by eosauthority [trxs: 18, lib: 46431665, confirmed: 0, net: 2336, cpu: 9184, elapsed: 6011, time: 9737, latency: 155426172274 ms]
info  2024-02-08T20:40:29.275 nodeos    producer_plugin.cpp:743       on_incoming_block    ] Received block 9ef53246f7cf0af9... #46432000 @ 2019-03-07T22:44:17.000 signed by eosauthority [trxs: 18, lib: 46431665, confirmed: 0, net: 2336, cpu: 9184, elapsed: 6011, time: 9737, latency: 155426172274 ms]
error 2024-02-08T21:12:52.685 net-5     net_plugin.cpp:3006           operator()           ] ["peer1.eosphere.io:9876 - 1316ef2" - 5 198.50.156.24:9876] Closing connection
error 2024-02-08T21:12:52.899 net-0     net_plugin.cpp:2976           operator()           ] ["tf-eos-mainnet-api2:9876 - bf80c65" - 3 141.95.255.99:9876] Error reading message: Connection reset by peer
error 2024-02-08T21:12:52.899 net-0     net_plugin.cpp:3006           operator()           ] ["tf-eos-mainnet-api2:9876 - bf80c65" - 3 141.95.255.99:9876] Closing connection
error 2024-02-08T21:12:53.315 net-3     net_plugin.cpp:3006           operator()           ] ["135.148.55.208:9876 - a9df975" - 4 135.148.130.0:9876] Closing connection
error 2024-02-08T21:12:54.443 net-3     net_plugin.cpp:3006           operator()           ] ["api.eossupport.io:9876 - 7d48717" - 2 135.148.168.190:9876] Closing connection
error 2024-02-08T22:33:42.706 net-4     net_plugin.cpp:3006           operator()           ] ["peer1.eosphere.io:9876 - 1316ef2" - 5 198.50.156.24:9876] Closing connection
error 2024-02-08T22:33:43.074 net-6     net_plugin.cpp:2976           operator()           ] ["tf-eos-mainnet-api2:9876 - bf80c65" - 3 141.95.255.99:9876] Error reading message: Connection reset by peer
error 2024-02-08T22:33:43.074 net-6     net_plugin.cpp:3006           operator()           ] ["tf-eos-mainnet-api2:9876 - bf80c65" - 3 141.95.255.99:9876] Closing connection
error 2024-02-08T22:33:43.362 net-5     net_plugin.cpp:3006           operator()           ] ["135.148.55.208:9876 - a9df975" - 4 135.148.130.0:9876] Closing connection
error 2024-02-08T22:33:44.490 net-7     net_plugin.cpp:3006           operator()           ] ["api.eossupport.io:9876 - 7d48717" - 2 135.148.168.190:9876] Closing connection
error 2024-02-08T23:34:53.967 net-7     net_plugin.cpp:3006           operator()           ] ["tf-eos-mainnet-api2:9876 - bf80c65" - 3 141.95.255.99:9876] Closing connection
error 2024-02-08T23:34:54.523 net-5     net_plugin.cpp:3006           operator()           ] ["api.eossupport.io:9876 - 7d48717" - 2 135.148.168.190:9876] Closing connection
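
As a reading aid for the `latency` field in the `Received block` lines: it appears to be the gap between the block's own timestamp and the wall-clock time at which the block was received. The sketch below checks that reading against the first log line; it assumes both timestamps are in the same time zone, and `latency_ms` is a hypothetical helper, not a nodeos function.

```python
from datetime import datetime, timedelta


def latency_ms(received_at: str, block_time: str) -> int:
    """Whole milliseconds between a block's timestamp and its receipt time."""
    fmt = "%Y-%m-%dT%H:%M:%S.%f"
    received = datetime.strptime(received_at, fmt)
    produced = datetime.strptime(block_time, fmt)
    return (received - produced) // timedelta(milliseconds=1)


# Timestamps copied from the log line for block #46431000.
gap = latency_ms("2024-02-08T20:40:17.556", "2019-03-07T22:35:57.000")
print(gap, "ms, roughly", round(gap / 86_400_000), "days")
```

The result is within one millisecond of the logged value of 155426660555 ms, roughly 1799 days. If this reading is right, the node was still processing blocks from March 2019 when these lines were written, which may be relevant to the load it was under.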

nodeos is invoked as follows:

nodeos \
  --data-dir=/var/db/leap \
  --config-dir=/etc/leap/config \
  --plugin=eosio::chain_plugin \
  --plugin=eosio::chain_api_plugin \
  --protocol-features-dir=/etc/leap/protocol_features \
  --wasm-runtime=eos-vm-jit \
  --abi-serializer-max-time-ms=10000 \
  --chain-state-db-size-mb=32768 \
  --chain-threads=16 \
  --eos-vm-oc-compile-threads=4 \
  --disable-replay-opts \
  --plugin=eosio::http_plugin \
  --http-server-address=[::]:8888 \
  --http-max-response-time-ms=1000 \
  --verbose-http-errors \
  --http-validate-host=false \
  --http-threads=16 \
  --plugin=eosio::net_plugin \
  --plugin=eosio::net_api_plugin \
  --p2p-listen-endpoint=:9876 \
  --p2p-server-address=:9876 \
  --p2p-peer-address=peer.main.alohaeos.com:9876 \
  --p2p-peer-address=p2p.genereos.io:9876 \
  --p2p-peer-address=mainnet.eosamsterdam.net:9876 \
  --p2p-peer-address=peer1.eosphere.io:9876 \
  --p2p-peer-address=p2p.eossupport.io:9876 \
  --net-threads=8 \
  --sync-fetch-span=5000 \
  --plugin=eosio::producer_plugin \
  --plugin=eosio::producer_api_plugin \
  --snapshots-dir=/var/db/leap/snapshots \
  --plugin=eosio::state_history_plugin \
  --state-history-dir=/var/db/leap/block/state-history \
  --trace-history \
  --chain-state-history \
  --state-history-endpoint=[::]:8080 \
  --trace-history-debug-mode \
  --logconf=/etc/leap/config/logging.json

# free -m
               total        used        free      shared  buff/cache   available
Mem:          515637      467299       28689       17543       19649       24927
Swap:              0           0           0

Does anyone know what could be going wrong?
