Description
Describe the bug
Hello,
We receive an ECONNRESET error about 1-2 times daily when ingesting data into our ClickHouse Cloud cluster. This error crashes our Node.js app because it's thrown asynchronously (it surfaces in our unhandledRejection root error handler rather than at the call site). This has been occurring non-stop for over a year.
Stack trace:
Error: read ECONNRESET
File "node:internal/stream_base_commons", line 217, col 20, in TLSWrap.onStreamRead
File "node:internal/async_hooks", line 128, col 17, in TLSWrap.callbackTrampoline
Example request payload:
http.method: POST,
http.query: query_id=a448c0c3-badb-4cfd-9719-8545fdf9d880&async_insert=1&async_insert_deduplicate=1&async_insert_busy_timeout_ms=200&parallel_view_processing=1&query=INSERT+INTO+datapoint+FORMAT+JSONEachRow
I tried to look up that query id, but did not find it in the system, so I presume the request fails before/while being processed by clickhouse servers.
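For readability, the query string above can be decoded with Node's built-in URLSearchParams. This is only a small sketch for inspecting the settings that were sent; `decodeRequestQuery` is a hypothetical helper name, not part of the client.

```javascript
// Query string copied from the failing request above.
const rawQuery =
  "query_id=a448c0c3-badb-4cfd-9719-8545fdf9d880" +
  "&async_insert=1&async_insert_deduplicate=1" +
  "&async_insert_busy_timeout_ms=200&parallel_view_processing=1" +
  "&query=INSERT+INTO+datapoint+FORMAT+JSONEachRow";

// Hypothetical helper: turns a URL query string into a plain object.
// URLSearchParams decodes "+" as a space and percent-escapes as characters.
function decodeRequestQuery(query) {
  return Object.fromEntries(new URLSearchParams(query));
}

const decoded = decodeRequestQuery(rawQuery);
console.log(decoded);
```

All values come back as strings, so numeric settings would need an explicit conversion before comparison.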
Our ingestion implementation is very simple:
try {
  await client.insert({
    table: "datapoint",
    format: "JSONEachRow",
    values: [... array of values ...],
    clickhouse_settings: {
      async_insert: 1,
      async_insert_deduplicate: 1,
      async_insert_busy_timeout_ms: 1,
      parallel_view_processing: 1,
    },
  });
} catch (error) {
  // why is the error not caught here?
  ... handle error ...
}
Notably, the requests take a very long time (26-27 s) before failing.
ClickHouse support recommended that I open an issue here, since something is clearly wrong with the error handling in the ClickHouse JS client. They also had this to say:
The request did reach the ClickHouse server and completed in ~738ms.
The query type was INSERT INTO datapoint FORMAT JSONEachRow, and it used async insert.
However, it shows that 0 rows were written.
No exceptions were logged (exception_code: 0), which means from ClickHouse's point of view, this was a valid and successful (but empty) insert.
This might explain the behavior you're seeing, especially the long duration (26-27 seconds) before the ECONNRESET. It seems likely that the actual HTTP body (with the JSONEachRow payload) wasn't successfully streamed to ClickHouse.
We are investigating HTTP agent settings to reduce the socket errors, but regardless, these errors shouldn't bubble up as unhandled rejections.
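Until the client handles this, a process-level guard along the following lines could keep a stray socket reset from taking the app down. This is a sketch of a stopgap, not a fix: `isSocketReset` and `handleUnhandledRejection` are hypothetical names, and it assumes the rejection reason is a Node.js system error carrying `code: "ECONNRESET"`, as in the stack trace above. Note that ignoring the rejection also means the affected insert may be lost silently.

```javascript
// Hypothetical helper: true for errors that look like a socket reset.
function isSocketReset(error) {
  return Boolean(error) && typeof error === "object" && error.code === "ECONNRESET";
}

// Hypothetical root handler: log socket resets and keep running,
// treat every other unhandled rejection as fatal.
// `log` and `exit` are injectable so the logic can be exercised in isolation.
function handleUnhandledRejection(
  reason,
  log = console.error,
  exit = (code) => process.exit(code),
) {
  if (isSocketReset(reason)) {
    log("ignored socket reset raised outside of try/catch:", reason.message);
    return "ignored";
  }
  log("fatal unhandled rejection:", reason);
  exit(1);
  return "fatal";
}

process.on("unhandledRejection", (reason) => handleUnhandledRejection(reason));
```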
Related: 386
Expected behaviour
Socket/ECONNRESET errors should be caught by the try/catch block.
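If the error did reach the call site, the caller could decide what to do with it, for example retry. The following is a minimal sketch of that idea, assuming the rejected error exposes `code: "ECONNRESET"`; `withResetRetry` and its `retries` and `delayMs` options are hypothetical and not part of the client.

```javascript
// Hypothetical helper: runs an async operation and retries it only when
// it fails with ECONNRESET. Any other error is rethrown immediately.
async function withResetRetry(operation, { retries = 2, delayMs = 100 } = {}) {
  for (let attempt = 0; ; attempt++) {
    try {
      return await operation();
    } catch (error) {
      const retryable = Boolean(error) && error.code === "ECONNRESET";
      // Give up on non-reset errors or once the retry budget is spent.
      if (!retryable || attempt >= retries) throw error;
      await new Promise((resolve) => setTimeout(resolve, delayMs));
    }
  }
}

// Intended usage (requires a configured client, so shown as a comment):
// await withResetRetry(() =>
//   client.insert({ table: "datapoint", format: "JSONEachRow", values }),
// );
```

Whether a retried insert ends up deduplicated on the server depends on the insert settings in use, so that would need to be verified before relying on retries.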
Configuration
Environment
- Client version: 1.11.0
- Language version: nodejs 18.19.0
- OS: Debian 12.10
ClickHouse server
- ClickHouse Server version: 24.10 (clickhouse cloud production service)