Skip to content

graphite-clickhouse doesn't close connection to clickhouse when request was canceled #316

@perederyaev

Description

@perederyaev

Hi!
I see a lot of long-running (up to 1 minute) requests from graphite-clickhouse (v0.13.4) in our clickhouse.

For reproducing the issue I make request to graphite-clickhouse with curl:

curl 'localhost:6000/render/?format=pickle&target= _________&from=1748882404&until=1749757204' -v -m2
*   Trying 127.0.0.1:6000...
* Connected to localhost (127.0.0.1) port 6000 (#0)
> GET /render/?format=pickle&target= _________&from=1748882404&until=1749757204 HTTP/1.1
> Host: localhost:6000
> User-Agent: curl/7.76.1
> Accept: */*
>
* Operation timed out after 2004 milliseconds with 0 bytes received
* Closing connection 0
curl: (28) Operation timed out after 2004 milliseconds with 0 bytes received

curl ends in 2 second and I see "context canceled" in the graphite-clickhouse log:

[2025-06-12T23:21:50.643+0300] INFO [render] query {"request_id": "968928b6d1eefe1cc2ce18b262bfa9a8", "query": "SELECT Path FROM graphite_index WHERE ((Level=4) AND (Path LIKE 'rb.rb%' AND match(Path, '_________'))) AND (Date >='2025-06-02' AND Date <= '2025-06-12') GROUP BY Path FORMAT TabSeparatedRaw", "total_rows_to_read": "7909864", "read_rows": "7909864", "read_bytes": "416941622", "written_rows": "0", "written_bytes": "0", "query_id": "968928b6d1eefe1cc2ce18b262bfa9a8::d9e0535b5257f227", "time": 0.132063994}
[2025-06-12T23:21:50.646+0300] INFO [render] finder {"request_id": "968928b6d1eefe1cc2ce18b262bfa9a8", "metrics": 6972, "find_cached": false}
[2025-06-12T23:21:52.513+0300] ERROR [render] data_parser {"request_id": "968928b6d1eefe1cc2ce18b262bfa9a8", "error": "context canceled", "read_bytes": 0, "runtime": "0s", "runtime_ns": 0}
[2025-06-12T23:21:52.513+0300] INFO [http] access {"request_id": "968928b6d1eefe1cc2ce18b262bfa9a8", "time": 2.002144606, "method": "GET", "url": "/render/?format=pickle&target=rb.rb*.rbmon.*&from=1748882404&until=1749757204", "peer": "[::1]:60226", "client": "", "status": 504, "find_cached": false}
[2025-06-12T23:21:57.802+0300] ERROR [render] reader {"request_id": "968928b6d1eefe1cc2ce18b262bfa9a8", "error": "context canceled"}

But query for graphite_reverse table in clickhouse continues to run up to 1 minute and ends with clickhouse error "Code: 24. DB::Exception: Cannot write to ostream at offset 18170421".

I've done some debugging and found that graphite-clickhouse doesn't close connection to clickhouse when request was canceled (curl finishes running) and waits for data-timeout (in my case, data-timeout = "1m0s") and only then disconnects from clickhouse.

That leads to many orphans queries in clickhouse and spare load on it.

Could you please fix the issue or provide any workaround?

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions