Skip to content

ExecuteNonQueryAsync silently treats HTTP 200 with mid-stream exception as success #333

@claude

Description

@claude

Description

When the ClickHouse server returns HTTP 200 OK while reporting an error (mid-stream exception, distributed-DDL error blocks, etc.), ClickHouseClient.ExecuteNonQueryAsync silently treats the response as success. Two factors combine to produce this:

  1. ClickHouseClient.HandleError (ClickHouse.Driver/ClickHouseClient.cs:889-901) decides success/failure purely from response.IsSuccessStatusCode. It never inspects X-ClickHouse-Exception-Code (or X-ClickHouse-Exception-Tag) on the response. Any 2xx response — even one carrying an error indicator in the headers — is treated as a success and returned to the caller.

  2. ExecuteNonQueryAsync (ClickHouse.Driver/ClickHouseClient.cs:208-218) does not run the body through the existing mid-stream exception detection. It opens the raw response stream and reads a single 7-bit-encoded int as the affected-rows count:

    var response = await PostSqlQueryAsync(sql, parameters, options, cancellationToken).ConfigureAwait(false);
    using var reader = new ExtendedBinaryReader(await response.HttpResponseMessage.Content.ReadAsStreamAsync(cancellationToken).ConfigureAwait(false));
    
    return reader.PeekChar() != -1 ? reader.Read7BitEncodedInt() : 0;

    The ExceptionTagAwareStream wrapper that detects __exception__ markers when X-ClickHouse-Exception-Tag is present (added in Add support for mid-stream exception detection via X-ClickHouse-Exception-Tag header #168) is only wired up inside ClickHouseDataReader.FromHttpResponseAsync. ExecuteNonQueryAsync bypasses it entirely.

Net effect: a DDL/INSERT that the server reports as failed via HTTP 200 + exception headers + body-encoded error is observable by the application as success. This risks silent schema-change failure or silent data loss.

The ClickHouseDataReader path is partially protected because it wraps the stream in ExceptionTagAwareStream when X-ClickHouse-Exception-Tag is present, but it still does not consult X-ClickHouse-Exception-Code and will not detect cases where the server has encoded the error inside a valid Native/RowBinary block (no __exception__ marker), e.g. the distributed-DDL scenario from clickhouse-go#1398.

ClickHouse server version

26.4.2.10 (local server). Verified against this server that HTTP 200 responses with X-ClickHouse-Exception-Tag (and an error body) are produced when streaming has begun before the error is detected — for example:

curl -sv 'http://localhost:8123/?query=SELECT+throwIf(number=100000,+%27boom%27)+FROM+system.numbers+LIMIT+100000000+FORMAT+RowBinary&http_response_buffer_size=0&max_block_size=10000'
< HTTP/1.1 200 OK
< Transfer-Encoding: chunked
< X-ClickHouse-Exception-Tag: pdqzvcyuuqsybmvs

Note that for this mid-stream case the server emits X-ClickHouse-Exception-Tag but not X-ClickHouse-Exception-Code; for fully buffered failures it emits both alongside HTTP 500. The cross-shard distributed-DDL case (clickhouse-go#1398) produces an HTTP 200 response whose body encodes the error inside a Native block, with no __exception__ marker — this is not reproducible against a single-node server.

Reproduction

The HTTP-status-only check is straightforward to demonstrate without a server using a stub that mirrors what the server emits during a mid-stream exception. Add the following to ClickHouse.Driver.Tests (alongside MidStreamExceptionTests):

using System.Net;
using System.Net.Http;
using System.Threading.Tasks;
using ClickHouse.Driver;
using NUnit.Framework;

namespace ClickHouse.Driver.Tests.ADO;

public class ExecuteNonQuerySilentExceptionTests
{
    private sealed class StubHandler : HttpMessageHandler
    {
        protected override Task<HttpResponseMessage> SendAsync(HttpRequestMessage request, System.Threading.CancellationToken ct)
        {
            // Mimic the server's mid-stream exception: HTTP 200 + exception headers + body.
            var resp = new HttpResponseMessage(HttpStatusCode.OK)
            {
                Content = new ByteArrayContent(System.Text.Encoding.UTF8.GetBytes(
                    "Code: 44. DB::Exception: Sorting key contains nullable columns ... (ILLEGAL_COLUMN)"))
            };
            resp.Headers.TryAddWithoutValidation("X-ClickHouse-Exception-Code", "44");
            resp.Headers.TryAddWithoutValidation("X-ClickHouse-Exception-Tag", "abc");
            return Task.FromResult(resp);
        }
    }

    [Test]
    public async Task ExecuteNonQueryAsync_WhenHttp200ButExceptionHeaderSet_ShouldThrow()
    {
        var settings = new ClickHouseClientSettings("Host=localhost")
        {
            HttpClient = new HttpClient(new StubHandler()),
        };
        using var client = new ClickHouseClient(settings);

        Assert.ThrowsAsync<ClickHouseServerException>(
            async () => await client.ExecuteNonQueryAsync("CREATE TABLE t ON CLUSTER c (...) ENGINE = ReplicatedMergeTree PRIMARY KEY (a)"));
    }
}

Expected: ExecuteNonQueryAsync throws ClickHouseServerException (the server reported an exception via header).
Actual: ExecuteNonQueryAsync returns 0 and no exception is raised.

The same shape applies to the streaming reader path for the distributed-DDL case where the body is a valid Native block whose rows encode the error text — ClickHouseDataReader decodes the rows without seeing an __exception__ marker, and X-ClickHouse-Exception-Code is not consulted.

Suggested fix

  1. In ClickHouseClient.HandleError, after the IsSuccessStatusCode short-circuit, also inspect the response for X-ClickHouse-Exception-Code (or X-ClickHouse-Exception-Tag from Add support for mid-stream exception detection via X-ClickHouse-Exception-Tag header #168). If either is present with a non-empty value, treat the response as an error regardless of status code: read the body and throw a ClickHouseServerException with the parsed code/message. This single change covers ExecuteNonQueryAsync, ExecuteScalarAsync, ExecuteReaderAsync, ExecuteRawResultAsync, PostStreamAsync, PingAsync, and the insert/bulk paths because they all route through HandleError.
  2. Optionally, default wait_end_of_query=1 for the Exec/Ping paths (where streaming the response is not needed) so the server emits a proper non-2xx status when an exception is detected before the body is flushed.
  3. Add the unit test above to ClickHouse.Driver.Tests/ADO/MidStreamExceptionTests.cs (it does not require a running server).

Link

Related upstream issue: ClickHouse/clickhouse-rs#255
Related sibling-client manifestation: ClickHouse/clickhouse-go#1398

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions