fix(clickhouse sink): SQL-standard identifier escaping for database and table names #25591

Open
pront wants to merge 3 commits into master from fix/clickhouse-identifier-escaping

@pront pront commented Jun 8, 2026

Member

Summary

The ClickHouse sink was not escaping the database identifier at all, and was using the wrong escape sequence (\") for the table identifier. ClickHouse's double-quoted identifiers follow the SQL standard: the only escape is doubling the quote (""). Using \" is interpreted by ClickHouse as an escaped quote inside the identifier rather than a closing delimiter, which means a crafted table or database name could break out of the identifier and inject arbitrary SQL into the INSERT statement.

This fix applies the correct escaping to both identifiers:

  1. Escape \ as \\ first (so a literal backslash isn't mistaken for an escape prefix)
  2. Escape " as "" (SQL standard)
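
The two steps above can be sketched as a small helper. This is a minimal illustration rather than the code in the PR: the names `escape_identifier` and `quote_identifier` are hypothetical, and only the chained `replace` calls mirror the lines touched by the diff.

```rust
/// Hypothetical helper (not a function in the Vector codebase) that applies
/// the two replacements described above, in the same order.
fn escape_identifier(raw: &str) -> String {
    raw
        // Step 1: double every backslash.
        .replace('\\', "\\\\")
        // Step 2: double every double quote.
        .replace('"', "\"\"")
}

/// Hypothetical helper that wraps the escaped name in double quotes.
fn quote_identifier(raw: &str) -> String {
    format!("\"{}\"", escape_identifier(raw))
}
```

With this sketch, a name such as `a"b` would be rendered as `"a""b"`, and a plain name passes through with only the surrounding quotes added.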

Vector configuration

sinks:
  my_sink:
    type: clickhouse
    inputs: [...]
    endpoint: http://localhost:8123
    database: my_database
    table: my_table

How did you test this PR?

Added a table-driven unit test (identifier_escaping) covering:

  • Plain identifiers (pass through unchanged)
  • " in table name (doubled)
  • " in database name (doubled)
  • Injection payload in database name (breakout neutralised)
  • Backslash followed by " (backslash doubled first, then quote doubled)

Updated existing encode_valid assertions to match the corrected escaping.
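
For readers who have not opened the diff, the cases listed above could be arranged roughly as follows. This is a sketch under assumptions, not the `identifier_escaping` test from the PR: the `CASES` table, the `failing_cases` function, and the sample inputs (including the injection payload) are made up for illustration.

```rust
/// Hypothetical stand-in for the escaping applied by the sink.
fn escape_identifier(raw: &str) -> String {
    raw.replace('\\', "\\\\").replace('"', "\"\"")
}

/// Illustrative table: (description, raw identifier, expected escaped form).
/// The inputs are invented for this sketch, not taken from the PR.
const CASES: &[(&str, &str, &str)] = &[
    ("plain identifier", "my_table", "my_table"),
    ("quote in name", "my\"table", "my\"\"table"),
    (
        "injection payload",
        "db\"; DROP TABLE t; --",
        "db\"\"; DROP TABLE t; --",
    ),
    ("backslash followed by quote", "a\\\"b", "a\\\\\"\"b"),
];

/// Returns the descriptions of all cases whose output differs from the
/// expected value; an empty result means every case passed.
fn failing_cases() -> Vec<&'static str> {
    let mut failures = Vec::new();
    for (name, raw, expected) in CASES {
        if escape_identifier(raw) != *expected {
            failures.push(*name);
        }
    }
    failures
}
```

Collecting the failing descriptions instead of asserting inside the loop makes it easier to see every broken case in one run.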

Change Type

  • Bug fix
  • New feature
  • Dependencies
  • Non-functional (chore, refactoring, docs)
  • Performance

Is this a breaking change?

  • Yes
  • No

Does this PR include user facing changes?

  • Yes. Please add a changelog fragment based on our guidelines.
  • No. A maintainer will apply the no-changelog label to this PR.

References

Notes

…se and table names

Double-quote characters in rendered database/table names were either
unescaped (database) or escaped as `\"` (table), both of which allow
breaking out of the quoted identifier context. Replace with `""` per
the SQL standard, which keeps any injected content inside the
identifier and causes ClickHouse to reject it as a parse error.
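
To make the commit message concrete, the sketch below renders a simplified statement prefix before and after the change. It assumes the old behaviour was exactly as described in this PR (database unescaped, table quotes rewritten to `\"`), and the function names and the bare `INSERT INTO "db"."table"` shape are illustrative only. It shows the produced strings and makes no claim about how ClickHouse parses them.

```rust
/// Reconstruction of the previous rendering as this PR describes it:
/// database unescaped, table quotes rewritten to \" (assumption).
fn render_before(database: &str, table: &str) -> String {
    format!(
        "INSERT INTO \"{}\".\"{}\"",
        database,
        table.replace('"', "\\\"")
    )
}

/// Rendering with the replacements from this PR applied to both names.
fn render_after(database: &str, table: &str) -> String {
    let esc = |s: &str| s.replace('\\', "\\\\").replace('"', "\"\"");
    format!("INSERT INTO \"{}\".\"{}\"", esc(database), esc(table))
}
```

For a database name of `db" x`, the first function leaves the embedded quote untouched, while the second emits `"db"" x"`.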
@github-actions github-actions Bot added the "domain: sinks" label (Anything related to the Vector's sinks) Jun 8, 2026
pront and others added 2 commits June 8, 2026 09:26
…g fix

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ment

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@pront pront marked this pull request as ready for review June 9, 2026 13:49
@pront pront requested a review from a team as a code owner June 9, 2026 13:49

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: cbc75ed7aa

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +149 to +150
database.replace('\\', "\\\\").replace('"', "\"\""),
table.replace('\\', "\\\\").replace('"', "\"\""),


P1: Do not switch ClickHouse identifiers to quote doubling

For ClickHouse database or table names that contain a literal " (including the injection payload this change is meant to neutralize), replacing quotes with "" does not follow ClickHouse's documented quoted-identifier escaping rules: identifiers use the same escapes as string literals, where \" represents a double quote and \\ represents a backslash. The generated INSERT INTO "..."."..." can therefore still be parsed incorrectly by ClickHouse; keep escaping backslashes but escape embedded double quotes with a backslash rather than SQL-standard doubling.
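
The alternative the reviewer describes could look roughly like this. It is a sketch of the suggestion only: `escape_identifier_backslash` is a hypothetical name, and which of the two escaping styles ClickHouse actually requires is the open question in this thread.

```rust
/// Hypothetical variant following the review suggestion: backslashes are
/// doubled and embedded double quotes are prefixed with a backslash.
fn escape_identifier_backslash(raw: &str) -> String {
    raw
        // Backslashes must be handled first; otherwise the backslash added
        // in front of each quote would itself be doubled.
        .replace('\\', "\\\\")
        .replace('"', "\\\"")
}
```

Here the order of the two replacements matters: swapping them would turn `"` into `\\"` instead of `\"`.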

Useful? React with 👍 / 👎.
