fix(clickhouse sink): SQL-standard identifier escaping for database and table names #25591
pront wants to merge 3 commits into
…se and table names
Double-quote characters in rendered database/table names were either unescaped (database) or escaped as `\"` (table), both of which allow breaking out of the quoted identifier context. Replace with `""` per the SQL standard, which keeps any injected content inside the identifier and causes ClickHouse to reject it as a parse error.
…g fix Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ment Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: cbc75ed7aa
    database.replace('\\', "\\\\").replace('"', "\"\""),
    table.replace('\\', "\\\\").replace('"', "\"\""),
Do not switch ClickHouse identifiers to quote doubling
For ClickHouse database or table names that contain a literal `"` (including the injection payload this change is meant to neutralize), replacing quotes with `""` does not follow ClickHouse's documented quoted-identifier escaping rules: identifiers use the same escapes as string literals, where `\"` represents a double quote and `\\` represents a backslash. The generated `INSERT INTO "..."."..."` can therefore still be parsed incorrectly by ClickHouse; keep escaping backslashes, but escape embedded double quotes with a backslash rather than SQL-standard doubling.
Summary
The ClickHouse sink was not escaping the `database` identifier at all, and was using the wrong escape sequence (`\"`) for the `table` identifier. ClickHouse's double-quoted identifiers follow the SQL standard: the only escape is doubling the quote (`""`). Using `\"` is interpreted by ClickHouse as an escaped quote inside the identifier rather than a closing delimiter, which means a crafted table or database name could break out of the identifier and inject arbitrary SQL into the INSERT statement.
This fix applies the correct escaping to both identifiers:
- `\` → `\\` first (so a literal backslash isn't mistaken for an escape prefix)
- `"` → `""` (SQL standard)
Vector configuration
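As a rough illustration of the two-step escaping listed in the Summary, the sketch below mirrors the `replace` chain from the diff. The names `escape_identifier` and `quoted_target` are hypothetical helpers introduced here for illustration and are not the sink's actual functions. Whether ClickHouse accepts the doubled-quote form is the point under discussion in the review thread, so this shows only what the change emits.

```rust
/// Hypothetical helper (not the sink's real function): escape a name for use
/// inside a double-quoted identifier, following the chain shown in the diff.
fn escape_identifier(name: &str) -> String {
    // Double backslashes so a literal backslash cannot act as an escape
    // prefix, then double every embedded double quote.
    name.replace('\\', "\\\\").replace('"', "\"\"")
}

/// Hypothetical helper: build the quoted `"database"."table"` target.
fn quoted_target(database: &str, table: &str) -> String {
    format!(
        "\"{}\".\"{}\"",
        escape_identifier(database),
        escape_identifier(table)
    )
}

fn main() {
    // Table name containing a double quote.
    println!("{}", quoted_target("my_db", "my\"table")); // "my_db"."my""table"
    // Table name containing a backslash followed by a double quote.
    println!("{}", quoted_target("my_db", "a\\\"b")); // "my_db"."a\\""b"
}
```

The two replacements do not interfere with each other: doubling backslashes never introduces a quote, and doubling quotes never introduces a backslash.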
How did you test this PR?
Added a table-driven unit test (`identifier_escaping`) covering:
- `"` in table name (doubled)
- `"` in database name (doubled)
- `\` together with `"` (backslash doubled first, then quote doubled)
Updated existing `encode_valid` assertions to match the corrected escaping.
Change Type
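For readers unfamiliar with the table-driven style mentioned under "How did you test this PR?", a minimal sketch might look like the following. The cases and the stand-in `escape_identifier` function are assumptions made for illustration; the PR's actual `identifier_escaping` test is not reproduced here.

```rust
// Hypothetical stand-in for the sink's escaping logic, copied from the
// `replace` chain visible in the diff.
fn escape_identifier(name: &str) -> String {
    name.replace('\\', "\\\\").replace('"', "\"\"")
}

fn main() {
    // Each row: (label, raw name, expected escaped form). Illustrative only.
    let cases = [
        ("plain name", "logs", "logs"),
        ("embedded quote", "my\"table", "my\"\"table"),
        ("backslash then quote", "a\\\"b", "a\\\\\"\"b"),
    ];

    for (label, raw, expected) in cases {
        assert_eq!(escape_identifier(raw), expected, "case: {}", label);
    }
    println!("all cases passed");
}
```

Keeping the inputs and expectations in one table makes it cheap to add a new payload when another escaping edge case turns up.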
Is this a breaking change?
Does this PR include user facing changes?
… `no-changelog` label to this PR.
References
Notes