feat(clickhouse): allow dynamic schema #32610
base: master
Changes from 1 commit
d48fde8
5c5428f
b37ed36
611e687
b88d8ee
72d8e6e
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -20,6 +20,7 @@ | |
import re | ||
from datetime import datetime | ||
from typing import Any, cast, TYPE_CHECKING | ||
from urllib import parse | ||
|
||
from flask import current_app | ||
from flask_babel import gettext as __ | ||
|
@@ -267,6 +268,8 @@ class ClickHouseConnectEngineSpec(BasicParametersMixin, ClickHouseEngineSpec): | |
parameters_schema = ClickHouseParametersSchema() | ||
encryption_parameters = {"secure": "true"} | ||
|
||
supports_dynamic_schema = False | ||
|
||
@classmethod | ||
def get_dbapi_exception_mapping(cls) -> dict[type[Exception], type[Exception]]: | ||
return {} | ||
|
@@ -414,3 +417,15 @@ def _mutate_label(label: str) -> str: | |
:return: Conditionally mutated label | ||
""" | ||
return f"{label}_{md5_sha_from_str(label)[:6]}" | ||
|
||
@classmethod | ||
def adjust_engine_params( | ||
cls, | ||
uri: URL, | ||
connect_args: dict[str, Any], | ||
catalog: str | None = None, | ||
schema: str | None = None, | ||
) -> tuple[URL, dict[str, Any]]: | ||
if schema: | ||
uri = uri.set(database=parse.quote(schema, safe="")) | ||
Comment on lines +429 to +430
Over-aggressive URL encoding of schema names
What is the issue?
The schema is being URL encoded without preserving any safe characters, which could lead to compatibility issues with certain schema names containing valid URL characters.
Why this matters
Encoding all characters in the schema name could make certain valid schema names unusable and prevent connections to databases with schema names containing standard URL-safe characters.
Suggested change ∙ Feature Preview
Modify the code to preserve standard URL-safe characters:
    if schema:
        uri = uri.set(database=parse.quote(schema, safe="/_-."))
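To make the difference between the two `safe` arguments concrete, here is a small standard-library sketch. The helper `encode_schema` and the sample schema names are hypothetical and exist only for illustration. Note that `urllib.parse.quote` never escapes letters, digits, or the characters `_.-~`, so in practice the only character whose treatment changes between `safe=""` and `safe="/_-."` is `/`.

```python
from urllib import parse


def encode_schema(schema: str, safe: str = "") -> str:
    """Percent-encode a schema name; safe="" mirrors the call in the diff."""
    return parse.quote(schema, safe=safe)


# Hypothetical schema names, chosen only to exercise different characters.
samples = ["analytics", "my_db-1.0", "team/reports", "sales data"]

for name in samples:
    strict = encode_schema(name)                # as in the PR: safe=""
    relaxed = encode_schema(name, safe="/_-.")  # as in the suggested change
    print(f"{name!r}: safe='' -> {strict!r}, safe='/_-.' -> {relaxed!r}")
```

With `safe=""`, a slash in the name becomes `%2F`, so the schema always stays within a single path segment of the URI; with `safe="/_-."`, the slash is passed through unescaped.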
||
return uri, connect_args |
Unvalidated schema name in database connection
What is the issue?
The schema parameter is directly used in a database connection URI with only URL encoding but no input validation.
Why this matters
Without proper validation, malicious schema names could potentially be used for SQL injection or path traversal attacks depending on how ClickHouse handles database names.
Suggested change ∙ Feature Preview
Add input validation before using the schema parameter:
Valid concern, but looking at other engine_specs like Hive, the schema value is not validated there either. Should we follow the norm, or add the validation for ClickHouse only?
It's true that other engine_specs like Hive might not verify their schema values, but I believe it would still be good practice to include the validation for ClickHouse in this case. It could help prevent potential SQL injection or path traversal attacks in the future. Even better, we could potentially update all engine_specs to include this validation. I agree with your concern though - should we implement this only for ClickHouse or consider it for all engine_specs?
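For reference, one possible shape for the validation discussed in this thread is sketched below. It is only an illustration under stated assumptions: the function name `validate_schema_name`, the allowlist pattern, and the choice of `ValueError` are all hypothetical and are not part of Superset or of this PR. The pattern is deliberately strict and would reject names that ClickHouse may accept as quoted identifiers.

```python
import re

# Assumption: only plain identifier-style names are accepted
# (a letter or underscore, followed by letters, digits, or underscores).
_SCHEMA_NAME_RE = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def validate_schema_name(schema: str) -> str:
    """Return the schema unchanged if it matches the allowlist, else raise."""
    if not _SCHEMA_NAME_RE.fullmatch(schema):
        raise ValueError(f"Invalid schema name: {schema!r}")
    return schema
```

A check like this could run before the schema is written into the URI, either in the ClickHouse spec alone or in a shared helper if the same rule were adopted for other engine specs.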