Add support for case sensitive identifiers#28331
Add support for case sensitive identifiers#28331prrvchr wants to merge 1 commit intotrinodb:masterfrom
Conversation
Praveen2112
left a comment
There was a problem hiding this comment.
Thanks for working on this. I don't think removing the conversion to lower case alone would fix this issue. We need to introduce a contract between the engine and the connector - as on what would be its stand on case sensitive identifiers and convert them accordingly.
core/trino-main/src/main/java/io/trino/sql/analyzer/StatementAnalyzer.java
Show resolved
Hide resolved
If we want to be able to handle identifiers in mixed cases, then Trino must become strict regarding character case matching on identifier. And trying to offer more than that will only cause problems. No further conversions are performed by Trino. The question is: Which connectors do not support mixed cases in their identifiers? |
There was a problem hiding this comment.
Does Druid supportsMixedCaseIdentifiers?
|
This is backward incompatible change hence for sure won't be accepted. |
|
@wendigo How do you think this will be backward compatible? Trino doesn't differentiate between identifiers: And if it's not going to be published for that reason, then you'll be stuck with issue 17 for a long time. However, I'd like to know what the decision will be, because in that case, I'll fork and maintain only my connectors. |
Trino doesn't have to be strict regarding character case matching on identifier, it just needs to respect the underlying connector's decision. Connector should be able to help on canonicalize the object name accordingly. Lets say for hive connector, it supports all object names is lower case so even it the user tries to run the query like
A good example is hive, which doesn't support mixed case identifiers.
Create views based on JDBC tables and storing them in Hive - is a pretty common use case. Enabling it directly to all the JDBC connector will make the views unusable. |
| if (isDelimited()) { | ||
| return value; | ||
| } | ||
|
|
||
| return value.toUpperCase(ENGLISH); |
There was a problem hiding this comment.
This is essential. As per SQL Spec - only delimited identifiers should be case sensitive i.e
colA == COLA == COlA == cola (or upper case would be default mode)
"colA" == "colA" - i.e case sensitivity is considered only for delimited identifiers.
| throw semanticException(COLUMN_TYPE_UNKNOWN, element, "Unknown type '%s' for column '%s'", column.getType(), name); | ||
| } | ||
| if (columns.containsKey(name.getValue().toLowerCase(ENGLISH))) { | ||
| if (columns.containsKey(name.getValue())) { |
There was a problem hiding this comment.
Name should be canonicalized based on the connectors expectation
There was a problem hiding this comment.
If we want to be able to create tables with columns in mixed case, then this is mandatory.
There was a problem hiding this comment.
Yes - but the logic on whether
name.getValue().toLowerCase(ENGLISH)name.getValue().toUpperCase(ENGLISH)name.getValue()
Should be based on whether the Identifier is delimited and connector support case sensitive identifiers.
@Praveen2112 If we want to maintain compatibility, as requested by @wendigo, then the connectors need to know whether an identifier is delimited or not. Currently, this information hasn't been relayed back to the connectors, it seems to me? Okay, I just opened issue#28356 and I'm going to submit a pull request just to fix this problem. Once that's done, I'll come back here. |
Description
This fix allows Trino to support identifiers in mixed cases.
As a result, Trino has become strict regarding character case matching on identifier.
Additional context and related issues
This is supposed to correct issue#17
Release notes
( ) This is not user-visible or is docs only, and no release notes are required.
(x) Release notes are required. Please propose a release note for me.
( ) Release notes are required, with the following suggested text: