Add support for case sensitive identifiers #28331

Open
prrvchr wants to merge 1 commit into trinodb:master from prrvchr:fix#17

Conversation

@prrvchr
Member

@prrvchr prrvchr commented Feb 17, 2026

Description

This fix allows Trino to support mixed-case identifiers.
As a result, Trino becomes strict about character case when matching identifiers.

Additional context and related issues

This is intended to fix issue #17.

Release notes

( ) This is not user-visible or is docs only, and no release notes are required.
(x) Release notes are required. Please propose a release note for me.
( ) Release notes are required, with the following suggested text:

## Section
* Fix some things. ({issue}`17`)

@cla-bot cla-bot bot added the cla-signed label Feb 17, 2026
@github-actions github-actions bot added the jdbc (Relates to Trino JDBC driver), iceberg (Iceberg connector), hive (Hive connector), druid (Druid connector), and pinot (Pinot connector) labels Feb 17, 2026
@prrvchr prrvchr requested a review from findepi February 17, 2026 10:57
Member

@Praveen2112 Praveen2112 left a comment

Thanks for working on this. I don't think removing the conversion to lower case alone would fix this issue. We need to introduce a contract between the engine and the connector that defines the connector's stance on case-sensitive identifiers, so that identifiers are converted accordingly.

@prrvchr
Member Author

prrvchr commented Feb 17, 2026

I don't think removing the conversion to lower case alone would fix this issue. We need to introduce a contract between the engine and the connector

If we want to handle mixed-case identifiers, then Trino must become strict regarding character case matching on identifier. Trying to offer more than that will only cause problems. With this change, Trino performs no further conversions.

The question is: Which connectors do not support mixed cases in their identifiers?

Member Author

@prrvchr prrvchr Feb 17, 2026

Does Druid supportsMixedCaseIdentifiers?

@wendigo
Contributor

wendigo commented Feb 17, 2026

This is a backward-incompatible change, so it certainly won't be accepted.

@prrvchr
Member Author

prrvchr commented Feb 17, 2026

@wendigo How do you think this will be backward compatible? Trino currently doesn't differentiate between the identifiers column, COLUMN, and CoLuMn. If it is asked to differentiate between them, it cannot remain backward compatible. Perhaps it's time to consider creating a new branch for this?

And if it's not going to be published for that reason, then you'll be stuck with issue #17 for a long time.

However, I'd like to know what the decision will be, because in that case, I'll fork and maintain only my connectors.
Thank you for giving me a precise answer.

@Praveen2112
Member

Trino must become strict regarding character case matching on identifier.

Trino doesn't have to be strict regarding character case matching on identifier; it just needs to respect the underlying connector's decision. The connector should be able to help canonicalize the object name accordingly. For example, the Hive connector supports only lower-case object names, so even if the user runs a query with columnA, ColumnA, or "columnA", all of them should map to columna. For a JDBC-based data source that supports case-sensitive objects, the connector should be able to specify how to map the name accordingly. Trino would pass an identifier together with information on whether it is delimited, and it is the connector's job to resolve it.
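
The contract described above could be sketched roughly as follows. This is only an illustration under stated assumptions: `IdentifierMappingSketch`, `CasePolicy`, and `canonicalize` are hypothetical names that do not exist in Trino's SPI, and the three policies are an assumed simplification of what a connector might declare.

```java
import java.util.Locale;

// Illustrative sketch only: CasePolicy and canonicalize are hypothetical names,
// not part of Trino's SPI.
final class IdentifierMappingSketch
{
    // Assumed set of connector behaviours discussed in this thread.
    enum CasePolicy
    {
        LOWER_CASE_ONLY,           // e.g. Hive as described above: every name maps to lower case
        FOLD_UNDELIMITED_TO_UPPER, // only delimited names keep their case, others fold to upper case
        FOLD_UNDELIMITED_TO_LOWER  // only delimited names keep their case, others fold to lower case
    }

    private IdentifierMappingSketch() {}

    // The engine passes the raw identifier plus the delimited flag;
    // the connector's declared policy decides the resolved name.
    static String canonicalize(String value, boolean delimited, CasePolicy policy)
    {
        switch (policy) {
            case LOWER_CASE_ONLY:
                return value.toLowerCase(Locale.ENGLISH);
            case FOLD_UNDELIMITED_TO_UPPER:
                return delimited ? value : value.toUpperCase(Locale.ENGLISH);
            case FOLD_UNDELIMITED_TO_LOWER:
                return delimited ? value : value.toLowerCase(Locale.ENGLISH);
            default:
                throw new IllegalArgumentException("Unknown policy: " + policy);
        }
    }
}
```

With `LOWER_CASE_ONLY`, the inputs columnA, ColumnA, and the delimited "columnA" all resolve to columna, which matches the Hive example in the comment. With either folding policy, only the delimited form keeps its original case.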

Which connectors do not support mixed cases in their identifiers?

A good example is Hive, which doesn't support mixed-case identifiers.

How do you think this will be backward compatible?

Creating views based on JDBC tables and storing them in Hive is a pretty common use case. Enabling this directly for all JDBC connectors would make those views unusable.

Comment on lines -81 to -85
if (isDelimited()) {
    return value;
}

return value.toUpperCase(ENGLISH);
Member

This is essential. Per the SQL spec, only delimited identifiers are case sensitive, i.e.:
colA == COLA == COlA == cola (all fold to a single case; upper case is the default mode)
"colA" == "colA", i.e. case sensitivity is considered only for delimited identifiers.
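
A minimal sketch of that comparison rule, mirroring the removed lines quoted above. `SqlIdentifierSketch` and its methods are hypothetical names used only for illustration; this is not Trino's `Identifier` class.

```java
import java.util.Locale;

// Hypothetical stand-in for an identifier that remembers whether it was delimited.
final class SqlIdentifierSketch
{
    private final String value;
    private final boolean delimited;

    SqlIdentifierSketch(String value, boolean delimited)
    {
        this.value = value;
        this.delimited = delimited;
    }

    // Delimited identifiers keep their exact case; all others fold to upper case.
    String canonicalValue()
    {
        return delimited ? value : value.toUpperCase(Locale.ENGLISH);
    }

    // Two identifiers match when their canonical forms are equal.
    boolean matches(SqlIdentifierSketch other)
    {
        return canonicalValue().equals(other.canonicalValue());
    }
}
```

Under this sketch, the undelimited colA, COLA, COlA, and cola all match one another, while the delimited "colA" matches only another identifier whose canonical form is exactly colA.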

      throw semanticException(COLUMN_TYPE_UNKNOWN, element, "Unknown type '%s' for column '%s'", column.getType(), name);
  }
- if (columns.containsKey(name.getValue().toLowerCase(ENGLISH))) {
+ if (columns.containsKey(name.getValue())) {
Member

The name should be canonicalized based on the connector's expectations.

Member Author

If we want to be able to create tables with columns in mixed case, then this is mandatory.

Member

Yes, but the choice among

  • name.getValue().toLowerCase(ENGLISH)
  • name.getValue().toUpperCase(ENGLISH)
  • name.getValue()

should be based on whether the identifier is delimited and whether the connector supports case-sensitive identifiers.

Member Author

I agree.

@prrvchr
Member Author

prrvchr commented Feb 18, 2026

Creating views based on JDBC tables and storing them in Hive is a pretty common use case. Enabling this directly for all JDBC connectors would make those views unusable.

@Praveen2112 If we want to maintain compatibility, as requested by @wendigo, then the connectors need to know whether an identifier is delimited or not. Currently, it seems to me that this information is not relayed to the connectors.

Okay, I just opened issue #28356, and I'm going to submit a pull request just to fix this problem. Once that's done, I'll come back here.
