Add VARIANT_JSON and VARIANT_BINARY client capabilities#29046
Add VARIANT_JSON and VARIANT_BINARY client capabilities#29046
Conversation
c1da882 to
aba2fc7
Compare
aba2fc7 to
1ccf36e
Compare
|
Please don't remove |
There is no relevant issue. Someone modified the message to say it fixes a bug, but there is no bug since variant is unreleased. |
1ccf36e to
308d8d8
Compare
|
Tested with trino cli 468 |
| * Whether client supports the `VARIANT` type encoded as a binary payload on the wire. | ||
| * This capability is opt-in, so clients continue to receive the JSON representation by default. | ||
| */ | ||
| VARIANT_BINARY(false), |
There was a problem hiding this comment.
Is this supposed to be mentioned in docs/src/main/sphinx/develop/client-protocol.md ?
I'm seeing in io.trino.jdbc.TrinoConnection#startQuery that it is being added without any checks.
There was a problem hiding this comment.
I don't see Client-Capabilities documented anywhere
|
Why do we need a second commit? Seems complex so what's the benefit |
308d8d8 to
af02fdb
Compare
This allows to use the full power of VARIANT in JDBC. VARIANT caries type information that JSON does not, and lowering to JSON removes that. For example, you don't know which data contains timestamps |
31e49b9 to
1a6af27
Compare
1a6af27 to
dd25f24
Compare
|
@findepi I rethought the JSON decision. With the redesign to lower VARIANT to JSON for old drivers, we have automatically define the JSON mapping for VARIANT, so that ship sailed. I added JSON support to the JDBC driver with |
findepi
left a comment
There was a problem hiding this comment.
First commit LGTM. Naming comment only (important but small)
Second commit I didn't review throughly, because I don't agree with high level mechanics of the change.
This PR is a RELEASE-BLOCKER, and if I am reading this correctly, the first commit resolve the release blockade. We should merge the first commit, and move the controversial commit into separate PR.
| * Whether client supports the `VARIANT` type encoded as JSON values on the wire. | ||
| * When this capability is not set, the server returns `json` for `VARIANT` columns. | ||
| */ | ||
| VARIANT_JSON, |
There was a problem hiding this comment.
"varian [as] json" can be understood as "send variant values as json type" (what we do for legacy clients) or "send variant values as json objects" (what we do for new clients)
in terms of client capability names, just VARIANT would be a better name IMO
| * Whether client supports the `VARIANT` type encoded as a binary payload on the wire. | ||
| * This capability is opt-in, so clients continue to receive the JSON representation by default. | ||
| */ | ||
| VARIANT_BINARY(false), |
There was a problem hiding this comment.
Why do we need another client capability?
i thought we need two states
- old clients: no capability being sent, we choose backwards compatible behavior
- new clients: capability being sent, we choose the best possible behavior
now we have four states
- old clients: no capability being sent, we choose backwards compatible behavior
- new clients, sending VARIANT_JSON or VARIANT_BINARY (or both)
| public boolean enabledByDefault() | ||
| { | ||
| return enabledByDefault; | ||
| } | ||
|
|
||
| public static Set<String> defaultClientCapabilities() | ||
| { |
There was a problem hiding this comment.
This requires documentation what does it mean that client capability is enabled by default.
If this is a JDBC thing, the logic should live in trino-jdbc, not here.
Also, I think this is abusing client capabilities concept.
Client capabilities are meant to describe "what client is capable of [handling]".
Here they seem to used for runtime behavior switching.
It would be much better to keep their original meaning. It should be assumed that latest Trino clients have all the client capabilities defined in the protocol.
Description
Add support for
VARIANTin clients. There are three supported scenarios, based on the settings of client capabilities:JSONwith data encoded asJSONVARIANT_JSON: return typeVARIANTwith data encoded asJSONVARIANT_BINARY: return typeVARIANTwith data in standard binary variant format.The cli uses
VARIANT_JSON, and the JDBC driver usesVARIANT_BINARY.The JDBC client uses binary
VARIANTdirectly without lowering to JSON and losing all type information.VARIANTcolumns can be fetched with:rs.getObject(column)orrs.getObject(column, Object.class)to get the corresponding Java object representation. This is the same asvariant.toObjectrs.getObject(column, io.trino.jdbc.Variant.class)to get anio.trino.jdbc.Variantrs.getObject(column, Map.class)Same asgetObject(Column)but only works when the result would be a Map. Throws an exception if the variant is not an object.rs.getObject(column, List.class)Same asgetObject(Column)but only works when the result would be a Array. Throws an exception if the variant is not an array.rs.getString(column)orrs.getObject(column, String.class)to get the corresponding JSON representation. This is the same asvariant.toJson, which is the same as theVARIANT_JSONencoding.When mapping to Java object representation,
VARIANTvalues map to Java objects as follows:nullBooleanByteShortIntegerLongFloatDoubleBigDecimalStringbyte[]LocalDateLocalTimeInstantLocalDateTimeUUIDList<Object>Map<String, Object>The
io.trino.jdbc.Variantclass is a minimal implementation of VARIANT for decoding data. This supports getting the raw underlying byte arrays for the value and metdata is the client wants to interact with the raw data (storing in a separate system directly without reencoding). The SPI Variant class cannot be used because it is compiled with Java 25.Release notes
(x) Release notes are required, with the following suggested text: