fix(csharp): fix SEA protocol parity issues for metadata, errors, and catalog handling#338
msrathore-db merged 10 commits into main
Conversation
… catalog handling

Fixes multiple SEA driver issues found during E2E test parity testing with the Thrift protocol:

- Map HTTP 401/403 to AdbcStatusCode.Unauthorized in the SEA client
- Wrap non-ADBC exceptions in DatabricksException during OpenAsync
- Handle the SPARK catalog alias (SPARK → null) at connection setup, matching Thrift's HandleSparkCatalog behavior
- Fix EffectiveCatalog to pass null through (all catalogs) when SPARK is explicitly requested, instead of falling back to the connection default
- Fix SHOW SCHEMAS IN ALL CATALOGS column order: (databaseName, catalog), not (catalog, databaseName) as previously assumed
- Remove the catalog filter from GetCatalogs (Thrift has no filter either)
- Lowercase GetObjects patterns for case-insensitive matching
- Fix Scale/Precision null-setting: null Scale for CHAR/VARCHAR types (only DECIMAL/NUMERIC should retain Scale)
- Add SetOption support for TraceParent on both connection and statement
- Default ExecuteUpdate AffectedRows to -1 when the manifest has no count
- Set _metadataCatalogName from the connection default in the statement constructor

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
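The 401/403 mapping listed above can be sketched as follows. The enum here is a minimal stand-in for the AdbcStatusCode values in Apache.Arrow.Adbc, and MapStatus is a hypothetical helper, not the driver's actual EnsureSuccessStatusCodeAsync:

```csharp
using System.Net;

// Minimal stand-in for the ADBC status enum; the real one lives in Apache.Arrow.Adbc.
enum AdbcStatusCode { UnknownError, Unauthorized, NotFound, InvalidArgument }

static class SeaErrorMapping
{
    // Hypothetical helper: maps SEA HTTP responses to ADBC status codes
    // the way the fix above describes (401/403 → Unauthorized, etc.).
    public static AdbcStatusCode MapStatus(HttpStatusCode status) => status switch
    {
        HttpStatusCode.Unauthorized => AdbcStatusCode.Unauthorized,    // 401
        HttpStatusCode.Forbidden    => AdbcStatusCode.Unauthorized,    // 403
        HttpStatusCode.NotFound     => AdbcStatusCode.NotFound,        // 404
        HttpStatusCode.BadRequest   => AdbcStatusCode.InvalidArgument, // 400
        _ => AdbcStatusCode.UnknownError,
    };
}
```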
```csharp
    && typeCode != (short)HiveServer2Connection.ColumnTypeId.NVARCHAR
    && typeCode != (short)HiveServer2Connection.ColumnTypeId.LONGVARCHAR
    && typeCode != (short)HiveServer2Connection.ColumnTypeId.LONGNVARCHAR)
bool isDecimalOrNumeric = typeCode == (short)HiveServer2Connection.ColumnTypeId.DECIMAL
```
Why do we need to set these to null after PopulateTableInfoFromTypeName has already processed them?
Can we just set everything correctly in PopulateTableInfoFromTypeName?
Valid suggestion. PopulateTableInfoFromTypeName sets non-null values for ALL types (e.g., Scale=0 for BOOLEAN, Scale=7 for FLOAT). But Thrift's SparkConnection.SetPrecisionScaleAndTypeName only sets Precision/Scale for DECIMAL/NUMERIC/CHAR/VARCHAR types — everything else gets null.
The post-processing null-setting was the original design to match Thrift parity. The cleaner fix would be to update PopulateTableInfoFromTypeName itself, but it's in ColumnMetadataHelper which is also used by FlatColumnsResultBuilder (GetColumns path) where the values ARE needed for all types. So the nulling has to happen at the GetObjects call site, not inside the shared helper.
We could extract this into a separate helper method called after PopulateTableInfoFromTypeName for clarity, but moving the logic inside would change behavior for the GetColumns path.
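As a minimal sketch of the call-site nulling under discussion: the type IDs below are illustrative JDBC-style values and TableInfo is a stand-in, not the driver's actual types, and later commits in this thread revise this behavior:

```csharp
// Illustrative JDBC-style type codes; the driver's HiveServer2Connection.ColumnTypeId differs.
enum ColumnTypeId : short { CHAR = 1, NUMERIC = 2, DECIMAL = 3, VARCHAR = 12, BOOLEAN = 16 }

// Stand-in for the metadata row being post-processed.
class TableInfo { public int? Precision; public int? Scale; }

static class ScaleParity
{
    // After the shared helper fills Precision/Scale for every type, null them
    // out for types where Thrift parity expects no value: only DECIMAL/NUMERIC
    // keep Scale; only DECIMAL/NUMERIC/CHAR/VARCHAR keep Precision.
    public static void ApplyThriftParity(TableInfo info, ColumnTypeId type)
    {
        bool isDecimalOrNumeric = type == ColumnTypeId.DECIMAL || type == ColumnTypeId.NUMERIC;
        bool isCharLike = type == ColumnTypeId.CHAR || type == ColumnTypeId.VARCHAR;
        if (!isDecimalOrNumeric) info.Scale = null;
        if (!isDecimalOrNumeric && !isCharLike) info.Precision = null;
    }
}
```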
I still want to revisit this.
I think we should fix this function to only return Scale for decimal types. Currently it seems FlatColumnsResultBuilder returns Scale for non-decimal fields, which seems wrong?
The columnSize already seems to be handled correctly.
There isn't any usage of the GetObjects call right now, so I would like to start correct. The current Thrift GetObjects has no Databricks customization, so it's a very basic one; I would not mind diverging from that.
Fixed in 83a9750. Changed GetDecimalDigitsDefault to return null for all non-DECIMAL/NUMERIC types (was returning 0/7/15/6). Now PopulateTableInfoFromTypeName sets Scale to null from the start for non-DECIMAL types, so the post-processing only needs to null out Precision for non-DECIMAL/CHAR types. This also fixes the FlatColumnsResultBuilder (GetColumns) path which was incorrectly returning Scale values for non-DECIMAL columns.
Further update in c9847c0 — removed the post-processing entirely. GetColumnSizeDefault and GetDecimalDigitsDefault values are now kept for all types in GetObjects, matching GetColumns/GetColumnsExtended. This provides consistent Precision/Scale values across all metadata APIs rather than nulling them for GetObjects only.
```csharp
/// resolves to the session default catalog (SEA SQL requires an explicit
/// catalog for SHOW commands when not querying all catalogs).
/// </summary>
private string? EffectiveCatalog
```
I remember you already have a similar function to this?
Yes — ResolveEffectiveCatalog on the connection (line 793) still exists and is used by GetTableSchema. This new EffectiveCatalog property is on the statement and replaced the old one-liner _connection.ResolveEffectiveCatalog(_metadataCatalogName).
The reason we can't reuse ResolveEffectiveCatalog is that it does normalized ?? _catalog (always falls back to connection default). That's wrong when the user explicitly SetOption(CatalogName, "SPARK") — that should mean "all catalogs" (null), not fall back to default.
The statement's EffectiveCatalog has different semantics:
- enableMultipleCatalogSupport=true: passes null through (all catalogs) when SPARK is set
- enableMultipleCatalogSupport=false: resolves null to the session default via GetSessionDefaultCatalog()
The connection's ResolveEffectiveCatalog is still correct for GetTableSchema where you always need a concrete catalog (you can't do SHOW COLUMNS IN ALL CATALOGS).
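The two modes above can be sketched as a pure function. Names and the delegate parameter are illustrative; the real property reads the statement's fields directly:

```csharp
using System;

static class CatalogResolution
{
    // Hypothetical sketch of the statement-level EffectiveCatalog semantics.
    public static string? EffectiveCatalog(
        string? requestedCatalog,              // e.g. from SetOption(CatalogName, ...)
        bool enableMultipleCatalogSupport,
        Func<string> getSessionDefaultCatalog) // issues SELECT CURRENT_CATALOG()
    {
        // The legacy "SPARK" alias always normalizes to null ("all catalogs").
        string? normalized = requestedCatalog == "SPARK" ? null : requestedCatalog;

        if (enableMultipleCatalogSupport)
            return normalized; // null passes through: query all catalogs

        // Single-catalog mode: SHOW commands need a concrete catalog,
        // so resolve null to the session default.
        return normalized ?? getSessionDefaultCatalog();
    }
}
```

This differs from the connection's ResolveEffectiveCatalog precisely in the multi-catalog branch, where null is returned instead of falling back to the default.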
Is it just SHOW COLUMNS IN ALL CATALOGS not permitted?
What about SHOW TABLES IN ALL CATALOGS which is called from the statement level?
We support SHOW TABLES IN ALL CATALOGS. For SHOW COLUMNS, the PR is merged in the runtime but not yet released.
OK so when that is released we can change the connection level ResolveEffectiveCatalog to use this one? Let's add a comment if so.
Good point. Added a TODO in f7694ed. Once the backend supports SHOW COLUMNS IN ALL CATALOGS, we can consolidate. Right now the connection's ResolveEffectiveCatalog is only used by GetTableSchema (which always needs a concrete catalog), so it still needs the fallback. The PopulateColumnInfoAsync path now uses ExecuteShowColumnsAsync which handles null catalog via the iterate-all-catalogs fallback.
eric-wang-1990 left a comment
Review focused on Thrift/SEA parity.
Observation: GetSessionDefaultCatalog() issues a SQL round-trip on every metadata call. GetSessionDefaultCatalog() calls GetCurrentCatalog(), which executes SELECT CURRENT_CATALOG() every time. Now that it is explicitly wired from the statement layer (EffectiveCatalog), it fires on every GetObjects/GetTableSchema call when enableMultipleCatalogSupport=false. Not a regression introduced here, but worth a follow-up to lazily cache the value (reset on SetOption).
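The suggested follow-up could look roughly like this; the class and delegate are illustrative stand-ins, not the driver's actual connection type:

```csharp
using System;

// Sketch of lazily caching the SELECT CURRENT_CATALOG() round-trip,
// with an explicit reset hook for SetOption.
class SessionCatalogCache
{
    private readonly Func<string> _queryCurrentCatalog; // issues the SQL round-trip
    private string? _cached;

    public SessionCatalogCache(Func<string> queryCurrentCatalog) =>
        _queryCurrentCatalog = queryCurrentCatalog;

    // First call hits the server; subsequent calls reuse the cached value.
    public string GetSessionDefaultCatalog() => _cached ??= _queryCurrentCatalog();

    // Call from SetOption when a catalog-related option changes.
    public void Invalidate() => _cached = null;
}
```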
…rrentCatalog Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
GetDecimalDigitsDefault now only returns a value for DECIMAL/NUMERIC. All other types return null, so Scale is correctly null from the start in PopulateTableInfoFromTypeName. This removes the need for post-processing Scale null-setting in PopulateColumnInfoAsync. Simplified the post-processing to only null out Precision for types that don't need it (non-DECIMAL/CHAR). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
83a9750 to da40538
SHOW COLUMNS IN ALL CATALOGS is not yet supported by the backend. When catalog is null, iterate over all catalogs via SHOW CATALOGS and call SHOW COLUMNS IN `catalog` for each, merging results. Catalogs that return permission errors are silently skipped. Both GetColumnsAsync (statement) and PopulateColumnInfoAsync (connection/GetObjects) use the shared ExecuteShowColumnsAsync helper. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
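The iterate-and-merge fallback described in this commit can be sketched as below. The delegate parameters stand in for real SQL execution and all names are illustrative:

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

static class ShowColumnsFallback
{
    public static async Task<List<string[]>> ExecuteShowColumnsAsync(
        string? catalog,
        Func<Task<List<string>>> showCatalogs,            // SHOW CATALOGS
        Func<string, Task<List<string[]>>> showColumnsIn) // SHOW COLUMNS IN `catalog`
    {
        // A concrete catalog can be queried directly.
        if (catalog != null)
            return await showColumnsIn(catalog);

        // Null catalog = all catalogs: the backend does not support
        // SHOW COLUMNS IN ALL CATALOGS yet, so iterate and merge.
        var merged = new List<string[]>();
        foreach (string c in await showCatalogs())
        {
            try { merged.AddRange(await showColumnsIn(c)); }
            catch (Exception) { /* permission errors: silently skip this catalog */ }
        }
        return merged;
    }
}
```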
…umnInfoAsync Keep GetColumnSizeDefault and GetDecimalDigitsDefault values for all types in GetObjects, matching GetColumns/GetColumnsExtended behavior. This diverges from Thrift's basic GetObjects (which nulls Precision/Scale for non-DECIMAL/CHAR types) but provides more useful metadata and consistent values across all metadata APIs. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…Catalog Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
… in GetTableSchema GetTableSchema now uses ExecuteShowColumnsAsync with HandleSparkCatalog, matching Thrift behavior which passes catalog as-is to the server (null = all catalogs, SPARK = null). Removed ResolveEffectiveCatalog since it's no longer needed — all catalog resolution now uses either: - Statement's EffectiveCatalog (for metadata commands) - HandleSparkCatalog + ExecuteShowColumnsAsync (for GetTableSchema) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Reverts the ?? -1 change back to ?? 0 for both ExecuteQuery and ExecuteUpdate. SEA API returns 0 for DDL, not -1. Keeping original SEA behavior. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Summary
Fixes multiple SEA (Statement Execution API) driver issues discovered during E2E test parity testing between Thrift and REST protocols.
Error Handling
- Map HTTP 401/403 → AdbcStatusCode.Unauthorized, 404 → NotFound, 400 → InvalidArgument in StatementExecutionClient.EnsureSuccessStatusCodeAsync
- Wrap non-ADBC exceptions in DatabricksException during StatementExecutionConnection.OpenAsync

Catalog Handling
- HandleSparkCatalog (SPARK → null) at SEA connection setup, matching Thrift behavior
- When enableMultipleCatalogSupport=true, pass null through to mean "all catalogs" instead of falling back to the connection default. Verified via comparator across all 24 catalog/flag scenarios
- Removed the catalog filter from GetCatalogsAsync (the Thrift GetCatalogs RPC has no catalog filter)
- SHOW SCHEMAS IN ALL CATALOGS column order is (databaseName, catalog), not (catalog, databaseName); fixed in both StatementExecutionStatement and StatementExecutionConnection

Metadata Fixes
- Lowercase GetObjects patterns for case-insensitive matching (matches DatabricksConnection.GetObjectsPatternsRequireLowerCase = true)
- Set _metadataCatalogName from the connection default in the constructor, matching Thrift's DatabricksStatement constructor pattern

Statement Options
- Support AdbcOptions.Telemetry.TraceParent in both StatementExecutionStatement.SetOption and StatementExecutionConnection.SetOption
- Default the ExecuteUpdate row count to -1 (unknown) when the manifest has no count, matching Thrift DDL behavior

Test Plan
🤖 Generated with Claude Code