|
| 1 | +--- |
| 2 | +title: 'Unity Catalog Catalog Connector Deployment Guide' |
| 3 | +sidebar_label: 'Deployment Guide' |
| 4 | +description: 'Operating guide for the Unity Catalog catalog connector in production: workspace authentication, table-type filtering, effective-permissions flow, and observability.' |
| 5 | +sidebar_position: 10 |
| 6 | +pagination_prev: null |
| 7 | +pagination_next: null |
| 8 | +tags: |
| 9 | + - catalogs |
| 10 | + - unity-catalog |
| 11 | + - observability |
| 12 | +--- |
| 13 | + |
| 14 | +Production operating guide for the Unity Catalog catalog connector — discovering Databricks Unity Catalog tables and federating them through Spice. |
| 15 | + |
| 16 | +For Databricks-specific operational concerns (SQL Warehouse resilience, metrics, permissions flow as applied to Databricks workspaces), see the [Databricks Deployment Guide](../../data-connectors/databricks/deployment) — the Unity Catalog logic described there applies directly when the catalog connector targets a Databricks workspace. |
| 17 | + |
| 18 | +## Authentication & Secrets |
| 19 | + |
| 20 | +| Parameter | Description | |
| 21 | +| ---------------------- | ------------------------------------------------------------------------------------ | |
| 22 | +| `unity_catalog_token` | Bearer token for the Unity Catalog API. Use `${secrets:...}` from a secret store. | |
| 23 | + |
| 24 | +The catalog URL must match the pattern `https://<host>/api/2.1/unity-catalog/catalogs/<catalog_id>` and is parsed into the endpoint and catalog identifier at startup. Mismatched URLs are rejected as configuration errors. |
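
The URL split described above can be illustrated with a minimal Python sketch. The helper name `parse_catalog_url` and the error type are hypothetical; this follows the documented pattern literally and is not the connector's own parser.

```python
import re
from urllib.parse import urlsplit

# Path portion of the documented pattern:
#   https://<host>/api/2.1/unity-catalog/catalogs/<catalog_id>
_CATALOG_PATH = re.compile(
    r"^/api/2\.1/unity-catalog/catalogs/(?P<catalog_id>[^/]+)/?$"
)


def parse_catalog_url(url: str) -> tuple[str, str]:
    """Split a catalog URL into (endpoint, catalog_id), or raise ValueError.

    Hypothetical helper: it only mirrors the pattern stated in this guide.
    """
    parts = urlsplit(url)
    match = _CATALOG_PATH.match(parts.path)
    if parts.scheme != "https" or not parts.netloc or match is None:
        raise ValueError(f"not a Unity Catalog catalog URL: {url!r}")
    endpoint = f"{parts.scheme}://{parts.netloc}"
    return endpoint, match.group("catalog_id")
```

Checking a URL this way before deployment can catch a mismatched path early, instead of at runtime startup.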
| 25 | + |
| 26 | +The token is optional — when unset, the catalog connector issues unauthenticated requests, suitable for locally-hosted Unity Catalog deployments (OSS UC) with permissive access. For Databricks workspaces, the token is always required. |
| 27 | + |
| 28 | +Secrets must be sourced from a [secret store](../../secret-stores/) in production. Rotate tokens from the UC / Databricks console and update the secret store. |
| 29 | + |
| 30 | +## Resilience Controls |
| 31 | + |
| 32 | +### HTTP Retry Policy |
| 33 | + |
| 34 | +The Unity Catalog client uses the shared `resilient_http` helper with these defaults: |
| 35 | + |
| 36 | +- Maximum retries: **3** |
| 37 | +- Backoff: Fibonacci |
| 38 | +- Retriable conditions: HTTP `408`, `429`, `5xx`, and transient network errors (connect, timeout) |
| 39 | +- Respects `Retry-After`, `retry-after-ms`, `x-retry-after-ms` headers |
| 40 | +- Maximum backoff: 300 seconds |
| 41 | + |
| 42 | +These are not exposed as user-tunable parameters on the Unity Catalog connector itself. |
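
To make the retry defaults concrete, here is a rough Python sketch of a Fibonacci backoff schedule with the documented 300-second cap. The 1-second base delay, the function names, and the choice to cap a server-supplied `Retry-After` value at the same maximum are assumptions for illustration; the `resilient_http` helper may behave differently.

```python
from typing import Optional

MAX_RETRIES = 3           # documented default
MAX_BACKOFF_SECS = 300.0  # documented cap


def is_retriable_status(status: int) -> bool:
    """HTTP statuses the guide lists as retriable: 408, 429, and 5xx."""
    return status in (408, 429) or 500 <= status <= 599


def fibonacci_delay(attempt: int, base_secs: float = 1.0) -> float:
    """Delay before retry `attempt` (1-based): base * Fib(attempt), capped.

    base_secs is an assumed value; the guide does not state the base delay.
    """
    a, b = 1, 1
    for _ in range(attempt - 1):
        a, b = b, a + b
    return min(base_secs * a, MAX_BACKOFF_SECS)


def next_delay(attempt: int, retry_after_secs: Optional[float] = None) -> float:
    """Prefer a server-provided retry hint; otherwise use the schedule."""
    if retry_after_secs is not None:
        # Assumption: server hints are also bounded by the maximum backoff.
        return min(retry_after_secs, MAX_BACKOFF_SECS)
    return fibonacci_delay(attempt)
```

Under these assumptions the three default retries wait 1, 1, and 2 seconds unless the server supplies a retry hint.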
| 43 | + |
| 44 | +### Discovery Concurrency |
| 45 | + |
| 46 | +The connector fans out schema and table enumeration with bounded concurrency to avoid thundering-herd on the UC API: |
| 47 | + |
| 48 | +- Schema refresh: up to **5** concurrent requests (`buffer_unordered(5)`) |
| 49 | +- Permission checks: up to **5** concurrent requests (`buffer_unordered(5)`) |
| 50 | + |
| 51 | +For catalogs with thousands of tables, initial discovery can take minutes while the connector respects these limits. |
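
The bounded fan-out can be sketched in Python with an `asyncio.Semaphore`, as a loose analogue of the `buffer_unordered(5)` limit. The helper name `bounded_map` is hypothetical, and unlike an unordered buffer this version returns results in input order.

```python
import asyncio


async def bounded_map(func, items, limit=5):
    """Run `await func(item)` for every item with at most `limit` in flight."""
    semaphore = asyncio.Semaphore(limit)

    async def run(item):
        # Each call waits here until one of the `limit` slots is free.
        async with semaphore:
            return await func(item)

    # gather preserves input order, which differs from an unordered buffer.
    return await asyncio.gather(*(run(item) for item in items))
```

With a limit of 5, a sixth request starts only after one of the first five completes, which is why discovery time grows roughly linearly with the number of tables.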
| 52 | + |
| 53 | +## Table Type and Permission Handling |
| 54 | + |
| 55 | +### Table Type Filtering |
| 56 | + |
| 57 | +| Table Type | Supported | Notes | |
| 58 | +| ------------------- | --------- | -------------------------------------- | |
| 59 | +| `MANAGED` | Yes | Standard Delta tables | |
| 60 | +| `EXTERNAL` | Yes | Tables with external storage locations | |
| 61 | +| `FOREIGN` | Yes | Lakehouse Federation foreign tables | |
| 62 | +| `MATERIALIZED_VIEW` | Yes | Materialized views | |
| 63 | +| `VIEW` | No | Skipped during discovery | |
| 64 | +| `STREAMING_TABLE` | No | Skipped during discovery | |
| 65 | + |
| 66 | +Unsupported table types are skipped during catalog discovery. Referencing one directly returns an error. |
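
As an illustration of the filtering in the table above, the following Python sketch keeps only the supported types. It assumes each table is a dictionary with `name` and `table_type` keys, as in a Unity Catalog list-tables response; the function name is hypothetical.

```python
# Table types marked "Yes" in the table above.
SUPPORTED_TABLE_TYPES = frozenset(
    {"MANAGED", "EXTERNAL", "FOREIGN", "MATERIALIZED_VIEW"}
)


def discoverable(tables):
    """Return only the tables whose type the connector would expose.

    Assumes each entry is a dict with a `table_type` key; entries without
    one are treated as unsupported in this sketch.
    """
    return [t for t in tables if t.get("table_type") in SUPPORTED_TABLE_TYPES]
```

Running a listing through a filter like this is a quick way to predict which tables will appear in the Spice catalog before permissions are considered.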
| 67 | + |
| 68 | +### Effective Permissions |
| 69 | + |
| 70 | +Before creating a table provider, the connector checks permissions via `GET /api/2.1/unity-catalog/effective-permissions/table/{catalog.schema.table}`. The following privileges grant read access: |
| 71 | + |
| 72 | +- `SELECT` |
| 73 | +- `ALL_PRIVILEGES` / `ALL PRIVILEGES` |
| 74 | +- `OWNER` / `OWNERSHIP` |
| 75 | + |
| 76 | +**Behavior**: |
| 77 | + |
| 78 | +- **Discovery**: Tables without read permission are skipped. |
| 79 | +- **Direct reference**: An `InsufficientPermissions` error is returned. |
| 80 | +- **Foreign tables**: The precheck is skipped (`requires_read_permission_validation = false`) because Lakehouse Federation access can be valid when the UC effective-permissions endpoint does not report a table-level privilege. Access is still enforced by Databricks at query time. |
| 81 | +- **Graceful degradation**: If the UC API is unreachable or returns an error for the permissions endpoint, discovery proceeds with a warning — table providers are still created, and any per-query authorization failures surface at query time. |
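
The discovery-time decision described in the list above can be summarized in a small Python sketch. The function name, the flat list of privilege names, and the use of `None` to represent a failed permissions lookup are assumptions for illustration, and privilege names are compared exactly as written in this guide.

```python
# Privileges the guide lists as granting read access.
READ_PRIVILEGES = frozenset(
    {"SELECT", "ALL_PRIVILEGES", "ALL PRIVILEGES", "OWNER", "OWNERSHIP"}
)


def should_expose(table_type, privileges):
    """Decide whether discovery would create a provider for a table.

    privileges: iterable of privilege names, or None when the
    effective-permissions lookup failed (hypothetical convention).
    """
    if table_type == "FOREIGN":
        # Precheck skipped; access is enforced at query time instead.
        return True
    if privileges is None:
        # Graceful degradation: proceed and let query-time errors surface.
        return True
    return any(p in READ_PRIVILEGES for p in privileges)
```

Note that two of the three branches defer enforcement to query time, so a table appearing in the catalog does not by itself prove that the principal can read it.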
| 82 | + |
| 83 | +## Capacity & Sizing |
| 84 | + |
| 85 | +- **Initial discovery**: Scales with the number of schemas × tables. Bounded concurrency caps throughput; plan 5–30 minutes for catalogs with thousands of tables on a cold start. |
| 86 | +- **Refresh**: Catalog refresh re-enumerates schemas and tables at the configured interval. For very large catalogs, refresh less frequently (every few hours) unless schemas change rapidly. |
| 87 | +- **Permission-check cost**: One effective-permissions API call per table (FOREIGN tables skip the check), with at most 5 calls in flight at a time. |
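
For rough planning, the permission-check portion of discovery can be estimated from the table count and an assumed per-call latency. This Python sketch is a back-of-the-envelope lower bound only: the function name and the latency figure are hypothetical, and retries and schema or table listing calls are ignored.

```python
import math


def estimate_permission_check_secs(table_count, per_call_secs, concurrency=5):
    """Lower-bound wall time for one permission call per table.

    Calls run in waves of `concurrency`; per_call_secs is an assumed
    average latency that should be measured for the actual workspace.
    """
    waves = math.ceil(table_count / concurrency)
    return waves * per_call_secs
```

For example, 5,000 tables at an assumed 0.5 seconds per call come to 1,000 waves, or 500 seconds (a little over 8 minutes), which falls inside the 5 to 30 minute planning range given above.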
| 88 | + |
| 89 | +## Metrics |
| 90 | + |
| 91 | +The Unity Catalog connector does not currently register UC-specific OpenTelemetry metric instruments. When used via the Databricks connector, the shared SQL Warehouse and UC spans produce task-history records that can be aggregated for operational insight. |
| 92 | + |
| 93 | +Monitor via: |
| 94 | + |
| 95 | +- Spice query execution metrics (`query_duration_ms`, `query_processed_rows`) from `runtime.metrics`. |
| 96 | +- Task-history spans listed below. |
| 97 | +- Databricks / UC workspace audit logs for API-level visibility. |
| 98 | + |
| 99 | +See [Component Metrics](../../../features/observability/component_metrics) for general configuration. |
| 100 | + |
| 101 | +## Task History |
| 102 | + |
| 103 | +Unity Catalog operations emit the following [task history](../../../reference/task_history) spans: |
| 104 | + |
| 105 | +| Span | Input | Description | |
| 106 | +| ------------------------------ | ----------------------------- | ---------------------------------------- | |
| 107 | +| `uc_get_table` | Fully-qualified table name | Fetch table metadata from Unity Catalog. | |
| 108 | +| `uc_get_catalog` | Catalog ID | Fetch catalog metadata. | |
| 109 | +| `uc_list_schemas` | Catalog ID | List schemas in a catalog. | |
| 110 | +| `uc_list_tables` | `catalog_id.schema_name` | List tables in a schema. | |
| 111 | +| `uc_get_effective_permissions` | Fully-qualified table name | Check effective permissions for a table. | |
| 112 | + |
| 113 | +## Known Limitations |
| 114 | + |
| 115 | +- **VIEW and STREAMING_TABLE are skipped**: Only queryable table types are exposed. |
| 116 | +- **No UC write-back**: The connector is read-only; writes to UC are not supported through Spice. |
| 117 | +- **HTTP retry/concurrency parameters not exposed**: The resilient-HTTP defaults (3 retries, Fibonacci backoff) and the discovery concurrency limit (5) are not currently user-tunable on the UC connector. |
| 118 | +- **Graceful degradation on permission-endpoint failures**: If UC effective-permissions is unreachable, Spice proceeds; authorization errors surface at query time rather than discovery time. |
| 119 | + |
| 120 | +## Troubleshooting |
| 121 | + |
| 122 | +| Symptom | Likely cause | Resolution | |
| 123 | +| ----------------------------------------------------------------------- | ------------------------------------------------------------------ | --------------------------------------------------------------------------------------------------------------------------- | |
| 124 | +| `401 Unauthorized` on catalog list | Missing, expired, or wrong-workspace token. | Regenerate token in UC / Databricks; update secret store. | |
| 125 | +| Table visible in UC but missing from the Spice catalog | Table type is VIEW / STREAMING_TABLE or permissions were denied. | Confirm table type is supported and that the principal has `SELECT` (or equivalent). | |
| 126 | +| `InsufficientPermissions` on direct table reference | Role lacks read privilege on the table. | Grant `SELECT` on the table in UC. | |
| 127 | +| Slow catalog discovery on thousands of tables | Bounded concurrency + permission checks per table. | Expected behavior; schedule discovery during low-traffic windows and cache via accelerated datasets. | |
| 128 | +| Tables from a Lakehouse Federation source missing                       | FOREIGN precheck is skipped; Databricks denied at query time.      | Verify the Databricks workspace has federation privileges granted to the principal.                                         |