Commit 0974e3d

docs: replace deployment guide stubs with production content
Populate all 23 RC/GA component deployment guides with real, implementation-based content covering authentication, resilience controls, capacity/sizing, metrics, task history, limitations, and troubleshooting. Removes all TODO markers.
1 parent 2d2305c commit 0974e3d

23 files changed

Lines changed: 1380 additions & 579 deletions

Lines changed: 92 additions & 24 deletions
@@ -1,60 +1,128 @@
11
---
22
title: 'Unity Catalog Catalog Connector Deployment Guide'
33
sidebar_label: 'Deployment Guide'
4-
description: 'Production operating guide for the Unity Catalog catalog connector: resilience controls, authentication, metrics, and observability.'
4+
description: 'Operating guide for the Unity Catalog catalog connector in production: workspace authentication, table-type filtering, effective-permissions flow, and observability.'
55
sidebar_position: 10
66
pagination_prev: null
77
pagination_next: null
88
tags:
99
- catalogs
10-
- deployment
10+
- unity-catalog
1111
- observability
1212
---
1313

14-
Production operating guide for the **Unity Catalog Catalog Connector** covering resilience tuning, authentication, capacity sizing, metrics, and observability.
14+
Production operating guide for the Unity Catalog catalog connector — discovering Databricks Unity Catalog tables and federating them through Spice.
1515

16-
:::info
17-
This deployment guide is a work in progress. For a complete reference example, see the [Databricks Deployment Guide](../../data-connectors/databricks/deployment).
18-
:::
16+
For Databricks-specific operational concerns (SQL Warehouse resilience, metrics, permissions flow as applied to Databricks workspaces), see the [Databricks Deployment Guide](../../data-connectors/databricks/deployment) — the Unity Catalog logic described there applies directly when the catalog connector targets a Databricks workspace.
1917

2018
## Authentication & Secrets
2119

22-
Guidance for production authentication, credential rotation, and secret store integration.
20+
| Parameter | Description |
21+
| ---------------------- | ------------------------------------------------------------------------------------ |
22+
| `unity_catalog_token` | Bearer token for the Unity Catalog API. Use `${secrets:...}` from a secret store. |
2323

24-
<!-- TODO: Document supported auth methods, required IAM/roles/permissions, recommended secret store, and rotation procedures. -->
24+
The catalog URL must match the pattern `https://<host>/api/2.1/unity-catalog/catalogs/<catalog_id>` and is parsed into the endpoint and catalog identifier at startup. Mismatched URLs are rejected as configuration errors.
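
As an illustration of the URL shape described above, the following Python sketch splits a catalog URL into an endpoint and a catalog identifier. The regular expression and the `parse_catalog_url` name are assumptions made for this example; they mirror the documented pattern and are not the connector's actual parser.

```python
import re

# Hypothetical pattern mirroring the documented URL shape; the connector's
# real parser may accept or reject inputs differently.
_CATALOG_URL = re.compile(
    r"^https://(?P<host>[^/]+)/api/2\.1/unity-catalog/catalogs/(?P<catalog_id>[^/?#]+)$"
)


def parse_catalog_url(url: str) -> tuple[str, str]:
    """Split a catalog URL into (endpoint, catalog_id) or raise ValueError."""
    match = _CATALOG_URL.match(url)
    if match is None:
        # Mirrors the documented behavior: mismatched URLs are configuration errors.
        raise ValueError(f"not a Unity Catalog catalog URL: {url!r}")
    return f"https://{match['host']}", match["catalog_id"]
```

Checking a URL this way before deploying can surface a wrong API version or a missing catalog segment earlier than a failed startup would.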
25+
26+
The token is optional. When unset, the catalog connector issues unauthenticated requests, which suits locally hosted Unity Catalog deployments (OSS UC) with permissive access. For Databricks workspaces, the token is always required.
27+
28+
Secrets must be sourced from a [secret store](../../secret-stores/) in production. Rotate tokens from the UC / Databricks console and update the secret store.
2529

2630
## Resilience Controls
2731

28-
Production resilience parameters such as concurrency limits, retry budgets, backoff, and permanent-error handling.
32+
### HTTP Retry Policy
2933

30-
<!-- TODO: Document component-specific resilience parameters, defaults, and recommended overrides for production. -->
34+
The Unity Catalog client uses the shared `resilient_http` helper with these defaults:
3135

32-
## Capacity & Sizing
36+
- Maximum retries: **3**
37+
- Backoff: Fibonacci
38+
- Retriable conditions: HTTP `408`, `429`, `5xx`, and transient network errors (connect, timeout)
39+
- Respects `Retry-After`, `retry-after-ms`, `x-retry-after-ms` headers
40+
- Maximum backoff: 300 seconds
41+
42+
These are not exposed as user-tunable parameters on the Unity Catalog connector itself.
43+
44+
### Discovery Concurrency
45+
46+
The connector fans out schema and table enumeration with bounded concurrency to avoid a thundering herd against the UC API:
47+
48+
- Schema refresh: up to **5** concurrent requests (`buffer_unordered(5)`)
49+
- Permission checks: up to **5** concurrent requests (`buffer_unordered(5)`)
50+
51+
For catalogs with thousands of tables, initial discovery can take minutes while the connector respects these limits.
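
The effect of a concurrency bound of 5 can be sketched in Python with an `asyncio.Semaphore`. This is an analogy, not the connector's code: the connector uses Rust's `buffer_unordered(5)`, which yields results in completion order, whereas `asyncio.gather` returns them in input order. The `bounded_gather` helper and its `fetch` callback are hypothetical names.

```python
import asyncio

MAX_IN_FLIGHT = 5  # matches the documented buffer_unordered(5)


async def bounded_gather(items, fetch, limit=MAX_IN_FLIGHT):
    """Run fetch(item) for every item with at most `limit` calls in flight."""
    semaphore = asyncio.Semaphore(limit)

    async def run(item):
        # Each task waits here until one of the `limit` slots frees up.
        async with semaphore:
            return await fetch(item)

    return await asyncio.gather(*(run(item) for item in items))
```

Because only five requests are ever outstanding, total discovery time grows roughly linearly with the number of schemas and tables rather than being limited by the API's own throughput.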
52+
53+
## Table Type and Permission Handling
54+
55+
### Table Type Filtering
56+
57+
| Table Type | Supported | Notes |
58+
| ------------------- | --------- | -------------------------------------- |
59+
| `MANAGED` | Yes | Standard Delta tables |
60+
| `EXTERNAL` | Yes | Tables with external storage locations |
61+
| `FOREIGN` | Yes | Lakehouse Federation foreign tables |
62+
| `MATERIALIZED_VIEW` | Yes | Materialized views |
63+
| `VIEW` | No | Skipped during discovery |
64+
| `STREAMING_TABLE` | No | Skipped during discovery |
3365

34-
Recommended sizing guidance (CPU, memory, disk, network) and scaling behavior under load.
66+
Unsupported table types are skipped during catalog discovery. Referencing an unsupported table directly returns an error.
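
The two behaviors, skipping during discovery and erroring on direct reference, can be sketched in Python as follows. The dictionary shape (`name`, `table_type`), the function names, and the use of `ValueError` are illustrative assumptions, not the connector's types.

```python
# Supported types, taken from the table above.
SUPPORTED_TABLE_TYPES = frozenset(
    {"MANAGED", "EXTERNAL", "FOREIGN", "MATERIALIZED_VIEW"}
)


def discoverable(tables):
    """Discovery path: keep supported tables and silently skip the rest."""
    return [t for t in tables if t["table_type"] in SUPPORTED_TABLE_TYPES]


def resolve(table):
    """Direct-reference path: an unsupported type raises instead of being skipped."""
    if table["table_type"] not in SUPPORTED_TABLE_TYPES:
        raise ValueError(
            f"unsupported table type {table['table_type']!r} for {table['name']}"
        )
    return table
```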
3567

36-
<!-- TODO: Document per-dataset resource expectations, batch sizing, and expected throughput characteristics. -->
68+
### Effective Permissions
69+
70+
Before creating a table provider, the connector checks permissions via `GET /api/2.1/unity-catalog/effective-permissions/table/{catalog.schema.table}`. The following privileges grant read access:
71+
72+
- `SELECT`
73+
- `ALL_PRIVILEGES` / `ALL PRIVILEGES`
74+
- `OWNER` / `OWNERSHIP`
75+
76+
**Behavior**:
77+
78+
- **Discovery**: Tables without read permission are skipped.
79+
- **Direct reference**: An `InsufficientPermissions` error is returned.
80+
- **Foreign tables**: The precheck is skipped (`requires_read_permission_validation = false`) because Lakehouse Federation access can be valid when the UC effective-permissions endpoint does not report a table-level privilege. Access is still enforced by Databricks at query time.
81+
- **Graceful degradation**: If the UC API is unreachable or returns an error for the permissions endpoint, discovery proceeds with a warning — table providers are still created, and any per-query authorization failures surface at query time.
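
The decision logic described in this section can be summarized in a small Python sketch. It is a model of the documented behavior under stated assumptions, not the connector's implementation: `can_read` is a hypothetical name, privileges are compared as exact strings, and `None` stands in for a failed call to the effective-permissions endpoint.

```python
# Privileges that grant read access, taken from the list above.
READ_PRIVILEGES = frozenset(
    {"SELECT", "ALL_PRIVILEGES", "ALL PRIVILEGES", "OWNER", "OWNERSHIP"}
)


def can_read(table_type, privileges):
    """Decide whether a table provider should be created.

    `privileges` is an iterable of privilege names, or None when the
    effective-permissions lookup failed.
    """
    if table_type == "FOREIGN":
        return True  # precheck skipped; Databricks enforces access at query time
    if privileges is None:
        return True  # graceful degradation; failures surface at query time
    return any(p in READ_PRIVILEGES for p in privileges)
```

During discovery a `False` result corresponds to skipping the table; on a direct reference it corresponds to the `InsufficientPermissions` error.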
82+
83+
## Capacity & Sizing
84+
85+
- **Initial discovery**: Scales with the number of schemas × tables. Bounded concurrency caps throughput; plan 5–30 minutes for catalogs with thousands of tables on a cold start.
86+
- **Refresh**: Catalog refresh re-enumerates schemas and tables at the configured interval. For very large catalogs, refresh less frequently (every few hours) unless schemas change rapidly.
87+
- **Permission-check cost**: One API call per table. The buffer of 5 caps concurrency.
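
For a rough sense of scale, the following Python sketch gives a back-of-envelope discovery-time estimate. It assumes one table-listing call per schema, one permission check per table, a uniform per-call latency, and no pagination or retries; none of these figures come from the connector, and `estimate_discovery_secs` is a hypothetical helper.

```python
def estimate_discovery_secs(schemas, tables, latency_secs, concurrency=5):
    """Rough lower bound on cold-start discovery time, in seconds."""
    list_calls = 1 + schemas   # one schema listing plus one table listing per schema
    permission_calls = tables  # documented: one API call per table
    return (list_calls + permission_calls) * latency_secs / concurrency


# Example: 50 schemas, 5,000 tables, and an assumed 300 ms per API call
# come out at roughly five minutes.
minutes = estimate_discovery_secs(50, 5000, 0.3) / 60
```

Retries, pagination, and slower responses push the real figure upward, which is consistent with planning for the upper end of the range on large catalogs.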
3788

3889
## Metrics
3990

40-
Operational metrics exposed by the catalog connector. See [Component Metrics](../../../features/observability/component_metrics) for general configuration.
91+
The Unity Catalog connector does not currently register UC-specific OpenTelemetry metric instruments. When used via the Databricks connector, the shared SQL Warehouse and UC spans produce task-history records that can be aggregated for operational insight.
4192

42-
<!-- TODO: List component metrics (counter/gauge/histogram), their meaning, and how to enable them in the spicepod. -->
93+
Monitor via:
4394

44-
## Task History & Tracing
95+
- Spice query execution metrics (`query_duration_ms`, `query_processed_rows`) from `runtime.metrics`.
96+
- Task-history spans listed below.
97+
- Databricks / UC workspace audit logs for API-level visibility.
4598

46-
Spans emitted by the catalog connector for the [task history](../../../reference/task_history) system.
99+
See [Component Metrics](../../../features/observability/component_metrics) for general configuration.
47100

48-
<!-- TODO: List span names and input/output fields, and any trace attributes specific to this component. -->
101+
## Task History
49102

50-
## Known Limitations
103+
Unity Catalog operations emit the following [task history](../../../reference/task_history) spans:
51104

52-
Any production limitations, compatibility caveats, or unsupported features.
105+
| Span | Input | Description |
106+
| ------------------------------ | ----------------------------- | ---------------------------------------- |
107+
| `uc_get_table` | Fully-qualified table name | Fetch table metadata from Unity Catalog. |
108+
| `uc_get_catalog` | Catalog ID | Fetch catalog metadata. |
109+
| `uc_list_schemas` | Catalog ID | List schemas in a catalog. |
110+
| `uc_list_tables` | `catalog_id.schema_name` | List tables in a schema. |
111+
| `uc_get_effective_permissions` | Fully-qualified table name | Check effective permissions for a table. |
53112

54-
<!-- TODO: Document known limitations (data types, query patterns, concurrency ceilings, etc.). -->
113+
## Known Limitations
55114

56-
## Troubleshooting
115+
- **VIEW and STREAMING_TABLE are skipped**: Only queryable table types are exposed.
116+
- **No UC write-back**: The connector is read-only; writes to UC are not supported through Spice.
117+
- **HTTP retry/concurrency parameters not exposed**: The resilient-HTTP defaults (3 retries, Fibonacci backoff, concurrency 5) are not currently user-tunable on the UC connector.
118+
- **Graceful degradation on permission-endpoint failures**: If UC effective-permissions is unreachable, Spice proceeds; authorization errors surface at query time rather than discovery time.
57119

58-
Common failure modes and resolutions.
120+
## Troubleshooting
59121

60-
<!-- TODO: Document common errors, diagnostic steps, and recovery procedures. -->
122+
| Symptom | Likely cause | Resolution |
123+
| ----------------------------------------------------------------------- | ------------------------------------------------------------------ | --------------------------------------------------------------------------------------------------------------------------- |
124+
| `401 Unauthorized` on catalog list | Missing, expired, or wrong-workspace token. | Regenerate token in UC / Databricks; update secret store. |
125+
| Table visible in UC but missing from the Spice catalog | Table type is VIEW / STREAMING_TABLE or permissions were denied. | Confirm table type is supported and that the principal has `SELECT` (or equivalent). |
126+
| `InsufficientPermissions` on direct table reference | Role lacks read privilege on the table. | Grant `SELECT` on the table in UC. |
127+
| Slow catalog discovery on thousands of tables | Bounded concurrency + permission checks per table. | Expected behavior; schedule discovery during low-traffic windows and cache via accelerated datasets. |
128+
| Tables from a Lakehouse Federation source missing                       | FOREIGN precheck is skipped, so Databricks denies access at query time. | Verify the Databricks workspace has federation privileges granted to the principal.                                         |
Lines changed: 36 additions & 27 deletions
@@ -1,60 +1,69 @@
11
---
22
title: 'Arrow Data Accelerator Deployment Guide'
33
sidebar_label: 'Deployment Guide'
4-
description: 'Production operating guide for the Arrow data accelerator: resilience controls, authentication, metrics, and observability.'
4+
description: 'Operating guide for the Arrow (in-memory) data accelerator in production: memory sizing, indexes, and observability.'
55
sidebar_position: 10
66
pagination_prev: null
77
pagination_next: null
88
tags:
99
- data-accelerators
10-
- deployment
10+
- arrow
1111
- observability
1212
---
1313

14-
Production operating guide for the **Arrow Data Accelerator** covering resilience tuning, authentication, capacity sizing, metrics, and observability.
15-
16-
:::info
17-
This deployment guide is a work in progress. For a complete reference example, see the [Databricks Deployment Guide](../../data-connectors/databricks/deployment).
18-
:::
14+
Production operating guide for the Arrow in-memory data accelerator covering memory sizing, optional hash indexes, and observability.
1915

2016
## Authentication & Secrets
2117

22-
Guidance for production authentication, credential rotation, and secret store integration.
23-
24-
<!-- TODO: Document supported auth methods, required IAM/roles/permissions, recommended secret store, and rotation procedures. -->
18+
The Arrow accelerator is an in-process, in-memory engine. There is no external storage and no authentication or secret management required.
2519

26-
## Resilience Controls
20+
## Resilience & Durability
2721

28-
Production resilience parameters such as concurrency limits, retry budgets, backoff, and permanent-error handling.
22+
The Arrow accelerator is **not durable**. Data is held in RAM and is lost on process restart; every restart re-materializes the dataset from the source connector.
2923

30-
<!-- TODO: Document component-specific resilience parameters, defaults, and recommended overrides for production. -->
24+
- **Crash recovery**: None — on restart, the dataset is refreshed from scratch.
25+
- **File modes**: File-mode acceleration is rejected at startup; Arrow is memory-only. Use [DuckDB](../duckdb/deployment), [SQLite](../sqlite/deployment), [PostgreSQL](../postgres/deployment), or [Cayenne](../cayenne/deployment) when durability or spill is required.
26+
- **Concurrency**: Arrow reads are lock-free. Refresh cadence is controlled by the runtime refresh semaphore, not by the accelerator itself.
3127

3228
## Capacity & Sizing
3329

34-
Recommended sizing guidance (CPU, memory, disk, network) and scaling behavior under load.
35-
36-
<!-- TODO: Document per-dataset resource expectations, batch sizing, and expected throughput characteristics. -->
30+
- **Memory**: Plan for 1.0–1.5× the raw row-oriented size of the source data, plus overhead for string dictionaries. Use the source connector's schema and row count to estimate.
31+
- **Hash index**: Optional, disabled by default. When enabled via `hash_index: enabled`, a hash map is built over the primary-key columns. Build time scales linearly with rows; memory overhead is approximately 24–48 bytes per row plus the key size.
32+
- **Startup cost**: Full-dataset materialization happens on startup. For tables larger than ~1 GB, consider a durable accelerator to avoid repeated full refresh on every restart.
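
The sizing guidance above can be turned into a quick estimate. The Python sketch below applies the stated multipliers (1.0–1.5× raw size, 24–48 bytes per row plus key size for the hash index); it excludes string-dictionary overhead, and `estimate_arrow_memory_bytes` is a hypothetical helper, not part of Spice.

```python
def estimate_arrow_memory_bytes(raw_bytes, rows=0, key_bytes=0, hash_index=False):
    """Return a (low, high) memory estimate in bytes for an Arrow-accelerated dataset."""
    low, high = raw_bytes * 1.0, raw_bytes * 1.5
    if hash_index:
        # Per-row index overhead of 24-48 bytes plus the primary-key size.
        low += rows * (24 + key_bytes)
        high += rows * (48 + key_bytes)
    return low, high


# Example: 1 GB of raw data, 10 million rows, an 8-byte key, hash index enabled.
low, high = estimate_arrow_memory_bytes(
    1_000_000_000, rows=10_000_000, key_bytes=8, hash_index=True
)
```

Comparing the high end of the estimate against available RAM, with headroom for query execution, gives an early signal for when a durable accelerator is the safer choice.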
3733

3834
## Metrics
3935

40-
Operational metrics exposed by the data accelerator. See [Component Metrics](../../../features/observability/component_metrics) for general configuration.
36+
Generic acceleration metrics are available with the `dataset_acceleration_` prefix. Hash-index operations emit dedicated metrics when the index is enabled:
4137

42-
<!-- TODO: List component metrics (counter/gauge/histogram), their meaning, and how to enable them in the spicepod. -->
38+
| Metric | Type | Description |
39+
| ---------------------------------- | --------- | --------------------------------------------------------- |
40+
| `hash_index_builds` | Counter | Total hash-index builds (one per refresh). |
41+
| `hash_index_build_duration_ms` | Histogram | Time to build the hash index. |
42+
| `hash_index_entries` | Gauge | Number of entries in the index. |
43+
| `hash_index_memory_bytes` | Gauge | Approximate memory footprint of the index. |
44+
| `hash_index_lookups` | Counter | Total hash-index lookups performed by queries. |
45+
| `hash_index_lookup_rows` | Counter | Total rows returned via hash-index lookups. |
4346

44-
## Task History & Tracing
47+
See [Component Metrics](../../../features/observability/component_metrics) for enabling and exporting metrics. Refresh metrics are described in [Acceleration](../../../features/data-acceleration/).
4548

46-
Spans emitted by the data accelerator for the [task history](../../../reference/task_history) system.
49+
## Task History
4750

48-
<!-- TODO: List span names and input/output fields, and any trace attributes specific to this component. -->
51+
Arrow acceleration operations (refresh, query) participate in [task history](../../../reference/task_history) through the shared acceleration spans (`accelerated_table_refresh`, `sql_query`). No Arrow-specific spans are emitted — the accelerator is a thin wrapper over Arrow memory.
4952

5053
## Known Limitations
5154

52-
Any production limitations, compatibility caveats, or unsupported features.
53-
54-
<!-- TODO: Document known limitations (data types, query patterns, concurrency ceilings, etc.). -->
55+
- **No persistence**: Every restart refreshes from the source.
56+
- **No traditional indexes**: Arrow does not support B-tree indexes. Hash index provides point-lookup acceleration but not range or sort-order optimization.
57+
- **Only primary-key hash index**: The hash index requires a `primary_key` constraint; `unique` constraints alone do not enable the index.
58+
- **Memory pressure**: If the dataset exceeds available RAM, the runtime will OOM; no spill-to-disk mechanism exists in the Arrow accelerator itself.
59+
- **`partition_by`**: Not applicable — Arrow accelerator holds a single in-memory representation.
5560

5661
## Troubleshooting
5762

58-
Common failure modes and resolutions.
59-
60-
<!-- TODO: Document common errors, diagnostic steps, and recovery procedures. -->
63+
| Symptom | Likely cause | Resolution |
64+
| ------------------------------------------------ | ------------------------------------------------------- | ---------------------------------------------------------------------------------------------- |
65+
| OOM on refresh | Source dataset larger than RAM. | Switch to a durable accelerator (DuckDB / SQLite / Cayenne) that supports spill to disk. |
66+
| Long startup time | Full-dataset refresh runs on boot. | Switch to a durable accelerator so refresh is incremental, not full, on restart. |
67+
| `hash_index` ignored                             | No primary-key constraint on the dataset.               | Add `primary_key:` to the dataset definition; the configured hash index then takes effect.     |
68+
| Query slow for point lookups | Hash index disabled or wrong key column. | Enable `hash_index: enabled`; ensure the query filter matches the primary-key columns. |
69+
| Accelerator refuses to start with file mode | Arrow rejects file-mode acceleration. | Switch `engine:` to `duckdb`, `sqlite`, `postgres`, or `cayenne`. |
