|
9 | 9 | - delta-lake |
10 | 10 | --- |
11 | 11 |
|
12 | | -Databricks as a connector for federated SQL query against Databricks using [Spark Connect](https://www.databricks.com/blog/2022/07/07/introducing-spark-connect-the-power-of-apache-spark-everywhere.html) or directly from [Delta Lake](https://delta.io/) tables. |
| 12 | +Databricks as a connector for federated SQL query against Databricks using [Spark Connect](https://www.databricks.com/blog/2022/07/07/introducing-spark-connect-the-power-of-apache-spark-everywhere.html), directly from [Delta Lake](https://delta.io/) tables, or using the [SQL Statement Execution API](https://docs.databricks.com/aws/en/dev-tools/sql-execution-tutorial). |
13 | 13 |
|
14 | 14 | ```yaml |
15 | 15 | datasets: |
@@ -62,6 +62,7 @@ Use the [secret replacement syntax](../secret-stores/index.md) to reference a se |
62 | 62 | | -------------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | |
63 | 63 | | `mode` | The execution mode for querying against Databricks. The default is `spark_connect`. Possible values:<br /> <ul><li>`spark_connect`: Use Spark Connect to query against Databricks. Requires a Spark cluster to be available.</li><li>`delta_lake`: Query directly from Delta Tables. Requires the object store credentials to be provided.</li></ul> | |
64 | 64 | | `databricks_endpoint` | The endpoint of the Databricks instance. Required for both modes. | |
| 65 | +| `databricks_sql_warehouse_id` | The ID of the SQL Warehouse in Databricks to use for the query. Only valid when `mode` is `sql_warehouse`. | |
65 | 66 | | `databricks_cluster_id` | The ID of the compute cluster in Databricks to use for the query. Only valid when `mode` is `spark_connect`. | |
66 | 67 | | `databricks_use_ssl` | If true, use a TLS connection to connect to the Databricks endpoint. Default is `true`. | |
67 | 68 | | `client_timeout` | Optional. Applicable only in `delta_lake` mode. Specifies the timeout for object store operations. Default is `30s`. For example, `client_timeout: 60s`. | |
@@ -157,6 +158,18 @@ Configure the connection to the object store when using `mode: delta_lake`. Use |
157 | 158 | databricks_token: ${secrets:my_token} |
158 | 159 | ``` |
159 | 160 |
|
| 161 | +### SQL Warehouse |
| 162 | + |
| 163 | +```yaml |
| 164 | +- from: databricks:spiceai.datasets.my_table # A reference to a table in the Databricks Unity Catalog |
| 165 | + name: my_table |
| 166 | + params: |
| 167 | + mode: sql_warehouse |
| 168 | + databricks_endpoint: dbc-a1b2345c-d6e7.cloud.databricks.com |
| 169 | + databricks_sql_warehouse_id: 1234-567890-abcde123 |
| 170 | + databricks_token: ${secrets:my_token} |
| 171 | +``` |
| 172 | + |
160 | 173 | ### Delta Lake (S3) |
161 | 174 |
|
162 | 175 | ```yaml |
@@ -259,4 +272,4 @@ Memory limitations can be mitigated by storing acceleration data on disk, which |
259 | 272 |
|
260 | 273 | ## Cookbook |
261 | 274 |
|
262 | | -- A cookbook recipe to configure Databricks as data connector in Spice under `delta_lake` mode. [Spice on Databricks (mode: delta_lake)](https://github.com/spiceai/cookbook/tree/trunk/databricks/delta_lake#readme) |
| 275 | +- A cookbook recipe to configure Databricks as a data connector in Spice. [Spice on Databricks](https://github.com/spiceai/cookbook/tree/trunk/databricks) |