Commit 65cd4bc

Merge branch 'trunk' into fix/imap-ssl-mode-default
2 parents b0bee4b + 047779e commit 65cd4bc

44 files changed

Lines changed: 67 additions & 71 deletions

website/docs/components/catalogs/databricks.md

Lines changed: 1 addition & 1 deletion
@@ -188,7 +188,7 @@ catalogs:
 One of the following auth values must be provided for Azure Blob:

 - `databricks_azure_storage_account_key`,
-- `databricks_azure_storage_client_id` and `azure_storage_client_secret`, or
+- `databricks_azure_storage_client_id` and `databricks_azure_storage_client_secret`, or
 - `databricks_azure_storage_sas_key`.
 :::

website/docs/components/catalogs/unity-catalog/index.md

Lines changed: 1 addition & 1 deletion
@@ -66,7 +66,7 @@ The `dataset_params` field is used to configure the dataset-specific parameters
 One of the following auth values must be provided for Azure Blob:

 - `unity_catalog_azure_storage_account_key`,
-- `unity_catalog_azure_storage_client_id` and `azure_storage_client_secret`, or
+- `unity_catalog_azure_storage_client_id` and `unity_catalog_azure_storage_client_secret`, or
 - `unity_catalog_azure_storage_sas_key`.
 :::

website/docs/components/data-connectors/databricks/index.md

Lines changed: 1 addition & 1 deletion
@@ -143,7 +143,7 @@ Configure the connection to the object store when using `mode: delta_lake`. Use
 **One** of the following auth values must be provided for Azure Blob:

 - `databricks_azure_storage_account_key`,
-- `databricks_azure_storage_client_id` and `azure_storage_client_secret`, or
+- `databricks_azure_storage_client_id` and `databricks_azure_storage_client_secret`, or
 - `databricks_azure_storage_sas_key`.
 :::

website/docs/components/data-connectors/dremio/deployment.md

Lines changed: 2 additions & 3 deletions
@@ -15,14 +15,13 @@ Production operating guide for the Dremio data connector covering authentication

 ## Authentication & Secrets

-The Dremio connector connects over [Arrow Flight SQL](https://arrow.apache.org/docs/format/FlightSql.html) with username/password or personal-access-token (PAT) authentication.
+The Dremio connector connects over [Arrow Flight SQL](https://arrow.apache.org/docs/format/FlightSql.html) with username/password authentication.

 | Parameter | Description |
 | ---------------------- | --------------------------------------------------------------------------------------------------------------- |
 | `dremio_endpoint` | Flight SQL endpoint, e.g. `grpc+tls://dremio.internal:32010`. |
-| `dremio_username` | Dremio user (username/password auth). |
+| `dremio_username` | Dremio user. |
 | `dremio_password` | Dremio password. Use `${secrets:...}` from a secret store. |
-| `dremio_token` | Alternatively, a PAT or session token. |

 Use TLS endpoints (`grpc+tls://`) in production. Credentials must be sourced from a [secret store](../../secret-stores/).

website/docs/components/data-connectors/duckdb/deployment.md

Lines changed: 1 addition & 2 deletions
@@ -19,8 +19,7 @@ DuckDB is an embedded engine; the connector reads a local DuckDB database file.

 | Parameter | Description |
 | ------------------ | ---------------------------------------------------------------------- |
-| `open` | Absolute path to the DuckDB database file. |
-| `duckdb_connection_string` | Alternative: DuckDB connection URI with options. |
+| `duckdb_open` | Absolute path to the DuckDB database file. If omitted, uses in-memory mode. |

 Protect the DuckDB file with filesystem permissions. Store it on encrypted storage (LUKS/dm-crypt, EBS encryption, etc.) for data-at-rest protection. For data loaded from cloud object stores inside DuckDB, configure AWS/Azure/GCS credentials via DuckDB extensions rather than Spice parameters.
2625

website/docs/components/data-connectors/graphql/deployment.md

Lines changed: 5 additions & 6 deletions
@@ -19,11 +19,10 @@ Authentication is endpoint-specific. The connector supports arbitrary HTTP heade

 | Parameter | Description |
 | -------------------------- | ---------------------------------------------------------------------------- |
-| `graphql_endpoint` | GraphQL endpoint URL. |
-| `graphql_auth_header` | Authorization header. Typically `"Bearer ${secrets:api_token}"`. |
+| `graphql_auth_header` | Custom authorization header name. The value of `graphql_auth_token` is sent as this header's value. |
+| `graphql_auth_token` | Bearer token for GraphQL requests. Typically `"${secrets:api_token}"`. |
 | `graphql_query` | The GraphQL query to execute. |
-| `graphql_json_pointer` | RFC-6901 JSON pointer to the row collection inside the response (e.g. `/data/repository/issues/nodes`). |
-| `graphql_pagination_parameters` | Cursor / page-size configuration for pagination (see the connector reference). |
+| `json_pointer` | RFC-6901 JSON pointer to the row collection inside the response (e.g. `/data/repository/issues/nodes`). |

 Tokens must be sourced from a [secret store](../../secret-stores/) in production.

@@ -39,7 +38,7 @@ HTTP-level retries follow the shared `resilient_http` policy: 408/429/5xx plus t

 ### Pagination

-The connector supports cursor-based pagination. Each page is a separate HTTP request; pagination errors mid-sequence cause the entire refresh to fail. Use `graphql_json_pointer` to select the row collection and configure the pagination variables to match the upstream schema's cursor fields.
+The connector supports cursor-based pagination. Each page is a separate HTTP request; pagination errors mid-sequence cause the entire refresh to fail. Use `json_pointer` to select the row collection and configure the pagination variables to match the upstream schema's cursor fields.

 ### Server Rate Limits

@@ -77,7 +76,7 @@ GraphQL requests participate in [task history](../../../reference/task_history)
 | Symptom | Likely cause | Resolution |
 | ---------------------------------------------- | ----------------------------------------------------- | ------------------------------------------------------------------------------------------- |
 | `401 Unauthorized` | Wrong or expired token in `graphql_auth_header`. | Rotate the token; verify the header format (`Bearer` prefix, etc.). |
-| Rows missing from the dataset | Wrong `graphql_json_pointer`. | Inspect the response payload; JSON pointer must navigate to the array of rows. |
+| Rows missing from the dataset | Wrong `json_pointer`. | Inspect the response payload; JSON pointer must navigate to the array of rows. |
 | Refresh fails mid-pagination | Rate-limit or transient network failure. | Reduce refresh frequency; the connector will retry on retriable errors. Narrow the query. |
 | Query cost exceeded | Query requests too many nested fields. | Simplify the query; fetch only required fields. |
 | Inferred schema differs between refreshes | Optional fields appear/disappear in responses. | Provide an explicit dataset `schema` to lock down types. |

website/docs/components/data-connectors/mysql/deployment.md

Lines changed: 12 additions & 13 deletions
@@ -37,9 +37,8 @@ TLS is controlled via `mysql_sslmode`:
 | `disabled` | No TLS. |
 | `preferred` | Try TLS, fall back to plaintext. Not recommended for production. |
 | `required` | Require TLS; do **not** verify the server certificate. |
-| `verify_ca` | Require TLS and verify the CA chain. |

-For production, use `verify_ca` with `mysql_sslrootcert` pointing to the CA bundle. The default is `required`, which encrypts the connection but does not validate the server's identity.
+For production, use `required` with `mysql_sslrootcert` pointing to the CA bundle. The default is `required`, which encrypts the connection but does not validate the server's identity.

 ## Resilience Controls

@@ -49,18 +48,18 @@ The connector uses a per-dataset connection pool with the following defaults:

 | Parameter | Default | Description |
 | --------------------------- | ------- | ---------------------------------------------- |
-| `connection_pool_min_idle` | `1` | Minimum idle connections held by the pool. |
-| `connection_pool_size` | `5` | Maximum connections the pool will open. |
+| `mysql_pool_min` | `1` | Minimum idle connections held by the pool. |
+| `mysql_pool_max` | `5` | Maximum connections the pool will open. |

-Invalid values (non-integers) are logged as a warning and silently replaced with the defaults. Size the pool to match the concurrent query and refresh load for the dataset; the upper bound should respect the MySQL server's `max_connections` budget shared across all Spice datasets and external clients.
+Invalid values (non-integers) are logged as a warning and silently replaced with the defaults. Size the pool to match the concurrent query and refresh load for the dataset; the upper bound should respect the MySQL server's `max_connections` budget shared across all Spice datasets and external clients. `mysql_pool_min` must be less than or equal to `mysql_pool_max`; conflicting values are rejected at startup.

 ### Retry Behavior

 Transient query failures are not automatically retried at the connector layer. Dataset refresh retries are controlled by the acceleration refresh policy (see [Data Refresh](../../../features/data-acceleration/data-refresh)). Connection failures surface to the caller and to the connection pool metrics below.

 ## Capacity & Sizing

-- **Network**: MySQL traffic is TCP. Plan for the sum of `connection_pool_size` across all Spice datasets targeting the same MySQL instance when sizing database `max_connections`.
+- **Network**: MySQL traffic is TCP. Plan for the sum of `mysql_pool_max` across all Spice datasets targeting the same MySQL instance when sizing database `max_connections`.
 - **Memory**: Each pooled connection holds a small amount of client-side state. Result sets stream in batches; memory footprint for federated reads is bounded by DataFusion's record batch size (8192 rows default).
 - **Throughput**: For full-table materialization (acceleration refresh), query latency scales with the source table size and the presence of indexes on the refresh partitioning/ordering columns.

@@ -74,15 +73,15 @@ The MySQL connector exposes observable metrics for its connection pool. Enable t
 | `connections_in_pool` | ObservableGauge | Idle connections sitting in the pool. |
 | `active_wait_requests` | ObservableGauge | Requests waiting for a connection (saturation signal). |
 | `create_failed` | Counter | Connections that failed to be created. |
-| `discarded_excess_idle_connection` | Counter | Connections closed because the pool already had enough idle connections. |
+| `discarded_superfluous_connection` | Counter | Connections closed because the pool already had enough idle connections. |
 | `discarded_unestablished_connection`| Counter | Connections closed because they could not be established. |
 | `dirty_connection_return` | Counter | Connections returned to the pool in a dirty state (open transactions, pending queries, etc.). |

 Metric instruments are exposed with the prefix `dataset_mysql_`. Each instrument carries a `name` attribute set to the dataset name.

 Key signals to alert on:

-- `active_wait_requests > 0` sustained → pool is saturated, increase `connection_pool_size` or the server's `max_connections`.
+- `active_wait_requests > 0` sustained → pool is saturated, increase `mysql_pool_max` or the server's `max_connections`.
 - `create_failed` increasing → credentials, network, or server availability problem.
 - `dirty_connection_return` increasing → a query is not cleaning up its transaction state; investigate long-running or aborted queries.

@@ -93,7 +92,7 @@ MySQL operations participate in Spice [task history](../../../reference/task_his
 ## Known Limitations

 - Only TCP connections are supported. Unix socket connections are not exposed through Spice configuration.
-- TLS with full hostname verification (`verify_identity`) is not a documented option; use `verify_ca` with a trusted CA bundle.
+- TLS with certificate verification (`verify_ca`, `verify_identity`) is not supported; only `disabled`, `preferred`, and `required` modes are available.
 - Large text/blob columns are fetched in their entirety per row; consider selecting only the columns you need when federating.
 - `mysql_sslmode: preferred` silently downgrades to plaintext on TLS negotiation failure and is not recommended for production.

@@ -102,7 +101,7 @@ MySQL operations participate in Spice [task history](../../../reference/task_his
 | Symptom | Likely cause | Resolution |
 | --------------------------------------------------------- | --------------------------------------------------------- | ---------------------------------------------------------------------------------------------------- |
 | `Access denied for user` | Incorrect credentials or user lacks `SELECT` on the DB. | Verify credentials; confirm the user has read access on the required tables. |
-| `Too many connections` | Sum of Spice pool sizes + other clients exceeds server `max_connections`. | Reduce `connection_pool_size` or raise the server limit. |
-| Sustained `active_wait_requests > 0` | Pool saturation. | Increase `connection_pool_size`; reduce concurrent dataset refreshes. |
-| `SSL connection error` | Certificate mismatch with `mysql_sslmode: verify_ca`. | Verify `mysql_sslrootcert` matches the server's issuing CA. Use `openssl s_client -connect` to inspect. |
-| Silent plaintext connection | `mysql_sslmode: preferred` falling back. | Switch to `required` or `verify_ca`. |
+| `Too many connections` | Sum of Spice pool sizes + other clients exceeds server `max_connections`. | Reduce `mysql_pool_max` or raise the server limit. |
+| Sustained `active_wait_requests > 0` | Pool saturation. | Increase `mysql_pool_max`; reduce concurrent dataset refreshes. |
+| `SSL connection error` | Certificate mismatch or TLS negotiation failure. | Verify `mysql_sslrootcert` matches the server's issuing CA. Use `openssl s_client -connect` to inspect. |
+| Silent plaintext connection | `mysql_sslmode: preferred` falling back. | Switch to `required`. |

website/docs/components/data-connectors/postgres/deployment.md

Lines changed: 8 additions & 8 deletions
@@ -49,18 +49,18 @@ For production, use `verify-full` with `pg_sslrootcert` pointing to the CA bundl

 The connector maintains a per-dataset connection pool:

-| Parameter | Default | Description |
-| --------------------------- | ------- | ---------------------------------------------------- |
-| `connection_pool_min_idle` | Tracks `connection_pool_size` with a floor of 1. | Minimum idle connections held by the pool. |
-| `connection_pool_size` | `10` | Maximum connections the pool will open. |
+| Parameter | Default | Description |
+| ------------------------------- | ------- | ---------------------------------------------------- |
+| `pg_connection_pool_min_idle` | `1` | Minimum idle connections held by the pool. |
+| `connection_pool_size` | `5` | Maximum connections the pool will open. |

-`connection_pool_min_idle` must be less than or equal to `connection_pool_size`; conflicting values are rejected as configuration errors at startup.
+`pg_connection_pool_min_idle` must be less than or equal to `connection_pool_size`; conflicting values are rejected as configuration errors at startup.

 Size the pool to match concurrent query and refresh load for the dataset. The server's `max_connections` (default 100) is a shared budget across Spice datasets, other clients, and server-side background workers — plan accordingly, or front Postgres with PgBouncer.

 ### Application Name

-`pg_application_name` defaults to the Spice.ai version string, which surfaces in `pg_stat_activity.application_name`. Override this to distinguish traffic from multiple Spice instances or environments.
+The connector automatically sets `application_name` to the Spice.ai version string, which surfaces in `pg_stat_activity.application_name`. This value is not configurable.

 ### Retry Behavior

@@ -105,7 +105,7 @@ PostgreSQL operations participate in Spice [task history](../../../reference/tas
 | -------------------------------------------- | -------------------------------------------------------------------- | -------------------------------------------------------------------------------------------------- |
 | `FATAL: password authentication failed` | Incorrect credentials. | Verify credentials via the secret store; test with `psql` using the same credentials. |
 | `FATAL: too many clients already` | Pool size + other clients exceeds server `max_connections`. | Reduce `connection_pool_size` or raise `max_connections` / front the server with PgBouncer. |
-| `connection_pool_min_idle must be <= connection_pool_size` at startup | Misconfiguration. | Correct the values so `min_idle <= size`. |
+| `pg_connection_pool_min_idle must be <= connection_pool_size` at startup | Misconfiguration. | Correct the values so `pg_connection_pool_min_idle <= connection_pool_size`. |
 | Sustained `active_wait_requests > 0` | Pool saturation. | Increase `connection_pool_size` or reduce concurrent refreshes. |
 | `certificate verify failed` | `pg_sslmode: verify-ca` / `verify-full` with wrong CA or hostname. | Verify `pg_sslrootcert` matches the server's issuing CA; with `verify-full` ensure hostname matches SAN. |
-| Sessions lingering with the default app name | Multiple Spice instances share the same name. | Set `pg_application_name` per deployment for clear `pg_stat_activity` attribution. |
+| Sessions lingering with the default app name | Multiple Spice instances share the same version-based name. | The `application_name` is auto-set to the Spice.ai version and is not currently configurable. |
