
Commit 166dced

lukekim, phillipleblanc, Jeadie, ewgenius, and krinart authored
Merge v1.11 docs to trunk and set v2.0 as next (#1345)
* Remove OTel ports from docs (#1270)
* Docs for Google (#1286)
Co-authored-by: Evgenii Khramkov <evgenii@spice.ai>
Co-authored-by: Luke Kim <80174+lukekim@users.noreply.github.com>
* Parameterized Queries docs (#1298)
* Paramterized Queries docs
* Formatting
* Add SMB and NFS Data Connector Docs (#1295)
* Add SMB & NFS Data Connector docs
* Fixes
* formatting
* Rename "params" key (#1300)
* Fix params (#1301)
* Update Dynamodb Authentication (#1304)
* Update Dynamodb Auth
* Update
* Cayenne: document cayenne_file_path and cayenne_metadata_dir (#1307)
* Update snapshots documentation (#1318)
* Update snapshots documentation
* Fix
* Docs for snapshots_reset_expiry_on_load (#1322)
* Minor fixes for DynamoDB (#1323)
* Minor fixes for DynamoDB
* Minor fix
* Update Distributed Query docs for v1.11 changes (#1326)
* Update Distributed Query docs for v1.11 changes
* Update website/docs/features/distributed-query/index.md
Co-authored-by: Jack Eadie <jack@spice.ai>
---------
Co-authored-by: Jack Eadie <jack@spice.ai>
* ScyllaDB Data Connector docs (#1325)
* Add Arrow Hash Index docs (#1324)
* Add Arrow Hash Index docs
* Formatting
* Add versioning support (#1308)
* Add versioning support
* Fix: empty versions array until release branches exist
* Enable versioned docs for release/1.11 branch
* Fix: resolve git refs with origin/ prefix for CI
* Fix: run git archive from repo root
* Fix: serve current docs at /docs, versioned docs at /docs/v1.11
* fix(versioning): only show unmaintained banner for versions outside maintenance window
* Update to 1
* feat(versioning): add enterprise support note on unmaintained version banner
* feat(versioning): add support for unreleased docs from trunk
  - Current docs (trunk) now served at /docs/next with 'unreleased' banner
  - Latest release branch served at /docs as the default
  - Previous versions continue at /docs/v1.11, etc.
* Move to right
* feat(versioning): auto-detect release branches from git
  - Script now auto-detects release/<major>.<minor> branches
  - No manual VERSIONS array maintenance required
  - Creating a new release branch automatically adds version to docs
* Fix
* fix(versioning): highest version is 'next', second highest is 'latest'
  - v1.11.x (highest) → Next (unreleased) at /docs/next
  - v1.10.x (second) → Latest (stable) at /docs
  - Trunk docs available at /docs/trunk
  - Previous versions at /docs/v1.9, etc.
* fix(versioning): warn on broken links for cross-version compatibility
  Older release branches may contain absolute links to docs pages that don't exist in all versions. Setting onBrokenLinks to 'warn' allows the build to succeed while still reporting these issues.
* Fixes
* Fixes
* Update menu
* Improve SMB docs (#1328)
* Improve copilot instructions
* Improve SMB docs
* Docs for snapshots_creation_policy (#1330)
* Improvements (#1333)
* Add High Availability documentation for distributed query clusters (#1334)
* Update kafka/debezium docs
* Update snapshot storage configuration description (#1331)
* v1.11 Documentation (#1296)
* Remove OTel ports from docs (#1270)
* Docs for Google (#1286)
Co-authored-by: Evgenii Khramkov <evgenii@spice.ai>
Co-authored-by: Luke Kim <80174+lukekim@users.noreply.github.com>
* Parameterized Queries docs (#1298)
* Paramterized Queries docs
* Formatting
* Add SMB and NFS Data Connector Docs (#1295)
* Add SMB & NFS Data Connector docs
* Fixes
* formatting
* Rename "params" key (#1300)
* Fix params (#1301)
* Update Dynamodb Authentication (#1304)
* Update Dynamodb Auth
* Update
* Cayenne: document cayenne_file_path and cayenne_metadata_dir (#1307)
* Update snapshots documentation (#1318)
* Update snapshots documentation
* Fix
* Docs for snapshots_reset_expiry_on_load (#1322)
* Minor fixes for DynamoDB (#1323)
* Minor fixes for DynamoDB
* Minor fix
* Update Distributed Query docs for v1.11 changes (#1326)
* Update Distributed Query docs for v1.11 changes
* Update website/docs/features/distributed-query/index.md
Co-authored-by: Jack Eadie <jack@spice.ai>
---------
Co-authored-by: Jack Eadie <jack@spice.ai>
* ScyllaDB Data Connector docs (#1325)
* Add Arrow Hash Index docs (#1324)
* Add Arrow Hash Index docs
* Formatting
* Add versioning support (#1308)
* Add versioning support
* Fix: empty versions array until release branches exist
* Enable versioned docs for release/1.11 branch
* Fix: resolve git refs with origin/ prefix for CI
* Fix: run git archive from repo root
* Fix: serve current docs at /docs, versioned docs at /docs/v1.11
* fix(versioning): only show unmaintained banner for versions outside maintenance window
* Update to 1
* feat(versioning): add enterprise support note on unmaintained version banner
* feat(versioning): add support for unreleased docs from trunk
  - Current docs (trunk) now served at /docs/next with 'unreleased' banner
  - Latest release branch served at /docs as the default
  - Previous versions continue at /docs/v1.11, etc.
* Move to right
* feat(versioning): auto-detect release branches from git
  - Script now auto-detects release/<major>.<minor> branches
  - No manual VERSIONS array maintenance required
  - Creating a new release branch automatically adds version to docs
* Fix
* fix(versioning): highest version is 'next', second highest is 'latest'
  - v1.11.x (highest) → Next (unreleased) at /docs/next
  - v1.10.x (second) → Latest (stable) at /docs
  - Trunk docs available at /docs/trunk
  - Previous versions at /docs/v1.9, etc.
* fix(versioning): warn on broken links for cross-version compatibility
  Older release branches may contain absolute links to docs pages that don't exist in all versions. Setting onBrokenLinks to 'warn' allows the build to succeed while still reporting these issues.
* Fixes
* Fixes
* Update menu
* Improve SMB docs (#1328)
* Improve copilot instructions
* Improve SMB docs
---------
Co-authored-by: Phillip LeBlanc <phillip@spice.ai>
Co-authored-by: Jack Eadie <jack@spice.ai>
Co-authored-by: Evgenii Khramkov <evgenii@spice.ai>
Co-authored-by: Viktor Yershov <viktor@spice.ai>
Co-authored-by: Sergei Grebnov <sergei.grebnov@gmail.com>
* Update snapshot storage configuration description
  Clarified that the location for snapshots must be an S3 directory instead of a folder.
---------
Co-authored-by: Luke Kim <80174+lukekim@users.noreply.github.com>
Co-authored-by: Phillip LeBlanc <phillip@spice.ai>
Co-authored-by: Evgenii Khramkov <evgenii@spice.ai>
Co-authored-by: Viktor Yershov <viktor@spice.ai>
Co-authored-by: Sergei Grebnov <sergei.grebnov@gmail.com>
* Update Snowflake data connector docs (#1340)
* fix: convert absolute /docs/ links to relative paths
* Update Cayenne accelerator status from Alpha to Beta (#1341)
* URL tables docs (#1343)
* URL tables docs
* Formatting
* Improve snapshots documentation + retention (#1342)
* Improve snapshots documnetation + retention
* Fix
---------
Co-authored-by: Jack Eadie <jack@spice.ai>
* Fix links
* Formatting
* Update version references in documentation and scripts for v2.1 release
* Handle major versions
* Fix merge conflicts
---------
Co-authored-by: Phillip LeBlanc <phillip@spice.ai>
Co-authored-by: Jack Eadie <jack@spice.ai>
Co-authored-by: Evgenii Khramkov <evgenii@spice.ai>
Co-authored-by: Viktor Yershov <viktor@spice.ai>
Co-authored-by: Sergei Grebnov <sergei.grebnov@gmail.com>
2 parents c7a5ce5 + dfa14fb commit 166dced

15 files changed

Lines changed: 570 additions & 232 deletions


website/README.md

Lines changed: 10 additions & 10 deletions
@@ -40,21 +40,21 @@ The documentation supports multiple versions to maintain docs for different rele

  1. **Current docs** (`docs/`) — Working documentation from trunk, served at `/docs/trunk`
  2. **Versioned docs** — Auto-generated at build time from `release/<major>.<minor>` branches
- - Highest version (e.g., v1.11.x) → "Next" (unreleased) at `/docs/next`
- - Second highest (e.g., v1.10.x) → "Latest" (stable) at `/docs`
- - Previous versions → at `/docs/v1.9`, etc.
+ - Highest version (e.g., v2.0.x) → "Next" (unreleased) at `/docs/next`
+ - Second highest (e.g., v1.11.x) → "Latest" (stable) at `/docs`
+ - Previous versions → at `/docs/v1.10`, etc.

  The version generation script ([scripts/generate-versions.sh](scripts/generate-versions.sh)) auto-detects release branches and uses `git archive` to extract docs from each without checking out the full repository.

  ### Creating a new version for a release

- When releasing a new version (e.g., v1.12):
+ When releasing a new version (e.g., v2.1):

  1. **Create a release branch** for the new version:

  ```bash
- git checkout -b release/1.12
- git push origin release/1.12
+ git checkout -b release/2.1
+ git push origin release/2.1
  ```

  2. **That's it!** The build script auto-detects release branches matching the `release/<major>.<minor>` pattern. The next build will automatically include the new version.
@@ -71,17 +71,17 @@ When releasing a new version (e.g., v1.12):
  - `/docs` — Latest stable release (default)
  - `/docs/next` — Next release (unreleased, highest version branch)
  - `/docs/trunk` — Working docs from trunk
- - `/docs/v1.9` — Previous release versions
+ - `/docs/v1.10` — Previous release versions

  ### Updating existing version docs

  To update docs for a released version, push changes directly to the corresponding release branch:

  ```bash
- git checkout release/1.12
+ git checkout release/2.1
  # Make changes
- git commit -m "Update docs for v1.12.x"
- git push origin release/1.12
+ git commit -m "Update docs for v2.1.x"
+ git push origin release/2.1
  ```

  The next build will pick up the updated docs from the release branch.
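
The routing rules this README describes (highest release branch at `/docs/next`, second highest at `/docs`, older versions at `/docs/v<major>.<minor>`, trunk at `/docs/trunk`) can be sketched in Python as follows. This is an illustration only, not the contents of `scripts/generate-versions.sh`: the function name `route_versions`, the regular expression, and the behavior when only one release branch exists are assumptions.

```python
import re

# Matches release/<major>.<minor>, optionally prefixed with origin/
# (the commit message mentions resolving refs with that prefix in CI).
RELEASE_BRANCH = re.compile(r"^(?:origin/)?release/(\d+)\.(\d+)$")


def route_versions(branches):
    """Map branch names to docs routes (hypothetical helper).

    Assumption: with a single release branch, that branch is still
    treated as "next"; the README does not cover this case.
    """
    versions = set()
    for name in branches:
        match = RELEASE_BRANCH.match(name)
        if match:
            # Store numeric tuples so 1.10 sorts above 1.9.
            versions.add((int(match.group(1)), int(match.group(2))))

    routes = {"trunk": "/docs/trunk"}
    for rank, (major, minor) in enumerate(sorted(versions, reverse=True)):
        label = f"v{major}.{minor}"
        if rank == 0:
            routes[label] = "/docs/next"      # highest: unreleased
        elif rank == 1:
            routes[label] = "/docs"           # second highest: latest stable
        else:
            routes[label] = f"/docs/{label}"  # previous versions
    return routes
```

Comparing versions as integer tuples rather than strings matters here: a plain string sort would place `1.9` above `1.10` and route the wrong branch to `/docs`.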

website/docs/components/data-accelerators/cayenne.md

Lines changed: 5 additions & 5 deletions
@@ -11,8 +11,8 @@ tags:
  - s3-express
  ---

- :::info Alpha
- The Spice Cayenne Data Accelerator is in Alpha. Features and configuration may change. Available in Spice v1.9.0-rc.1 and later.
+ :::info Beta
+ The Spice Cayenne Data Accelerator is in Beta.
  :::

  Spice Cayenne is a data acceleration engine designed for high-performance, scalable query on large-scale datasets. Built on [Vortex](https://github.com/vortex-data/vortex), a next-generation columnar file format, Spice Cayenne combines columnar storage with in-process metadata management to provide fast query performance to scale to datasets beyond 1TB.
@@ -504,7 +504,7 @@ Query performance scales with available CPU cores. Vortex's columnar format supp

  Consider the following limitations when using Spice Cayenne acceleration:

- - **Alpha Status**: Spice Cayenne is in active development. Configuration options may change between releases.
+ - **Beta Status**: Spice Cayenne is in active development. Configuration options may change between releases.
  - **File Mode Only**: Spice Cayenne only supports `mode: file` and does not support in-memory (`mode: memory`) acceleration.
  - **No Snapshot Support**: Spice Cayenne does not yet support [acceleration snapshots](../../features/data-acceleration/snapshots) for bootstrapping from object storage.
  - **S3 Express Only**: Standard S3 buckets are not supported for remote storage. Only S3 Express One Zone directory buckets are supported.
@@ -513,8 +513,8 @@ Consider the following limitations when using Spice Cayenne acceleration:
  - **No MVCC**: Multi-version concurrency control is not yet implemented. Snapshots and time-travel queries are planned for future releases.
  - **No File Compaction**: Automatic file compaction to reclaim space from deleted rows is not yet available.

- :::warning ALPHA SOFTWARE
- As an Alpha feature, Spice Cayenne should be thoroughly tested in development environments before production deployment. Monitor release notes for updates, breaking changes, and new capabilities.
+ :::warning BETA SOFTWARE
+ As a Beta feature, Spice Cayenne should be thoroughly tested in development environments before production deployment. Monitor release notes for updates, breaking changes, and new capabilities.
  :::

  ## Example Spicepod

website/docs/components/data-accelerators/index.md

Lines changed: 1 addition & 1 deletion
@@ -30,7 +30,7 @@ By default, datasets are locally materialized using in-memory Arrow records.
  | Name | Description | Status | Engine Modes |
  | ---------- | ------------------------------- | -------------------- | ---------------- |
  | `arrow` | In-Memory Arrow Records | Stable | `memory` |
- | `cayenne` | [Spice Cayenne][cayenne] | Alpha (v1.9.0-rc.1+) | `file` |
+ | `cayenne` | [Spice Cayenne][cayenne] | Beta | `file` |
  | `duckdb` | Embedded [DuckDB][duckdb] | Stable | `memory`, `file` |
  | `postgres` | Attached [PostgreSQL][postgres] | Release Candidate | N/A |
  | `sqlite` | Embedded [SQLite][sqlite] | Release Candidate | `memory`, `file` |

website/docs/components/data-connectors/debezium.md

Lines changed: 19 additions & 1 deletion
@@ -35,6 +35,24 @@ datasets:
  mode: file # Persistence is recommended to not have to rebuild the table each time Spice starts.
  ```

+ ## Overview
+
+ Upon startup, Spice subscribes to the specified Debezium-managed Kafka topic using either a uniquely generated consumer group or a custom one specified via `kafka_consumer_group_id`. If a persistent acceleration engine is used (with `mode: file`), data is fetched starting from the last processed record, allowing Spice to resume without reprocessing all historical change events.
+
+ ## Consumer Group Management
+
+ The Debezium connector manages consumer groups to ensure data consistency across restarts. Offsets are committed to Kafka, allowing Spice to track consumption progress.
+
+ **Default behavior:** When no `kafka_consumer_group_id` is specified, Spice automatically generates a unique consumer group ID and stores it in the acceleration metadata. On subsequent restarts, Spice retrieves and reuses this stored consumer group ID to maintain offset tracking and resume consumption from where it left off.
+
+ **Custom consumer group:** If you specify a custom `kafka_consumer_group_id`, Spice stores this ID in the acceleration metadata. The same consumer group must be used on subsequent restarts. If no acceleration data exists and a custom consumer group is provided, Spice will reset its position to the oldest available offset and begin consuming from the start of the topic.
+
+ **Consumer group mismatch error:** Spice will return an error if a restart is attempted with a different consumer group than what is stored in the acceleration metadata. This applies to both auto-generated and custom consumer group IDs. This safeguard prevents data inconsistency that could occur from mixing offsets between different consumer groups.
+
+ To resolve a consumer group mismatch, either:
+ - Use the same consumer group ID as stored in the acceleration
+ - Reset the acceleration data to start fresh with a new consumer group
+
  ## Configuration

  ### `from`
@@ -79,7 +97,7 @@ The dataset name cannot be a [reserved keyword](../../reference/spicepod/keyword
  | `kafka_ssl_ca_location` | Path to the SSL/TLS CA certificate file for server verification. |
  | `kafka_enable_ssl_certificate_verification` | Enable SSL/TLS certificate verification. Default: `true`. |
  | `kafka_ssl_endpoint_identification_algorithm` | SSL/TLS endpoint identification algorithm. Default: `https`. Options: <ul><li>`none`</li><li>`https`</li></ul> |
- | `kafka_consumer_group_id` | Kafka consumer group id to use. If not set, a unique id will be generated. |
+ | `kafka_consumer_group_id` | Kafka consumer group ID to use. If not set, a unique ID will be generated automatically. The consumer group ID (whether auto-generated or custom) is stored in the acceleration metadata and must remain consistent across restarts. See [Consumer Group Management](#consumer-group-management) for details. |

  ### `metrics`

website/docs/components/data-connectors/dynamodb.md

Lines changed: 6 additions & 0 deletions
@@ -680,6 +680,12 @@ datasets:
  - name: errors_transient_total
  ```

+ :::warning[Limitations]
+
+ - DynamoDB Streams connector does not support `refresh_sql`.
+
+ :::
+
  ## Cookbooks

  - A cookbook recipe to configure DynamoDB as a data connector in Spice. [DynamoDB Data Connector](https://github.com/spiceai/cookbook/tree/trunk/dynamodb#readme)

website/docs/components/data-connectors/glue.md

Lines changed: 14 additions & 14 deletions
@@ -50,13 +50,13 @@ The dataset name cannot be a [reserved keyword](../../reference/spicepod/keyword

  The following parameters are supported for configuring the connection to the Glue Data Catalog:

- | Parameter Name | Definition |
- | -------------------- | --------------------------------------------------------------------------- |
- | `glue_region` | The AWS region for the Glue Data Catalog. E.g. `us-west-2`. |
+ | Parameter Name | Definition |
+ | -------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
+ | `glue_region` | The AWS region for the Glue Data Catalog. E.g. `us-west-2`. |
  | `glue_catalog_id` | The Glue catalog ID. For Amazon S3 Tables, use the format `<account_id>:s3tablescatalog/<table_bucket_name>`. If not provided, the default catalog for the account is used. |
- | `glue_key` | Access key (e.g. AWS_ACCESS_KEY_ID for AWS). If not provided, credentials will be loaded from environment variables or IAM roles. |
- | `glue_secret` | Secret key (e.g. AWS_SECRET_ACCESS_KEY for AWS). If not provided, credentials will be loaded from environment variables or IAM roles. |
- | `glue_session_token` | Session token (e.g. AWS_SESSION_TOKEN for AWS) for temporary credentials |
+ | `glue_key` | Access key (e.g. AWS_ACCESS_KEY_ID for AWS). If not provided, credentials will be loaded from environment variables or IAM roles. |
+ | `glue_secret` | Secret key (e.g. AWS_SECRET_ACCESS_KEY for AWS). If not provided, credentials will be loaded from environment variables or IAM roles. |
+ | `glue_session_token` | Session token (e.g. AWS_SESSION_TOKEN for AWS) for temporary credentials |

  ## Examples

@@ -180,15 +180,15 @@ The IAM role or user needs the following permissions to access Iceberg tables in

  ### Permission Details

- | Permission | Purpose |
- |------------|---------|
- | `s3:ListBucket` | Required. Allows scanning all objects from the bucket |
- | `s3:GetObject` | Required. Allows fetching objects |
- | `glue:GetCatalog` | Required. Retrieve metadata about the specified catalog. |
+ | Permission | Purpose |
+ | ------------------- | -------------------------------------------------------------- |
+ | `s3:ListBucket` | Required. Allows scanning all objects from the bucket |
+ | `s3:GetObject` | Required. Allows fetching objects |
+ | `glue:GetCatalog` | Required. Retrieve metadata about the specified catalog. |
  | `glue:GetDatabases` | Required. List the databases available in the current catalog. |
- | `glue:GetDatabase` | Required. Retrieve metadata about the specified database. |
- | `glue:GetTable` | Required. Retrieve metadata about the specified table. |
- | `glue:GetTables` | Required. List the tables available in the current database. |
+ | `glue:GetDatabase` | Required. Retrieve metadata about the specified database. |
+ | `glue:GetTable` | Required. Retrieve metadata about the specified table. |
+ | `glue:GetTables` | Required. List the tables available in the current database. |

  ## Limitations

website/docs/components/data-connectors/kafka.md

Lines changed: 15 additions & 1 deletion
@@ -35,10 +35,24 @@ datasets:

  ## Overview

- Upon startup, Spice fetches all messages for the specified topic using a uniquely generated consumer group. If a persistent acceleration engine is used (with `mode: file`), data is fetched starting from the last processed record, allowing Spice to resume without reprocessing all historical data.
+ Upon startup, Spice subscribes to the specified topic using either a uniquely generated consumer group or a custom one specified via `kafka_consumer_group_id`. If a persistent acceleration engine is used (with `mode: file`), data is fetched starting from the last processed record, allowing Spice to resume without reprocessing all historical data.

  Schema is automatically inferred from the first available topic message in JSON format. The connector creates the appropriate table schema for acceleration based on the detected data structure.

+ ## Consumer Group Management
+
+ The Kafka connector manages consumer groups to ensure data consistency across restarts. Offsets are committed to Kafka, allowing Spice to track consumption progress.
+
+ **Default behavior:** When no `kafka_consumer_group_id` is specified, Spice automatically generates a unique consumer group ID and stores it in the acceleration metadata. On subsequent restarts, Spice retrieves and reuses this stored consumer group ID to maintain offset tracking and resume consumption from where it left off.
+
+ **Custom consumer group:** If you specify a custom `kafka_consumer_group_id`, Spice stores this ID in the acceleration metadata. The same consumer group must be used on subsequent restarts. If no acceleration data exists and a custom consumer group is provided, Spice will reset its position to the oldest available offset and begin consuming from the start of the topic.
+
+ **Consumer group mismatch error:** Spice will return an error if a restart is attempted with a different consumer group than what is stored in the acceleration metadata. This applies to both auto-generated and custom consumer group IDs. This safeguard prevents data inconsistency that could occur from mixing offsets between different consumer groups.
+
+ To resolve a consumer group mismatch, either:
+ - Use the same consumer group ID as stored in the acceleration
+ - Reset the acceleration data to start fresh with a new consumer group
+
  ## Configuration

  ### `from`
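
The consumer group rules added to the Kafka and Debezium connector pages in this commit can be summarized as a small decision function. The sketch below is one reading of that prose, not Spice's implementation: `resolve_consumer_group`, `ConsumerGroupMismatch`, the `spice-` prefix, and the `"earliest"` / `"committed"` labels are hypothetical names. It also assumes that leaving `kafka_consumer_group_id` unset on restart reuses whatever ID is stored, and that a freshly generated group starts from the oldest offset; the docs state the oldest-offset behavior explicitly only for custom groups.

```python
import uuid
from typing import Optional, Tuple


class ConsumerGroupMismatch(Exception):
    """Raised when the configured group differs from the stored one."""


def resolve_consumer_group(
    configured: Optional[str], stored: Optional[str]
) -> Tuple[str, str]:
    """Return (group_id, start_position) for a connector start.

    configured: value of kafka_consumer_group_id, or None if unset.
    stored: group ID found in the acceleration metadata, or None if
        no acceleration data exists yet.
    """
    if stored is None:
        # First start: use the custom ID, or generate a unique one
        # (the "spice-" prefix is an assumption), and read from the
        # oldest available offset.
        if configured is not None:
            group_id = configured
        else:
            group_id = f"spice-{uuid.uuid4()}"
        return group_id, "earliest"

    if configured is not None and configured != stored:
        # Restarting with a different group than the stored one is an error.
        raise ConsumerGroupMismatch(
            f"configured group {configured!r} does not match stored group {stored!r}"
        )

    # Same group (or unset, by assumption): resume from committed offsets.
    return stored, "committed"
```

Under this reading, the two remedies listed in the docs correspond to the two non-error paths: passing the stored ID again resumes from committed offsets, while resetting the acceleration data (so that `stored` is `None`) starts over from the oldest offset with the new group.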
