Commit 6d5a287

claudespice authored and lukekim committed
docs: Document refresh_mode: snapshot
Add documentation for the new snapshot refresh mode that creates read-only accelerations driven exclusively from the snapshot store. Also remove the Cayenne single-dataset-per-spicepod snapshot limitation, which was lifted by per-dataset metastore slices. Source: spiceai/spiceai#10651
1 parent 083fef1 commit 6d5a287

3 files changed

Lines changed: 53 additions & 10 deletions

website/docs/features/data-acceleration/data-refresh.md

Lines changed: 50 additions & 7 deletions
@@ -21,14 +21,15 @@ Acceleration data can be refreshed (updated) by:
2121

2222
## Refresh Modes
2323

24-
Spice supports four modes to refresh/update local data from a connected data source. `full` is the default mode.
24+
Spice supports five modes to refresh/update local data from a connected data source. `full` is the default mode.
2525

26-
| Mode | Description | Example |
27-
| --------- | ---------------------------------------------------- | ---------------------------------------------------------------- |
28-
| `full` | Replace/overwrite the entire dataset on each refresh | A table of users |
29-
| `append` | Append/add data to the dataset on each refresh | Append-only, immutable datasets, such as time-series or log data |
30-
| `changes` | Apply incremental changes | Customer order lifecycle table |
31-
| `caching` | Read-through caching for SQL queries | API search results or dynamic content endpoints |
26+
| Mode | Description | Example |
27+
| ---------- | ---------------------------------------------------- | ---------------------------------------------------------------- |
28+
| `full` | Replace/overwrite the entire dataset on each refresh | A table of users |
29+
| `append` | Append/add data to the dataset on each refresh | Append-only, immutable datasets, such as time-series or log data |
30+
| `changes` | Apply incremental changes | Customer order lifecycle table |
31+
| `caching` | Read-through caching for SQL queries | API search results or dynamic content endpoints |
32+
| `snapshot` | Reload exclusively from the snapshot store | Read-only replicas bootstrapped from centralized snapshots |
3233

3334
Learn more about each mode:
3435

@@ -136,6 +137,48 @@ The `caching` refresh mode is designed for HTTP-based datasets where request met
136137

137138
See [Caching Mode](./refresh-modes/caching) for detailed documentation and examples.
138139

140+
### Snapshot
141+
142+
The `snapshot` refresh mode creates a read-only acceleration that reloads exclusively from the [snapshot store](./snapshots). The federated data source is never queried for refreshes — instead, the runtime polls the snapshot store on a configurable interval and atomically swaps in newer snapshots when available.
143+
144+
```yaml
snapshots:
  enabled: true
  location: s3://my-bucket/snapshots/
  params:
    s3_auth: iam_role

datasets:
  - from: postgres:public.my_table
    name: my_table
    acceleration:
      enabled: true
      engine: duckdb
      mode: file
      refresh_mode: snapshot
      refresh_check_interval: 30s # Poll interval; defaults to 1m
      snapshots: enabled
      params:
        duckdb_file: /nvme/my_table.db
```
164+
165+
**Requirements:**
166+
167+
- `acceleration.snapshots` must be `enabled` or `bootstrap_only`
168+
- The acceleration engine must be a snapshot-capable file-based engine: **DuckDB**, **SQLite**, **Cayenne**, or **Turso**
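The two requirements above can be expressed as a small pre-flight check. This is an illustrative sketch only, not Spice's own validation: the function name `check_snapshot_mode` and the plain-dict shape of the `acceleration` block are assumptions made for the example, and the allowed values are taken directly from the list above.

```python
# Hypothetical helper: mirrors the documented requirements, not the runtime's code.
SNAPSHOT_ENGINES = {"duckdb", "sqlite", "cayenne", "turso"}
SNAPSHOT_SETTINGS = {"enabled", "bootstrap_only"}


def check_snapshot_mode(acceleration: dict) -> list[str]:
    """Return a list of problems; an empty list means the requirements are met."""
    if acceleration.get("refresh_mode") != "snapshot":
        return []  # the requirements only apply to refresh_mode: snapshot
    problems = []
    if acceleration.get("snapshots") not in SNAPSHOT_SETTINGS:
        problems.append("acceleration.snapshots must be enabled or bootstrap_only")
    if acceleration.get("engine") not in SNAPSHOT_ENGINES:
        problems.append("engine must be duckdb, sqlite, cayenne, or turso")
    return problems
```

Passing the `acceleration` mapping from a parsed spicepod would return an empty list for the DuckDB example above and two messages for an `arrow` engine with no `snapshots` setting.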
169+
170+
**Behavior:**
171+
172+
- On startup, the runtime bootstraps from the most recent snapshot (same as other snapshot-enabled modes)
173+
- After bootstrap, the runtime polls the snapshot store at `refresh_check_interval` (default: 60 seconds) for newer snapshots
174+
- When a newer snapshot is found, its schema is validated against the current acceleration schema before downloading
175+
- The accelerator file is swapped atomically — queries continue to be served from the previous snapshot until the swap completes
176+
- `INSERT INTO` statements are rejected with an error since the acceleration is driven exclusively from snapshots
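The poll-then-swap behavior described above can be sketched in a few lines. This is a simplified model under stated assumptions, not the runtime's implementation: the snapshot store is a local directory rather than object storage, snapshots are ordered by file name, the schema validation step is omitted, and `newest_snapshot` and `poll_once` are hypothetical names.

```python
import os
import shutil
import tempfile
from pathlib import Path
from typing import Optional


def newest_snapshot(store: Path) -> Optional[Path]:
    """Return the file with the greatest name in the store, or None if it is empty."""
    files = sorted(p for p in store.iterdir() if p.is_file())
    return files[-1] if files else None


def poll_once(store: Path, live_file: Path, current: Optional[str]) -> Optional[str]:
    """Check the store once and swap in a newer snapshot if one exists.

    Returns the name of the snapshot that is being served after the check.
    """
    candidate = newest_snapshot(store)
    if candidate is None or (current is not None and candidate.name <= current):
        return current  # nothing newer; keep serving the current file
    # Stage the copy next to the live file so the rename stays on one filesystem.
    fd, staged = tempfile.mkstemp(dir=live_file.parent)
    os.close(fd)
    shutil.copyfile(candidate, staged)
    # os.replace is an atomic rename: readers see the old file or the new one.
    os.replace(staged, live_file)
    return candidate.name
```

A caller would invoke `poll_once` in a loop, sleeping for the configured interval between calls. Staging the download first and renaming last is what keeps the previous file readable until the new one is complete.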
177+
178+
:::tip
179+
Use `refresh_mode: snapshot` for read-only replicas that don't need direct access to the federated source — for example, edge nodes that receive snapshots from a centralized writer.
180+
:::
181+
139182
## Ready State
140183

141184
| | |

website/docs/features/data-acceleration/snapshots.md

Lines changed: 0 additions & 1 deletion
@@ -290,6 +290,5 @@ For the full reference, see [`snapshots` in the Spicepod specification](../../re
290290
:::warning[Limitations]
291291

292292
- Only datasets are supported for snapshots. Views are not supported.
293-
- When using Cayenne accelerations, snapshots are supported only when one dataset is configured per spicepod.
294293

295294
:::

website/docs/reference/spicepod/datasets.md

Lines changed: 3 additions & 2 deletions
@@ -23,7 +23,7 @@ datasets:
2323
mode: memory # / file
2424
engine: arrow # / cayenne / duckdb / sqlite / postgres / turso
2525
refresh_check_interval: 1h
26-
refresh_mode: full / append # / changes / caching
26+
refresh_mode: full / append # / changes / caching / snapshot
2727
```
2828
2929
`spicepod.yaml`
@@ -39,7 +39,7 @@ datasets:
3939
mode: memory # / file
4040
engine: arrow # / cayenne / duckdb / sqlite / postgres / turso
4141
refresh_check_interval: 1h
42-
refresh_mode: full / append # / changes / caching
42+
refresh_mode: full / append # / changes / caching / snapshot
4343
```
4444

4545
Relative path example:
@@ -436,6 +436,7 @@ Optional. How to refresh the dataset. The following values are supported:
436436
- `append` - Append new data to the dataset. When `time_column` is specified, new records are fetched from the latest timestamp in the accelerated data at the `acceleration.refresh_check_interval`.
437437
- `changes` - Apply change data capture (CDC) events to incrementally update the dataset.
438438
- `caching` - Cache data based on request metadata (HTTP requests). Uses row-level replacement based on cache keys. See [Caching Mode](../../features/data-acceleration/refresh-modes/caching) for details.
439+
- `snapshot` - Reload exclusively from the [snapshot store](../../features/data-acceleration/snapshots). The federated source is never queried; the runtime polls for newer snapshots at `refresh_check_interval` (default: 60s). Requires `acceleration.snapshots: enabled` or `bootstrap_only` and a snapshot-capable file-based engine (DuckDB, SQLite, Cayenne, or Turso). Writes (`INSERT INTO`) are rejected. See [Snapshot Refresh Mode](../../features/data-acceleration/data-refresh#snapshot).
439440

440441
## `acceleration.refresh_check_interval`
441442
