Commit ef08ed1

docs: Document acceleration.storage_profile config
Add a new reference section for `acceleration.storage_profile` covering the `auto` / `local_ssd` / `ebs` / `tmpfs` values and their aliases. This field tunes connection pool sizing, checkpoint thresholds, and per-file size defaults for file-mode accelerators (DuckDB, SQLite, Turso, and Spice Cayenne) based on the underlying storage medium. Source: spiceai/spiceai#10913
1 parent 10d5d9f commit ef08ed1

1 file changed

Lines changed: 25 additions & 0 deletions

website/docs/reference/spicepod/datasets.md

@@ -365,6 +365,31 @@ Optional. The mode of acceleration. The following values are supported:
- `file_create` - Always create a new acceleration file on startup, removing any existing file. When [snapshots](../../features/data-acceleration/snapshots) are enabled, the existing file is snapshotted before deletion. Supported for Spice Cayenne (`cayenne`), `duckdb`, `sqlite`, and `turso` acceleration engines.
- `file_update` - Open an existing acceleration file if it exists, then check schema compatibility on refresh. If the source schema change is additive (new columns only), the existing file is kept. If the schema change is incompatible (columns removed, renamed, or type changed), the file is snapshotted (if [snapshots](../../features/data-acceleration/snapshots) are enabled) and recreated from scratch. Supported for Spice Cayenne (`cayenne`), `duckdb`, `sqlite`, and `turso` acceleration engines.

## `acceleration.storage_profile`

Optional. The storage profile for file-backed acceleration. The runtime uses this hint to tune connection-pool sizing, checkpoint thresholds, and file-size defaults for the underlying storage medium. Applies only to file-mode accelerators: Spice Cayenne (`cayenne`), `duckdb`, `sqlite`, and `turso`. Memory-mode accelerators ignore this setting.

Supported values:

- `auto` (default) – Detect the storage profile from the resolved acceleration file path. On Linux, detection reads `/proc/self/mountinfo` and inspects block-device metadata to recognize Amazon EBS, Azure Managed Disks, Amazon EC2 NVMe instance storage, `tmpfs`/`ramfs`, and generic NVMe/SSD. On other platforms, detection reports the profile as unknown and the engine defaults apply.
- `local_ssd` (aliases: `ssd`, `nvme`) – Treat the acceleration file location as local SSD/NVMe (for example, EC2 instance store or Azure temporary/NVMe local storage). Uses the engine defaults for connection-pool size and checkpoint thresholds.
- `ebs` (aliases: `azure_disk`, `managed_disk`, `network_disk`) – Treat the acceleration file location as network-attached block storage (for example, Amazon EBS or Azure Managed Disks). Reduces connection-pool size and raises DuckDB's checkpoint threshold so per-I/O latency is amortized across larger flushes. Spice Cayenne uses smaller per-file targets to reduce write amplification.
- `tmpfs` (aliases: `ram`, `ramdisk`, `ramfs`, `memory`) – Treat the acceleration file location as RAM-backed storage. Raises DuckDB's checkpoint threshold so steady-state workloads don't pay checkpoint cost on small amounts of dirty data. Spice Cayenne uses larger per-file targets to improve scan throughput.

Example:

```yaml
datasets:
  - from: s3://bucket/data/
    name: analytics
    acceleration:
      engine: duckdb
      mode: file
      storage_profile: ebs
      params:
        duckdb_file: /mnt/ebs/analytics.db
```

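As a further sketch of the values listed above, the fragment below sets `auto` explicitly on one dataset and uses the `nvme` alias for `local_ssd` on another. The dataset names and file paths are hypothetical and chosen only for illustration. Because `auto` is the default, the first dataset should behave the same if `storage_profile` is omitted.

```yaml
datasets:
  # Hypothetical dataset: let the runtime detect the profile from the file path.
  # `auto` is the default, so this line could be omitted.
  - from: s3://bucket/data/
    name: analytics_auto
    acceleration:
      engine: duckdb
      mode: file
      storage_profile: auto
      params:
        duckdb_file: /mnt/data/analytics_auto.db

  # Hypothetical dataset on local NVMe: `nvme` is an alias for `local_ssd`.
  - from: s3://bucket/data/
    name: analytics_nvme
    acceleration:
      engine: duckdb
      mode: file
      storage_profile: nvme
      params:
        duckdb_file: /nvme/analytics_nvme.db
```

Setting the profile explicitly is likely most useful where detection cannot identify the medium, for example on non-Linux platforms, where `auto` falls back to the engine defaults.
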
## `acceleration.snapshots`

Optional. Controls how this dataset participates in managed acceleration snapshots. Requires the Spicepod to configure the top-level [`snapshots` block](.#snapshots), the acceleration engine to be `duckdb`, `sqlite`, `cayenne`, or `turso`, and `mode: file` with a dataset-specific file path (for example `acceleration.params.duckdb_file: /nvme/my_dataset.db`).
