icon | description |
---|---|
bolt | Configure local acceleration for datasets in Spice for faster queries (test) |
Datasets can be locally accelerated by the Spice runtime, pulling data from any Data Connector and storing it locally in a Data Accelerator for faster access. The data can be kept up-to-date in real-time or on a refresh schedule, ensuring users always have the latest data locally for querying.
Dataset acceleration is enabled by setting the `acceleration` configuration. Spice currently supports In-Memory Arrow, DuckDB, SQLite, and PostgreSQL as accelerators. For engine-specific configuration, see the Data Accelerator documentation.
```yaml
datasets:
  - from: spice.ai/spiceai/quickstart/datasets/taxi_trips
    name: taxi_trips
    acceleration:
      enabled: true
      refresh_mode: full
      refresh_check_interval: 10s
```
Spice supports three modes for refreshing locally accelerated data from a connected data source. `full` is the default mode. Refer to the Data Refresh documentation for detailed refresh usage and configuration.
Mode | Description | Example |
---|---|---|
`full` | Replace/overwrite the entire dataset on each refresh | A table of users |
`append` | Append/add data to the dataset on each refresh | Append-only, immutable datasets, such as time-series or log data |
`changes` | Apply incremental changes | Customer order lifecycle table |
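To make the difference between the three modes concrete, here is a minimal conceptual sketch in Python. It is not how Spice implements refresh; the function names and the shape of the change events (`op`, `row`) are hypothetical, chosen only to illustrate what each mode does to the local copy.

```python
from typing import Dict, List

Row = Dict[str, object]


def refresh_full(local: List[Row], source: List[Row]) -> List[Row]:
    """full: discard the local copy and replace it with the source snapshot."""
    return list(source)


def refresh_append(local: List[Row], new_rows: List[Row]) -> List[Row]:
    """append: keep the existing rows and add the newly arrived ones."""
    return local + list(new_rows)


def refresh_changes(local: List[Row], changes: List[dict], key: str) -> List[Row]:
    """changes: apply insert/update/delete events, matching rows on `key`."""
    by_key = {row[key]: row for row in local}
    for change in changes:
        if change["op"] == "delete":
            by_key.pop(change["row"][key], None)
        else:  # "insert" or "update" (hypothetical event names)
            by_key[change["row"][key]] = change["row"]
    return list(by_key.values())


users = [{"id": 1, "name": "ada"}, {"id": 2, "name": "bob"}]

full_result = refresh_full(users, [{"id": 3, "name": "cy"}])
append_result = refresh_append(users, [{"id": 3, "name": "cy"}])
changes_result = refresh_changes(
    users,
    [
        {"op": "update", "row": {"id": 1, "name": "ada l."}},
        {"op": "delete", "row": {"id": 2}},
    ],
    key="id",
)
```

In this sketch, `full_result` contains only the new snapshot, `append_result` contains the original rows plus the new one, and `changes_result` contains the single surviving, updated row.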
```yaml
datasets:
  - from: databricks:my_dataset
    name: accelerated_dataset
    acceleration:
      refresh_mode: full
      refresh_check_interval: 10m
```
Database indexes are essential for optimizing query performance. Configure indexes for accelerators via the `indexes` field. For detailed configuration, refer to the index documentation.
```yaml
datasets:
  - from: spice.ai/eth.recent_blocks
    name: eth.recent_blocks
    acceleration:
      enabled: true
      engine: sqlite
      indexes:
        number: enabled # Index the `number` column
        '(hash, timestamp)': unique # Add a unique index with a multicolumn key comprised of the `hash` and `timestamp` columns
```
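As a rough illustration of what these two index types mean at the database level, the following sketch uses Python's standard `sqlite3` module directly. It assumes the configuration corresponds approximately to a plain index and a unique index in SQLite; the table name, index names, and sample rows are hypothetical, and the statements Spice actually issues may differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE recent_blocks (number INTEGER, hash TEXT, "timestamp" INTEGER)'
)

# Rough analogue of `number: enabled`: a regular, non-unique index.
conn.execute("CREATE INDEX idx_number ON recent_blocks (number)")

# Rough analogue of `'(hash, timestamp)': unique`: a multicolumn unique index.
conn.execute(
    'CREATE UNIQUE INDEX idx_hash_ts ON recent_blocks (hash, "timestamp")'
)

conn.execute("INSERT INTO recent_blocks VALUES (1, '0xabc', 100)")
# Same hash with a different timestamp is allowed by the unique index.
conn.execute("INSERT INTO recent_blocks VALUES (2, '0xabc', 101)")

# Repeating an existing (hash, timestamp) pair violates the unique index.
try:
    conn.execute("INSERT INTO recent_blocks VALUES (3, '0xabc', 100)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM recent_blocks").fetchone()[0]

# The query planner can use the index for lookups on `number`.
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM recent_blocks WHERE number = 2"
).fetchall()
uses_index = any("idx_number" in row[3] for row in plan)
```

The regular index only speeds up lookups, while the unique index additionally rejects rows that repeat an existing `(hash, timestamp)` pair.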
Constraints enforce data integrity in a database. Spice supports constraints on locally accelerated tables to ensure data quality and configure behavior for data updates that violate constraints.
Constraints are specified using column references in the Spicepod via the `primary_key` field in the acceleration configuration. Additional unique constraints are specified via the `indexes` field with the value `unique`. Data that violates these constraints results in a conflict. For constraint configuration details, see the Constraints documentation.
```yaml
datasets:
  - from: spice.ai/eth.recent_blocks
    name: eth.recent_blocks
    acceleration:
      enabled: true
      engine: sqlite
      primary_key: hash # Define a primary key on the `hash` column
      indexes:
        '(number, timestamp)': unique # Add a unique index with a multicolumn key comprised of the `number` and `timestamp` columns
```
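To show what a constraint conflict looks like in practice, here is a small sketch using Python's standard `sqlite3` module. It models the configuration above as a table with a primary key on `hash` and a unique constraint on `(number, timestamp)`, then demonstrates two generic ways a database can resolve a conflicting write: ignoring the incoming row or replacing the existing one. The table name and sample data are hypothetical, and this is an assumption about the general pattern, not a description of Spice's internal behavior.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE recent_blocks ("
    ' hash TEXT PRIMARY KEY, number INTEGER, "timestamp" INTEGER,'
    ' UNIQUE (number, "timestamp"))'
)
conn.execute("INSERT INTO recent_blocks VALUES ('0xabc', 1, 100)")

# A second row with the same primary key is a conflict.
try:
    conn.execute("INSERT INTO recent_blocks VALUES ('0xabc', 2, 200)")
    conflict_raised = False
except sqlite3.IntegrityError:
    conflict_raised = True

query = "SELECT number FROM recent_blocks WHERE hash = '0xabc'"

# Strategy 1: drop the incoming row and keep the existing one.
conn.execute("INSERT OR IGNORE INTO recent_blocks VALUES ('0xabc', 2, 200)")
after_ignore = conn.execute(query).fetchone()[0]

# Strategy 2: replace the existing row with the incoming one.
conn.execute("INSERT OR REPLACE INTO recent_blocks VALUES ('0xabc', 2, 200)")
after_replace = conn.execute(query).fetchone()[0]
```

Without a conflict strategy the write fails outright; with one, the table ends up holding either the original row or the incoming row.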