Works with v1.0+
Create a connector instance using sample data and accelerate it using In-Memory Arrow Data Accelerator.
- Spice CLI installed (see Getting Started).
Step 1. Initialize a new Spice app.
spice init arrow-acceleration-qs
cd arrow-acceleration-qsStep 2. Configure s3 dataset: copy and paste the YAML below to spicepod.yaml in the Spice app.
version: v1
kind: Spicepod
name: arrow-acceleration-qs
datasets:
- from: s3://spiceai-demo-datasets/taxi_trips/2024/
name: taxi_trips
description: taxi trips in s3
params:
file_format: parquetStep 3. Start the Spice runtime.
spice runConfirm in the terminal output the taxi_trips dataset has been loaded:
2025/07/14 08:50:13 INFO Checking for latest Spice runtime release...
2025/07/14 08:50:15 INFO Spice.ai runtime starting...
2025-07-14T15:50:15.370061Z INFO runtime::init::caching: Initialized results cache; max size: 128.00 MiB, item ttl: 1s
2025-07-14T15:50:15.370199Z INFO runtime::init::caching: Initialized search results cache;
2025-07-14T15:50:15.732242Z INFO runtime::flight: Spice Runtime Flight listening on 127.0.0.1:50051
2025-07-14T15:50:15.732235Z INFO runtime::opentelemetry: Spice Runtime OpenTelemetry listening on 127.0.0.1:50052
2025-07-14T15:50:15.734062Z INFO runtime::init::dataset: Initializing dataset taxi_trips
2025-07-14T15:50:15.738931Z INFO runtime::http: Spice Runtime HTTP listening on 127.0.0.1:8090
2025-07-14T15:50:16.608896Z INFO runtime::init::dataset: Dataset taxi_trips registered (s3://spiceai-demo-datasets/taxi_trips/2024/), acceleration (arrow), results cache enabled.
2025-07-14T15:50:16.610030Z INFO runtime::accelerated_table::refresh_task: Loading data for dataset taxi_trips
2025-07-14T15:50:29.423673Z INFO runtime::accelerated_table::refresh_task: Loaded 2,964,624 rows (399.41 MiB) for dataset taxi_trips in 12s 813ms.
2025-07-14T15:50:29.494757Z INFO runtime: All components are loaded. Spice runtime is ready!Step 4. Run queries against the dataset using the Spice SQL REPL.
In a new terminal, start the Spice SQL REPL
spice sqlQuery the taxi_trips dataset, observing the long query time.
select "VendorID", tpep_pickup_datetime, tpep_dropoff_datetime, passenger_count from taxi_trips limit 10;+----------+----------------------+-----------------------+-----------------+
| VendorID | tpep_pickup_datetime | tpep_dropoff_datetime | passenger_count |
+----------+----------------------+-----------------------+-----------------+
| 2 | 2024-01-29T19:28:41 | 2024-01-29T19:36:46 | 2 |
| 1 | 2024-01-29T19:22:21 | 2024-01-29T19:28:45 | 2 |
| 1 | 2024-01-29T19:50:24 | 2024-01-29T20:09:21 | 2 |
| 1 | 2024-01-29T19:43:52 | 2024-01-29T20:01:40 | 2 |
| 1 | 2024-01-29T19:09:57 | 2024-01-29T19:55:36 | 2 |
| 1 | 2024-01-29T19:51:28 | 2024-01-29T20:09:16 | 2 |
| 1 | 2024-01-29T19:23:46 | 2024-01-29T19:31:06 | 2 |
| 2 | 2024-01-29T19:01:27 | 2024-01-29T19:09:07 | 2 |
| 1 | 2024-01-29T19:13:53 | 2024-01-29T19:23:09 | 2 |
| 1 | 2024-01-29T19:53:55 | 2024-01-29T20:06:56 | 2 |
+----------+----------------------+-----------------------+-----------------+
Time: 1.081530375 seconds. 10 rows.Step 5. Update the spicepod.yaml to enable In-Memory Arrow acceleration.
version: v1
kind: Spicepod
name: arrow-acceleration-qs
datasets:
- from: s3://spiceai-demo-datasets/taxi_trips/2024/
name: taxi_trips
description: taxi trips in s3
params:
file_format: parquet
acceleration:
enabled: trueStep 6. Save the changes in Spice app and observe the dataset updating and accelerating.
2024-10-22T19:28:24.204608Z INFO runtime: Updating accelerated dataset taxi_trips...
2024-10-22T19:28:25.202828Z INFO runtime::accelerated_table::refresh_task: Loading data for dataset taxi_trips
2024-10-22T19:29:07.729346Z INFO runtime::accelerated_table::refresh_task: Loaded 2,964,624 rows (398.86 MiB) for dataset taxi_trips in 42s 525ms.
2024-10-22T19:29:09.217425Z INFO runtime: Dataset taxi_trips registered (s3://spiceai-demo-datasets/taxi_trips/2024/), acceleration (arrow), results cache enabled.Step 7. Run a query against the taxi_trips dataset again, observing the fast query time.
select "VendorID", tpep_pickup_datetime, tpep_dropoff_datetime, passenger_count from taxi_trips limit 10;+----------+----------------------+-----------------------+-----------------+
| VendorID | tpep_pickup_datetime | tpep_dropoff_datetime | passenger_count |
+----------+----------------------+-----------------------+-----------------+
| 1 | 2024-01-12T19:14:53 | 2024-01-12T19:28:34 | 2 |
| 2 | 2024-01-12T19:03:12 | 2024-01-12T19:17:19 | 2 |
| 2 | 2024-01-12T19:34:22 | 2024-01-12T19:38:11 | 2 |
| 2 | 2024-01-12T19:44:51 | 2024-01-12T19:49:40 | 2 |
| 1 | 2024-01-12T19:31:54 | 2024-01-12T19:38:43 | 2 |
| 2 | 2024-01-12T19:54:37 | 2024-01-12T19:59:55 | 2 |
| 1 | 2024-01-12T19:02:32 | 2024-01-12T19:12:25 | 2 |
| 1 | 2024-01-12T19:22:38 | 2024-01-12T19:37:30 | 2 |
| 2 | 2024-01-12T19:34:30 | 2024-01-12T20:31:18 | 2 |
| 2 | 2024-01-12T19:54:06 | 2024-01-12T20:07:29 | 2 |
+----------+----------------------+-----------------------+-----------------+
Time: 0.004575584 seconds. 10 rows.-
For using
spice sql, see the CLI reference. -
See the datasets reference for additional dataset configuration options.