Commit 0283a3c

phillipleblanc, lukekim, and peasee authored
v1.9.0 docs (#1237)
* Results cache docs for zstd compression (#1234)
* Results cache docs for zstd compression
* Add limitation
* docs: Fix reference to stale while revalidate ttl param

---------

Co-authored-by: peasee <98815791+peasee@users.noreply.github.com>
Co-authored-by: Phillip LeBlanc <phillip@spice.ai>

* Add docs for distributed query (#1236)
* Add docs page for distributed query
* Clarify the spicepod requirements
* tweaks
* Update website/docs/features/distributed-query/index.md

Co-authored-by: Luke Kim <80174+lukekim@users.noreply.github.com>

---------

Co-authored-by: Luke Kim <80174+lukekim@users.noreply.github.com>

---------

Co-authored-by: Luke Kim <80174+lukekim@users.noreply.github.com>
Co-authored-by: peasee <98815791+peasee@users.noreply.github.com>
1 parent 6692db6 commit 0283a3c

12 files changed

Lines changed: 177 additions & 47 deletions

website/docs/features/caching/index.md

Lines changed: 96 additions & 36 deletions
Large diffs are not rendered by default.

website/docs/features/cdc/index.md

Lines changed: 1 addition & 1 deletion
@@ -2,7 +2,7 @@
 title: 'Change Data Capture (CDC)'
 sidebar_label: 'Change Data Capture'
 description: 'Learn how to use Change Data Capture (CDC) in Spice.'
-sidebar_position: 4
+sidebar_position: 5
 pagination_prev: null
 pagination_next: null
 ---

website/docs/features/data-ingestion/index.md

Lines changed: 1 addition & 1 deletion
@@ -2,7 +2,7 @@
 title: 'Data Ingestion'
 sidebar_label: 'Data Ingestion'
 description: 'Learn how to ingest data in Spice.'
-sidebar_position: 5
+sidebar_position: 6
 pagination_prev: null
 pagination_next: null
 tags:

website/docs/features/distributed-query/index.md

Lines changed: 65 additions & 0 deletions
@@ -0,0 +1,65 @@
---
title: 'Distributed Query'
sidebar_label: 'Distributed Query'
description: 'Learn how to run Spice in distributed mode for larger-scale queries.'
sidebar_position: 4
pagination_prev: null
pagination_next: null
---

Learn how to configure and run Spice in distributed mode to handle larger-scale queries across multiple nodes.

:::info Preview
Multi-node distributed query execution based on Apache Ballista is available as a preview feature in Spice `v1.9.0`.
:::

## Overview

Spice integrates [Apache Ballista](https://github.com/apache/datafusion-ballista) to schedule and coordinate distributed queries across multiple executor nodes. This integration enables distributed execution when running large queries over partitioned data lake formats such as Parquet, Delta Lake, or Iceberg.

## Architecture

A distributed Spice cluster consists of two components:

- **Scheduler** – Plans distributed queries and manages the work queue for the executor fleet. Each cluster runs a single scheduler instance.
- **Executors** – One or more nodes responsible for executing physical query plans.

The scheduler holds the cluster-wide configuration for a Spicepod, while executors connect to the scheduler to receive work.

## Getting Started

Cluster deployment typically starts with a scheduler instance, followed by one or more executors that register with the scheduler.

### Start the Scheduler

The scheduler is the only `spiced` process that needs its own configuration, that is, a `spicepod.yaml` in its current working directory. Override the Flight bind address when the scheduler must be reachable from outside `localhost`:

```bash
# Start scheduler
spiced --cluster-mode scheduler --flight 0.0.0.0:50051
```

### Start Executors

Executors need the scheduler's Flight URI to register and pull work. Executors do not require a `spicepod.yaml`; each one fetches its configuration from the scheduler. Each executor automatically selects a free port if the default is unavailable:

```bash
# Start executor
spiced --cluster-mode executor --scheduler-url spiced://localhost:50051
```
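
The two steps above can be combined into one local launch script. The following is a minimal sketch, not an official tool: the variable names (`FLIGHT_ADDR`, `SCHEDULER_URL`, `EXECUTOR_COUNT`, `DRY_RUN`) and the helper functions are hypothetical, and only the `spiced` flags are taken from the commands shown above. By default it only prints the commands it would run.

```bash
#!/usr/bin/env bash
# Sketch: print (or, with DRY_RUN=0, launch) one scheduler and N executors.
# Variable and function names are illustrative assumptions; only the spiced
# flags come from the commands documented above.
set -eu

FLIGHT_ADDR="${FLIGHT_ADDR:-0.0.0.0:50051}"
SCHEDULER_URL="${SCHEDULER_URL:-spiced://localhost:50051}"
EXECUTOR_COUNT="${EXECUTOR_COUNT:-2}"
DRY_RUN="${DRY_RUN:-1}"

scheduler_cmd() {
  echo "spiced --cluster-mode scheduler --flight ${FLIGHT_ADDR}"
}

executor_cmd() {
  echo "spiced --cluster-mode executor --scheduler-url ${SCHEDULER_URL}"
}

# Print the command in dry-run mode; otherwise run it in the background.
launch() {
  if [ "${DRY_RUN}" = "1" ]; then
    echo "$1"
  else
    # Intentionally unquoted so the command string splits into arguments.
    $1 &
  fi
}

# The scheduler should be started from the directory holding spicepod.yaml.
launch "$(scheduler_cmd)"

i=0
while [ "$i" -lt "${EXECUTOR_COUNT}" ]; do
  launch "$(executor_cmd)"
  i=$((i + 1))
done

# When processes were actually started, block until they exit.
if [ "${DRY_RUN}" != "1" ]; then
  wait
fi
```

Running the script with `DRY_RUN=0` starts the processes for real. The sketch does not wait for the scheduler to accept connections before starting executors, so a production script would likely add a readiness check between the two steps.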

## Query Execution

Queries run against the scheduler endpoint. To confirm that distributed planning is active, inspect the `EXPLAIN` output, which includes a `distributed_plan` section showing how stages are split across executors:

```sql
EXPLAIN SELECT count(id) FROM my_dataset;
```
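
The check can also be scripted. The sketch below assumes the `EXPLAIN` output has already been captured as text by whatever SQL client is in use; `has_distributed_plan` and `explain.txt` are hypothetical names, not part of Spice.

```bash
#!/usr/bin/env bash
# Sketch: succeed when EXPLAIN output read from stdin mentions the
# distributed_plan section. has_distributed_plan is a hypothetical helper
# name, not a Spice command.
has_distributed_plan() {
  grep -q 'distributed_plan'
}

# Usage, assuming the EXPLAIN output was saved to explain.txt:
#   has_distributed_plan < explain.txt && echo "distributed planning is active"
```

This only looks for the section name; it does not validate how stages are split across executors.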

:::warning[Limitations]

- Accelerated datasets are not yet supported; distributed query currently targets partitioned data lake sources.
- As a preview feature, clusters may encounter stability or performance issues.
- Accelerator support is planned for future releases; follow the release notes for updates.

:::

website/docs/features/embeddings/index.md

Lines changed: 1 addition & 1 deletion
@@ -2,7 +2,7 @@
 title: 'Embedding Datasets'
 sidebar_label: 'Embedding Datasets'
 description: 'Learn how to define, or augment existing datasets with embedding column(s).'
-sidebar_position: 7
+sidebar_position: 9
 pagination_prev: null
 pagination_next: null
 ---

website/docs/features/large-language-models/index.md

Lines changed: 1 addition & 1 deletion
@@ -2,7 +2,7 @@
 title: 'Large Language Models'
 sidebar_label: 'Large Language Models'
 description: 'Learn how to configure large language models (LLMs)'
-sidebar_position: 5
+sidebar_position: 7
 pagination_prev: null
 pagination_next: null
 tags:

website/docs/features/machine-learning-models/index.md

Lines changed: 1 addition & 1 deletion
@@ -1,7 +1,7 @@
 ---
 title: 'Machine Learning Models'
 sidebar_label: 'Machine Learning Models'
-sidebar_position: 6
+sidebar_position: 8
 pagination_prev: null
 pagination_next: null
 tags:

website/docs/features/observability/index.md

Lines changed: 1 addition & 1 deletion
@@ -2,7 +2,7 @@
 title: 'Observability & Monitoring'
 sidebar_label: 'Observability'
 description: 'Learn how to use Spice telemetry.'
-sidebar_position: 10
+sidebar_position: 12
 pagination_prev: null
 pagination_next: null
 ---

website/docs/features/search/index.md

Lines changed: 1 addition & 1 deletion
@@ -2,7 +2,7 @@
 title: 'Search Functionality'
 sidebar_label: 'Search'
 description: 'Learn how Spice can search across datasets using database-native and vector-search methods.'
-sidebar_position: 8
+sidebar_position: 10
 pagination_prev: null
 pagination_next: null
 tags:

website/docs/features/semantic-model/index.md

Lines changed: 1 addition & 1 deletion
@@ -2,7 +2,7 @@
 title: 'Semantic Model'
 sidebar_label: 'Semantic Model'
 description: 'Learn how to define and use semantic data models with Spice.'
-sidebar_position: 9
+sidebar_position: 11
 pagination_prev: null
 pagination_next: null
 ---
