Commit c8c1df3

Merge branch 'release/1.2.0' into lukim/parameterized-queries
2 parents de1e8b4 + 8ffdd24 commit c8c1df3

10 files changed

Lines changed: 310 additions & 44 deletions

File tree

website/docs/components/data-connectors/databricks.md

Lines changed: 11 additions & 10 deletions
Original file line numberDiff line numberDiff line change
@@ -72,12 +72,13 @@ Configure the connection to the object store when using `mode: delta_lake`. Use
7272

7373
### AWS S3
7474

75-
| Parameter Name | Description |
76-
| ---------------------------------- | ---------------------------------------------------------------------------------- |
77-
| `databricks_aws_region` | Optional. The AWS region for the S3 object store. E.g. `us-west-2`. |
78-
| `databricks_aws_access_key_id` | The access key ID for the S3 object store. |
79-
| `databricks_aws_secret_access_key` | The secret access key for the S3 object store. |
80-
| `databricks_aws_endpoint` | Optional. The endpoint for the S3 object store. E.g. `s3.us-west-2.amazonaws.com`. |
75+
| Parameter Name | Description |
76+
| ---------------------------------- | ---------------------------------------------------------------------------------------------- |
77+
| `databricks_aws_region` | Optional. The AWS region for the S3 object store. E.g. `us-west-2`. |
78+
| `databricks_aws_access_key_id` | The access key ID for the S3 object store. |
79+
| `databricks_aws_secret_access_key` | The secret access key for the S3 object store. |
80+
| `databricks_aws_endpoint` | Optional. The endpoint for the S3 object store. E.g. `s3.us-west-2.amazonaws.com`. |
81+
| `databricks_aws_allow_http` | Optional. Enables insecure HTTP connections to `databricks_aws_endpoint`. Defaults to `false`. |
8182

8283
### Azure Blob
8384

@@ -208,15 +209,15 @@ Spice integrates with multiple secret stores to help manage sensitive data secur
208209

209210
- When using `mode: spark_connect`, correlated scalar subqueries can only be used in filters, aggregations, projections, and UPDATE/MERGE/DELETE commands. [Spark Docs](https://spark.apache.org/docs/latest/sql-error-conditions-unsupported-subquery-expression-category-error-class.html#unsupported_correlated_scalar_subquery)
210211

211-
:::warning[Memory Considerations]
212+
:::warning[Memory Considerations]
212213

213-
When using the Databricks (mode: delta_lake) Data connector without acceleration, data is loaded into memory during query execution. Ensure sufficient memory is available, including overhead for queries and the runtime, especially with concurrent queries.
214+
When using the Databricks (mode: delta_lake) Data connector without acceleration, data is loaded into memory during query execution. Ensure sufficient memory is available, including overhead for queries and the runtime, especially with concurrent queries.
214215

215-
Memory limitations can be mitigated by storing acceleration data on disk, which is supported by [`duckdb`](../data-accelerators/duckdb.md) and [`sqlite`](../data-accelerators/sqlite.md) accelerators by specifying `mode: file`.
216+
Memory limitations can be mitigated by storing acceleration data on disk, which is supported by [`duckdb`](../data-accelerators/duckdb.md) and [`sqlite`](../data-accelerators/sqlite.md) accelerators by specifying `mode: file`.
216217

217218
- The Databricks Connector (`mode: spark_connect`) does not yet support streaming query results from Spark.
218219

219-
:::
220+
:::
220221

221222
## Cookbook
222223

website/docs/components/data-connectors/delta-lake.md

Lines changed: 20 additions & 6 deletions
Original file line numberDiff line numberDiff line change
@@ -63,12 +63,13 @@ Use the [secret replacement syntax](../secret-stores/index.md) to reference a se
6363

6464
### AWS S3
6565

66-
| Parameter Name | Description |
67-
| ---------------------------------- | ---------------------------------------------------------------------------------- |
68-
| `delta_lake_aws_region` | Optional. The AWS region for the S3 object store. E.g. `us-west-2`. |
69-
| `delta_lake_aws_access_key_id` | The access key ID for the S3 object store. |
70-
| `delta_lake_aws_secret_access_key` | The secret access key for the S3 object store. |
71-
| `delta_lake_aws_endpoint` | Optional. The endpoint for the S3 object store. E.g. `s3.us-west-2.amazonaws.com`. |
66+
| Parameter Name | Description |
67+
| ---------------------------------- | ---------------------------------------------------------------------------------------------- |
68+
| `delta_lake_aws_region` | Optional. The AWS region for the S3 object store. E.g. `us-west-2`. |
69+
| `delta_lake_aws_access_key_id` | The access key ID for the S3 object store. |
70+
| `delta_lake_aws_secret_access_key` | The secret access key for the S3 object store. |
71+
| `delta_lake_aws_endpoint` | Optional. The endpoint for the S3 object store. E.g. `s3.us-west-2.amazonaws.com`. |
72+
| `delta_lake_aws_allow_http` | Optional. Enables insecure HTTP connections to `delta_lake_aws_endpoint`. Defaults to `false`. |
7273

7374
### Azure Blob
7475

@@ -116,6 +117,19 @@ Use the [secret replacement syntax](../secret-stores/index.md) to reference a se
116117
delta_lake_aws_endpoint: s3.us-west-2.amazonaws.com # Optional
117118
```
118119

120+
### Delta Lake + MinIO
121+
122+
```yaml
123+
- from: delta_lake:s3://my_bucket/path/to/s3/delta/table/ # A reference to a table in MinIO
124+
name: my_delta_lake_table
125+
params:
126+
delta_lake_aws_region: us-east-1 # Best practice for MinIO
127+
delta_lake_aws_access_key_id: ${secrets:aws_access_key_id}
128+
delta_lake_aws_secret_access_key: ${secrets:aws_secret_access_key}
129+
delta_lake_aws_endpoint: http://localhost:9000 # MinIO Endpoint
130+
delta_lake_aws_allow_http: true
131+
```
132+
119133
### Delta Lake + Azure Blob
120134

121135
```yaml
Lines changed: 30 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,30 @@
1+
---
2+
title: 'Databricks Model Provider'
3+
description: 'Instructions for using Databricks Mosaic AI Models'
4+
sidebar_label: 'Databricks'
5+
sidebar_position: 8
6+
---
7+
8+
To use a language model deployed to [Databricks Mosaic AI Model Serving](https://docs.databricks.com/aws/en/machine-learning/model-serving/), specify the model endpoint name prefixed with `databricks:` in the `from` field and include the required parameters in the `params` section.
9+
10+
### Parameters
11+
12+
| Parameter | Description |
13+
| --------------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
14+
| `databricks_endpoint` | The Databricks workspace endpoint, e.g., `dbc-a12cd3e4-56f7.cloud.databricks.com`. |
15+
| `databricks_token` | The Databricks API token to authenticate with the Unity Catalog API. Use the [secret replacement syntax](../secret-stores/index.md) to reference a secret, e.g., `${secrets:my_databricks_token}`. |
16+
17+
### Example `spicepod.yaml` Configuration
18+
19+
```yaml
20+
models:
21+
- from: databricks:jeadie
22+
name: food
23+
params:
24+
databricks_endpoint: dbc-46470731-42e5.cloud.databricks.com
25+
databricks_token: ${ secrets:SPICE_DATABRICKS_TOKEN }
26+
```
27+
28+
### Additional Information
29+
30+
Refer to the [Mosaic AI Model Serving documentation](https://docs.databricks.com/aws/en/machine-learning/model-serving/) for more details on available models and configurations.

website/docs/components/models/index.md

Lines changed: 11 additions & 9 deletions
Original file line numberDiff line numberDiff line change
@@ -7,15 +7,16 @@ image: /img/og/models.png
77

88
Spice supports various model providers for traditional machine learning (ML) models and large language models (LLMs).
99

10-
| Name | Description | Status | ML Format(s) | LLM Format(s)\* |
11-
| ------------------- | -------------------------------------------- | ----------------- | ------------ | ------------------------------- |
12-
| [`openai`][openai] | OpenAI (or compatible) LLM endpoint | Release Candidate | - | OpenAI-compatible HTTP endpoint |
13-
| [`file`][file] | Local filesystem | Release Candidate | ONNX | GGUF, GGML, SafeTensor |
14-
| [`huggingface`][hf] | Models hosted on HuggingFace | Release Candidate | ONNX | GGUF, GGML, SafeTensor |
15-
| [`spice.ai`][spice] | Models hosted on the Spice.ai Cloud Platform | Alpha | ONNX | OpenAI-compatible HTTP endpoint |
16-
| [`azure`][azure] | Azure OpenAI | Alpha | - | OpenAI-compatible HTTP endpoint |
17-
| [`anthropic`][ant] | Models hosted on Anthropic | Alpha | - | OpenAI-compatible HTTP endpoint |
18-
| [`xai`][xai] | Models hosted on xAI | Alpha | - | OpenAI-compatible HTTP endpoint |
10+
| Name | Description | Status | ML Format(s) | LLM Format(s)\* |
11+
| -------------------------- | -------------------------------------------- | ----------------- | ------------ | ------------------------------- |
12+
| [`openai`][openai] | OpenAI (or compatible) LLM endpoint | Release Candidate | - | OpenAI-compatible HTTP endpoint |
13+
| [`file`][file] | Local filesystem | Release Candidate | ONNX | GGUF, GGML, SafeTensor |
14+
| [`huggingface`][hf] | Models hosted on HuggingFace | Release Candidate | ONNX | GGUF, GGML, SafeTensor |
15+
| [`spice.ai`][spice] | Models hosted on the Spice.ai Cloud Platform | Alpha | ONNX | OpenAI-compatible HTTP endpoint |
16+
| [`azure`][azure] | Azure OpenAI | Alpha | - | OpenAI-compatible HTTP endpoint |
17+
| [`anthropic`][ant] | Models hosted on Anthropic | Alpha | - | OpenAI-compatible HTTP endpoint |
18+
| [`xai`][xai] | Models hosted on xAI | Alpha | - | OpenAI-compatible HTTP endpoint |
19+
| [`databricks`][databricks] | Models deployed to Databricks Mosaic AI | Alpha | - | OpenAI-compatible HTTP endpoint |
1920

2021
[file]: /components/embeddings/local.md
2122
[hf]: ./huggingface.md
@@ -24,6 +25,7 @@ Spice supports various model providers for traditional machine learning (ML) mod
2425
[azure]: ./azure.md
2526
[ant]: ./anthropic.md
2627
[xai]: ./xai.md
28+
[databricks]: ./databricks.md
2729

2830
Spice also tests and evaluates common models and grades their ability to integrate with Spice. See the [Models Grade Report](/docs/reference/models.md).
2931

Lines changed: 62 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,62 @@
1+
---
2+
title: 'Workers Overview'
3+
description: 'Detailed documentation for workers in the Spice runtime.'
4+
sidebar_label: 'Workers Overview'
5+
sidebar_position: 8
6+
---
7+
8+
Workers in the Spice runtime represent configurable units of compute that help coordinate and manage interactions between models and tools. Each worker is defined as a component in the `spicepod.yaml` file, specifying its behavior and interaction logic.
9+
10+
## Configuration
11+
12+
Workers are configured in the `workers` section of the `spicepod.yaml` file. Each worker definition includes a name, description, and a list of models or tools it encapsulates.
13+
14+
**Example `spicepod.yaml` configuration:**
15+
16+
```yaml
17+
workers:
18+
- name: round-robin
19+
description: |
20+
Distributes requests between 'foo' and 'bar' models in a round-robin fashion.
21+
models:
22+
- from: foo
23+
- from: bar
24+
- name: fallback
25+
description: |
26+
Attempts 'bar' first, then 'foo', then 'baz' if previous models fail.
27+
models:
28+
- from: foo
29+
order: 2
30+
- from: bar
31+
order: 1
32+
- from: baz
33+
order: 3
34+
```
35+
36+
## Use-Cases
37+
38+
Workers currently help implement:
39+
40+
- Model fallback and error handling
41+
- Load balancing across multiple models
42+
43+
## Usage
44+
45+
Workers can be invoked using the same API endpoints as individual models. For example, to call a worker named `fallback` using the OpenAI-compatible HTTP API:
46+
47+
```bash
48+
curl http://localhost:8090/v1/chat/completions \
49+
-H "Content-Type: application/json" \
50+
-d '{
51+
"model": "fallback",
52+
"messages": [{ "role": "user", "content": "Tell me a joke"}]
53+
}'
54+
```
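The same request can be sketched in Python using only the standard library. This is a minimal illustration, assuming the runtime is listening on `http://localhost:8090` as in the `curl` example; the helper names `build_chat_request` and `chat` are hypothetical and not part of any Spice client library.

```python
import json
import urllib.request


def build_chat_request(base_url: str, worker: str, prompt: str) -> urllib.request.Request:
    """Build a POST request for the OpenAI-compatible chat completions endpoint."""
    body = json.dumps(
        {
            "model": worker,  # the worker name is passed where a model name would go
            "messages": [{"role": "user", "content": prompt}],
        }
    ).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/v1/chat/completions",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def chat(base_url: str, worker: str, prompt: str) -> dict:
    """Send the request and return the parsed JSON response as-is."""
    with urllib.request.urlopen(build_chat_request(base_url, worker, prompt)) as resp:
        return json.load(resp)


if __name__ == "__main__":
    # Requires a running runtime with a worker named 'fallback' (assumption).
    print(chat("http://localhost:8090", "fallback", "Tell me a joke"))
```

Because the worker is addressed through the `model` field, switching between a single model and a worker requires changing only that one string.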
55+
56+
## Roadmap
57+
58+
The vision for workers includes support for dynamic serverless compute, enabling execution of user-defined functions within the Spice runtime. This direction aims to help developers define custom logic and orchestration patterns directly in the worker configuration, supporting more advanced workflows and automation. Further details and implementation timelines will be provided in future updates. For ongoing progress, refer to the project repository and documentation.
59+
60+
## Further Reading
61+
62+
For a complete specification of worker configuration, routing rules, and available options, refer to the [Spicepod Workers Reference](/docs/reference/spicepod/workers.md).

website/docs/reference/memory.md

Lines changed: 7 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -35,9 +35,16 @@ Refresh modes affect memory usage as follows:
3535

3636
Spice.ai uses DataFusion as its query execution engine. By default, DataFusion does not enforce strict memory limits, which can lead to unbounded usage. Spice.ai addresses this through:
3737

38+
- **Memory Limit**: The `runtime.memory_limit` parameter defines the maximum memory available for query execution. Once the memory limit is reached, supported query operations spill data to disk, helping prevent out-of-memory errors and maintain query stability. See [Spicepod Configuration](spicepod/index.md#memory-limit) for details.
3839
- **Memory Budgeting**: Limits memory per query execution. Queries exceeding the limit return an error. See [Spicepod Configuration](spicepod/index.md) for details.
3940
- **Spill-to-Disk**: Operators such as Sort, Join, and GroupByHash spill intermediate results to disk when memory limits are exceeded, preventing out-of-memory errors.
4041

42+
DataFusion supports spilling for several operators, but not all operations are currently supported. Notably, the following operations do not support spilling:
43+
44+
- HashJoin ([tracking issue](https://github.com/apache/arrow-datafusion/issues/1047))
45+
- ExternalSorterMerge (no current tracking issue; previously discussed in the context of SortMergeJoin)
46+
- RepartitionMerge (spilling is suggested to be supported, but may depend on HashJoin support; see [issue](https://github.com/apache/arrow-datafusion/issues/1047))
47+
4148
## Embedded Data Accelerators
4249

4350
Spice.ai integrates with embedded accelerators like [SQLite](/docs/components/data-accelerators/sqlite.md) and [DuckDB](/docs/components/data-accelerators/duckdb.md), each with unique memory considerations:

website/docs/reference/spicepod/index.md

Lines changed: 28 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -208,3 +208,31 @@ views:
208208
ORDER BY count DESC
209209
LIMIT 5
210210
```
211+
212+
## `workers`
213+
214+
A Spicepod can contain one or more [workers](./workers.md) defining configurable units of compute.
215+
216+
**Example**
217+
218+
```yaml
219+
workers:
220+
- name: round-robin
221+
description: |
222+
Distributes requests between 'foo' and 'bar' models in a round-robin fashion.
223+
models:
224+
- from: foo
225+
- from: bar
226+
- name: fallback
227+
description: |
228+
Attempts 'bar' first, then 'foo', then 'baz' if previous models fail.
229+
models:
230+
- from: foo
231+
order: 2
232+
- from: bar
233+
order: 1
234+
- from: baz
235+
order: 3
236+
```
237+
238+
For a complete specification of worker configuration, see the [Workers Reference](/docs/reference/spicepod/workers.md).

website/docs/reference/spicepod/runtime.md

Lines changed: 13 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -169,6 +169,19 @@ runtime:
169169

170170
This configuration permits requests only from the `https://example.com` origin.
171171

172+
## `runtime.memory_limit`
173+
174+
The `memory_limit` parameter sets a memory usage cap for the Spice runtime query engine. This limit applies **only** to the query engine and should be used in addition to other memory configuration options, such as `duckdb_memory_limit`. When `memory_limit` is specified, the value of `runtime.temp_directory` determines the directory DataFusion uses for spilling intermediate data to disk.
175+
176+
```yaml
177+
runtime:
178+
memory_limit: 4GiB
179+
```
180+
181+
Specify the value as a size, for example `4GiB` or `1024MiB`.
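When comparing `memory_limit` against other budgets (for example, a container memory limit), it can help to convert the size string to bytes. The sketch below is illustrative only: it assumes binary (IEC) units and handles just the `KiB`/`MiB`/`GiB`/`TiB` suffixes. It is not the parser used by the Spice runtime, and the full set of suffixes the runtime accepts is not covered here.

```python
import re

# Binary (IEC) multipliers; an assumption based on the `GiB`/`MiB` examples above.
_UNITS = {"KiB": 1024, "MiB": 1024**2, "GiB": 1024**3, "TiB": 1024**4}


def size_to_bytes(value: str) -> int:
    """Convert a size string such as '4GiB' or '1024MiB' to a number of bytes."""
    match = re.fullmatch(r"\s*(\d+)\s*(KiB|MiB|GiB|TiB)\s*", value)
    if match is None:
        raise ValueError(f"unrecognized size: {value!r}")
    return int(match.group(1)) * _UNITS[match.group(2)]


if __name__ == "__main__":
    print(size_to_bytes("4GiB"))     # 4294967296
    print(size_to_bytes("1024MiB"))  # 1073741824
```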
182+
183+
For detailed memory information, see [Memory](/docs/reference/memory.md).
184+
172185
## `runtime.temp_directory`
173186

174187
The path to a temporary directory that Spice uses for query and acceleration operations that spill to disk. For more details, see the [Managing Memory Usage documentation](../memory.md) and the [DuckDB Data Accelerator documentation](../../components/data-accelerators/duckdb.md).
Lines changed: 96 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,96 @@
1+
---
2+
title: 'Workers'
3+
sidebar_label: 'Workers'
4+
description: 'Workers YAML reference'
5+
---
6+
7+
Workers in the Spice runtime represent configurable units of compute that help coordinate and manage interactions between models and tools. Currently, workers define how one or more [LLMs](../models.md) can be combined into a single logical model.
8+
9+
## `workers`
10+
11+
The `workers` section in your configuration specifies one or more workers.
12+
13+
Example:
14+
15+
```yaml
16+
workers:
17+
- name: round-robin
18+
description: |
19+
Distributes requests between 'foo' and 'bar' models in a round-robin fashion.
20+
models:
21+
- from: foo
22+
- from: bar
23+
- name: fallback
24+
description: |
25+
Attempts 'bar' first, then 'foo', then 'baz' if previous models fail.
26+
models:
27+
- from: foo
28+
order: 2
29+
- from: bar
30+
order: 1
31+
- from: baz
32+
order: 3
33+
- name: weighted
34+
description: |
35+
Routes 80% of traffic to 'foo'.
36+
models:
37+
- from: foo
38+
order: 4
39+
- from: bar
40+
order: 1
41+
```
42+
43+
### `name`
44+
45+
A unique identifier for this worker component.
46+
47+
### `description`
48+
49+
Additional details about the worker, useful for display to users and for providing context to LLMs.
50+
51+
### `models` {#models}
52+
53+
A list of model configurations that define how the model worker behaves.
54+
55+
The structure of the list elements uniquely determines the model worker algorithm. List elements should be of a consistent type.
56+
57+
| Key name | Key type | Description |
58+
| -------- | ----------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------- |
59+
| from | String | The `model.name` of a defined `model` spicepod component. |
60+
| order    | Integer, positive | The priority of the model. The model with the lowest value is used first, followed by the others in increasing order. The ordering of models with equal `order` is undefined. |
61+
62+
#### Worker with round-robin routing across models
63+
64+
Example
65+
66+
```yaml
67+
workers:
68+
- name: round-robin
69+
description: |
70+
Call models 'foo' & 'bar' in round robin.
71+
models:
72+
- from: foo
73+
- from: bar
74+
```
75+
76+
The worker selects each model in turn for subsequent requests.
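The selection behavior can be illustrated with a small Python sketch. It is a conceptual model only, not the runtime's implementation; the `RoundRobin` class and its `next_model` method are hypothetical names.

```python
import itertools
import threading


class RoundRobin:
    """Hand out model names in a repeating cycle, one per request."""

    def __init__(self, model_names):
        names = list(model_names)
        if not names:
            raise ValueError("at least one model is required")
        self._cycle = itertools.cycle(names)
        # Guard the shared iterator so concurrent requests each get one name.
        self._lock = threading.Lock()

    def next_model(self) -> str:
        with self._lock:
            return next(self._cycle)


if __name__ == "__main__":
    router = RoundRobin(["foo", "bar"])
    print([router.next_model() for _ in range(4)])  # ['foo', 'bar', 'foo', 'bar']
```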
77+
78+
#### Worker with fallback model routing
79+
80+
Example
81+
82+
```yaml
83+
workers:
84+
- name: fallback
85+
description: |
86+
Call 'bar'. On error, call 'foo'. Failing that 'baz'.
87+
models:
88+
- from: foo
89+
order: 2
90+
- from: bar
91+
order: 1
92+
- from: baz
93+
order: 3
94+
```
95+
96+
The worker uses the models in increasing order, returning the first result that is not an error.
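As a conceptual sketch, fallback routing amounts to sorting the models by `order` and returning the first successful result. The Python below illustrates that idea under stated assumptions: `call_with_fallback` and the `(order, name, callable)` tuples are hypothetical, and the callables stand in for real model invocations. It does not reflect how the runtime is implemented.

```python
from typing import Callable, Sequence, Tuple

# Hypothetical stand-in for a model call: takes a prompt, returns text or raises.
ModelFn = Callable[[str], str]


def call_with_fallback(
    models: Sequence[Tuple[int, str, ModelFn]], prompt: str
) -> Tuple[str, str]:
    """Try models in increasing `order`; return (model_name, result) of the first success."""
    errors = []
    for _order, name, fn in sorted(models, key=lambda m: m[0]):
        try:
            return name, fn(prompt)
        except Exception as exc:  # any error triggers the next model
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all models failed: " + "; ".join(errors))


if __name__ == "__main__":
    def unavailable(prompt: str) -> str:
        raise TimeoutError("model unavailable")

    def echo(prompt: str) -> str:
        return f"echo: {prompt}"

    # Mirrors the YAML above: 'bar' (order 1) fails, so 'foo' (order 2) answers.
    models = [(2, "foo", echo), (1, "bar", unavailable), (3, "baz", echo)]
    print(call_with_fallback(models, "Tell me a joke"))
```

If every model fails, the sketch raises a single error that lists each failure, which is one reasonable way to surface the full fallback chain to the caller.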
