Merge branch 'release/1.2.0' into lukim/parameterized-queries

lukekim · web-flow · commit c8c1df333f34 · 2025-04-28T16:11:29.000-07:00
diff --git a/website/docs/components/data-connectors/databricks.md b/website/docs/components/data-connectors/databricks.md
@@ -72,12 +72,13 @@ Configure the connection to the object store when using `mode: delta_lake`. Use
 
 ### AWS S3
 
-| Parameter Name                     | Description                                                                        |
-| ---------------------------------- | ---------------------------------------------------------------------------------- |
-| `databricks_aws_region`            | Optional. The AWS region for the S3 object store. E.g. `us-west-2`.                |
-| `databricks_aws_access_key_id`     | The access key ID for the S3 object store.                                         |
-| `databricks_aws_secret_access_key` | The secret access key for the S3 object store.                                     |
-| `databricks_aws_endpoint`          | Optional. The endpoint for the S3 object store. E.g. `s3.us-west-2.amazonaws.com`. |
+| Parameter Name                     | Description                                                                                    |
+| ---------------------------------- | ---------------------------------------------------------------------------------------------- |
+| `databricks_aws_region`            | Optional. The AWS region for the S3 object store. E.g. `us-west-2`.                            |
+| `databricks_aws_access_key_id`     | The access key ID for the S3 object store.                                                     |
+| `databricks_aws_secret_access_key` | The secret access key for the S3 object store.                                                 |
+| `databricks_aws_endpoint`          | Optional. The endpoint for the S3 object store. E.g. `s3.us-west-2.amazonaws.com`.             |
+| `databricks_aws_allow_http`        | Optional. Enables insecure HTTP connections to `databricks_aws_endpoint`. Defaults to `false`. |
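+
+As a minimal sketch, mirroring the Delta Lake + MinIO example in the Delta Lake connector docs, the parameters above might be combined as follows for an S3-compatible store served over plain HTTP. The table reference, dataset name, workspace endpoint, and secret names are placeholders, and `databricks_endpoint` and `databricks_token` are assumed to be set as described elsewhere on this page.
+
+```yaml
+datasets:
+  - from: databricks:my_catalog.my_schema.my_table # Placeholder table reference
+    name: my_delta_lake_table
+    params:
+      mode: delta_lake
+      databricks_endpoint: dbc-a12cd3e4-56f7.cloud.databricks.com # Placeholder workspace
+      databricks_token: ${secrets:my_databricks_token}
+      databricks_aws_region: us-east-1
+      databricks_aws_access_key_id: ${secrets:aws_access_key_id}
+      databricks_aws_secret_access_key: ${secrets:aws_secret_access_key}
+      databricks_aws_endpoint: http://localhost:9000 # Placeholder S3-compatible endpoint
+      databricks_aws_allow_http: true # Needed because the endpoint above uses http://
+```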
 
 ### Azure Blob
 
@@ -208,15 +209,15 @@ Spice integrates with multiple secret stores to help manage sensitive data secur
 
 - When using `mode: spark_connect`, correlated scalar subqueries can only be used in filters, aggregations, projections, and UPDATE/MERGE/DELETE commands. [Spark Docs](https://spark.apache.org/docs/latest/sql-error-conditions-unsupported-subquery-expression-category-error-class.html#unsupported_correlated_scalar_subquery)
 
- :::warning[Memory Considerations]
+:::warning[Memory Considerations]
 
- When using the Databricks (mode: delta_lake) Data connector without acceleration, data is loaded into memory during query execution. Ensure sufficient memory is available, including overhead for queries and the runtime, especially with concurrent queries.
+When using the Databricks (mode: delta_lake) Data connector without acceleration, data is loaded into memory during query execution. Ensure sufficient memory is available, including overhead for queries and the runtime, especially with concurrent queries.
 
- Memory limitations can be mitigated by storing acceleration data on disk, which is supported by [`duckdb`](../data-accelerators/duckdb.md) and [`sqlite`](../data-accelerators/sqlite.md) accelerators by specifying `mode: file`.
+Memory limitations can be mitigated by storing acceleration data on disk, which is supported by [`duckdb`](../data-accelerators/duckdb.md) and [`sqlite`](../data-accelerators/sqlite.md) accelerators by specifying `mode: file`.
 
 - The Databricks Connector (`mode: spark_connect`) does not yet support streaming query results from Spark.
 
- :::
+:::
 
 ## Cookbook
 
diff --git a/website/docs/components/data-connectors/delta-lake.md b/website/docs/components/data-connectors/delta-lake.md
@@ -63,12 +63,13 @@ Use the [secret replacement syntax](../secret-stores/index.md) to reference a se
 
 ### AWS S3
 
-| Parameter Name                     | Description                                                                        |
-| ---------------------------------- | ---------------------------------------------------------------------------------- |
-| `delta_lake_aws_region`            | Optional. The AWS region for the S3 object store. E.g. `us-west-2`.                |
-| `delta_lake_aws_access_key_id`     | The access key ID for the S3 object store.                                         |
-| `delta_lake_aws_secret_access_key` | The secret access key for the S3 object store.                                     |
-| `delta_lake_aws_endpoint`          | Optional. The endpoint for the S3 object store. E.g. `s3.us-west-2.amazonaws.com`. |
+| Parameter Name                     | Description                                                                                    |
+| ---------------------------------- | ---------------------------------------------------------------------------------------------- |
+| `delta_lake_aws_region`            | Optional. The AWS region for the S3 object store. E.g. `us-west-2`.                            |
+| `delta_lake_aws_access_key_id`     | The access key ID for the S3 object store.                                                     |
+| `delta_lake_aws_secret_access_key` | The secret access key for the S3 object store.                                                 |
+| `delta_lake_aws_endpoint`          | Optional. The endpoint for the S3 object store. E.g. `s3.us-west-2.amazonaws.com`.             |
+| `delta_lake_aws_allow_http`        | Optional. Enables insecure HTTP connections to `delta_lake_aws_endpoint`. Defaults to `false`. |
 
 ### Azure Blob
 
@@ -116,6 +117,19 @@ Use the [secret replacement syntax](../secret-stores/index.md) to reference a se
     delta_lake_aws_endpoint: s3.us-west-2.amazonaws.com # Optional
 ```
 
+### Delta Lake + MinIO
+
+```yaml
+- from: delta_lake:s3://my_bucket/path/to/s3/delta/table/ # A reference to a table in MinIO
+  name: my_delta_lake_table
+  params:
+    delta_lake_aws_region: us-east-1 # Best practice for MinIO
+    delta_lake_aws_access_key_id: ${secrets:aws_access_key_id}
+    delta_lake_aws_secret_access_key: ${secrets:aws_secret_access_key}
+    delta_lake_aws_endpoint: http://localhost:9000 # MinIO Endpoint
+    delta_lake_aws_allow_http: true
+```
+
 ### Delta Lake + Azure Blob
 
 ```yaml
diff --git a/website/docs/components/models/databricks.md b/website/docs/components/models/databricks.md
@@ -0,0 +1,30 @@
+---
+title: 'Databricks Model Provider'
+description: 'Instructions for using Databricks Mosaic AI Models'
+sidebar_label: 'Databricks'
+sidebar_position: 8
+---
+
+To use a language model deployed to [Databricks Mosaic AI Model Serving](https://docs.databricks.com/aws/en/machine-learning/model-serving/), specify the model endpoint name prefixed with `databricks:` in the `from` field and include the required parameters in the `params` section.
+
+### Parameters
+
+| Parameter             | Description                                                                                                                                                                                        |
+| --------------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
+| `databricks_endpoint` | The Databricks workspace endpoint, e.g., `dbc-a12cd3e4-56f7.cloud.databricks.com`.                                                                                                                 |
+| `databricks_token`    | The Databricks API token to authenticate with the Unity Catalog API. Use the [secret replacement syntax](../secret-stores/index.md) to reference a secret, e.g., `${secrets:my_databricks_token}`. |
+
+### Example `spicepod.yaml` Configuration
+
+```yaml
+models:
+  - from: databricks:jeadie
+    name: food
+    params:
+      databricks_endpoint: dbc-46470731-42e5.cloud.databricks.com
+      databricks_token: ${ secrets:SPICE_DATABRICKS_TOKEN }
+```
+
+### Additional Information
+
+Refer to the [Mosaic AI Model Serving documentation](https://docs.databricks.com/aws/en/machine-learning/model-serving/) for more details on available models and configurations.
diff --git a/website/docs/components/models/index.md b/website/docs/components/models/index.md
@@ -7,15 +7,16 @@ image: /img/og/models.png
 
 Spice supports various model providers for traditional machine learning (ML) models and large language models (LLMs).
 
-| Name                | Description                                  | Status            | ML Format(s) | LLM Format(s)\*                 |
-| ------------------- | -------------------------------------------- | ----------------- | ------------ | ------------------------------- |
-| [`openai`][openai]  | OpenAI (or compatible) LLM endpoint          | Release Candidate | -            | OpenAI-compatible HTTP endpoint |
-| [`file`][file]      | Local filesystem                             | Release Candidate | ONNX         | GGUF, GGML, SafeTensor          |
-| [`huggingface`][hf] | Models hosted on HuggingFace                 | Release Candidate | ONNX         | GGUF, GGML, SafeTensor          |
-| [`spice.ai`][spice] | Models hosted on the Spice.ai Cloud Platform | Alpha             | ONNX         | OpenAI-compatible HTTP endpoint |
-| [`azure`][azure]    | Azure OpenAI                                 | Alpha             | -            | OpenAI-compatible HTTP endpoint |
-| [`anthropic`][ant]  | Models hosted on Anthropic                   | Alpha             | -            | OpenAI-compatible HTTP endpoint |
-| [`xai`][xai]        | Models hosted on xAI                         | Alpha             | -            | OpenAI-compatible HTTP endpoint |
+| Name                       | Description                                  | Status            | ML Format(s) | LLM Format(s)\*                 |
+| -------------------------- | -------------------------------------------- | ----------------- | ------------ | ------------------------------- |
+| [`openai`][openai]         | OpenAI (or compatible) LLM endpoint          | Release Candidate | -            | OpenAI-compatible HTTP endpoint |
+| [`file`][file]             | Local filesystem                             | Release Candidate | ONNX         | GGUF, GGML, SafeTensor          |
+| [`huggingface`][hf]        | Models hosted on HuggingFace                 | Release Candidate | ONNX         | GGUF, GGML, SafeTensor          |
+| [`spice.ai`][spice]        | Models hosted on the Spice.ai Cloud Platform | Alpha             | ONNX         | OpenAI-compatible HTTP endpoint |
+| [`azure`][azure]           | Azure OpenAI                                 | Alpha             | -            | OpenAI-compatible HTTP endpoint |
+| [`anthropic`][ant]         | Models hosted on Anthropic                   | Alpha             | -            | OpenAI-compatible HTTP endpoint |
+| [`xai`][xai]               | Models hosted on xAI                         | Alpha             | -            | OpenAI-compatible HTTP endpoint |
+| [`databricks`][databricks] | Models deployed to Databricks Mosaic AI      | Alpha             | -            | OpenAI-compatible HTTP endpoint |
 
 [file]: /components/embeddings/local.md
 [hf]: ./huggingface.md
@@ -24,6 +25,7 @@ Spice supports various model providers for traditional machine learning (ML) mod
 [azure]: ./azure.md
 [ant]: ./anthropic.md
 [xai]: ./xai.md
+[databricks]: ./databricks.md
 
 Spice also tests and evaluates common models and grades their ability to integrate with Spice. See the [Models Grade Report](/docs/reference/models.md).
 
diff --git a/website/docs/components/workers/index.md b/website/docs/components/workers/index.md
@@ -0,0 +1,62 @@
+---
+title: 'Workers Overview'
+description: 'Detailed documentation for workers in the Spice runtime.'
+sidebar_label: 'Workers Overview'
+sidebar_position: 8
+---
+
+Workers in the Spice runtime represent configurable units of compute that help coordinate and manage interactions between models and tools. Each worker is defined as a component in the `spicepod.yaml` file, specifying its behavior and interaction logic.
+
+## Configuration
+
+Workers are configured in the `workers` section of the `spicepod.yaml` file. Each worker definition includes a name, description, and a list of models or tools it encapsulates.
+
+**Example `spicepod.yaml` configuration:**
+
+```yaml
+workers:
+  - name: round-robin
+    description: |
+      Distributes requests between 'foo' and 'bar' models in a round-robin fashion.
+    models:
+      - from: foo
+      - from: bar
+  - name: fallback
+    description: |
+      Attempts 'bar' first, then 'foo', then 'baz' if previous models fail.
+    models:
+      - from: foo
+        order: 2
+      - from: bar
+        order: 1
+      - from: baz
+        order: 3
+```
+
+## Use-Cases
+
+Workers currently help implement:
+
+- Model fallback and error handling
+- Load balancing across multiple models
+
+## Usage
+
+Workers can be invoked using the same API endpoints as individual models. For example, to call a worker named `fallback` using the OpenAI-compatible HTTP API:
+
+```bash
+curl http://localhost:8090/v1/chat/completions \
+  -H "Content-Type: application/json" \
+  -d '{
+    "model": "fallback",
+    "messages": [{ "role": "user", "content": "Tell me a joke"}]
+  }'
+```
+
+## Roadmap
+
+The vision for workers includes support for dynamic serverless compute, enabling execution of user-defined functions within the Spice runtime. This direction aims to help developers define custom logic and orchestration patterns directly in the worker configuration, supporting more advanced workflows and automation. Further details and implementation timelines will be provided in future updates. For ongoing progress, refer to the project repository and documentation.
+
+## Further Reading
+
+For a complete specification of worker configuration, routing rules, and available options, refer to the [Spicepod Workers Reference](/docs/reference/spicepod/workers.md).
diff --git a/website/docs/reference/memory.md b/website/docs/reference/memory.md
@@ -35,9 +35,16 @@ Refresh modes affect memory usage as follows:
 
 Spice.ai uses DataFusion as its query execution engine. By default, DataFusion does not enforce strict memory limits, which can lead to unbounded usage. Spice.ai addresses this through:
 
+- **Memory Limit**: The `runtime.memory_limit` parameter defines the maximum memory available for query execution. Once the memory limit is reached, supported query operations spill data to disk, helping prevent out-of-memory errors and maintain query stability. See [Spicepod Configuration](spicepod/index.md#memory-limit) for details.
 - **Memory Budgeting**: Limits memory per query execution. Queries exceeding the limit return an error. See [Spicepod Configuration](spicepod/index.md) for details.
 - **Spill-to-Disk**: Operators such as Sort, Join, and GroupByHash spill intermediate results to disk when memory limits are exceeded, preventing out-of-memory errors.
 
+DataFusion supports spilling for several operators, but not all. Notably, the following operations do not support spilling:
+
+- HashJoin ([tracking issue](https://github.com/apache/arrow-datafusion/issues/1047))
+- ExternalSorterMerge (no current tracking issue; previously discussed in the context of SortMergeJoin)
+- RepartitionMerge (spilling is suggested to be supported, but may depend on HashJoin support; see [issue](https://github.com/apache/arrow-datafusion/issues/1047))
+
 ## Embedded Data Accelerators
 
 Spice.ai integrates with embedded accelerators like [SQLite](/docs/components/data-accelerators/sqlite.md) and [DuckDB](/docs/components/data-accelerators/duckdb.md), each with unique memory considerations:
diff --git a/website/docs/reference/spicepod/index.md b/website/docs/reference/spicepod/index.md
@@ -208,3 +208,31 @@ views:
       ORDER BY count DESC
       LIMIT 5
 ```
+
+## `workers`
+
+A Spicepod can contain one or more [workers](./workers.md) defining configurable units of compute.
+
+**Example**
+
+```yaml
+workers:
+  - name: round-robin
+    description: |
+      Distributes requests between 'foo' and 'bar' models in a round-robin fashion.
+    models:
+      - from: foo
+      - from: bar
+  - name: fallback
+    description: |
+      Attempts 'bar' first, then 'foo', then 'baz' if previous models fail.
+    models:
+      - from: foo
+        order: 2
+      - from: bar
+        order: 1
+      - from: baz
+        order: 3
+```
+
+For a complete specification of worker configuration, see the [Workers Reference](/docs/reference/spicepod/workers.md).
diff --git a/website/docs/reference/spicepod/runtime.md b/website/docs/reference/spicepod/runtime.md
@@ -169,6 +169,19 @@ runtime:
 
 This configuration permits requests only from the `https://example.com` origin.
 
+## `runtime.memory_limit`
+
+The `memory_limit` parameter sets a memory usage cap for the Spice runtime query engine. This limit applies **only** to the query engine and should be used in addition to other memory configuration options, such as `duckdb_memory_limit`. When `memory_limit` is specified, the value of `runtime.temp_directory` determines the directory DataFusion uses for spilling intermediate data to disk.
+
+```yaml
+runtime:
+  memory_limit: 4GiB
+```
+
+Specify the value as a size, for example `4GiB` or `1024MiB`.
+
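+A minimal sketch of pairing the limit with an explicit spill location, assuming `/tmp/spice` is a writable path with enough free disk space (the path is a placeholder, not a default):
+
+```yaml
+runtime:
+  memory_limit: 4GiB # Cap for the query engine only
+  temp_directory: /tmp/spice # Placeholder path; intermediate data spills here once the cap is reached
+```
+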
+For detailed memory information, see [Memory](/docs/reference/memory.md).
+
 ## `runtime.temp_directory`
 
 The path to a temporary directory that Spice uses for query and acceleration operations that spill to disk. For more details, see the [Managing Memory Usage documentation](../memory.md) and the [DuckDB Data Accelerator documentation](../../components/data-accelerators/duckdb.md).
diff --git a/website/docs/reference/spicepod/workers.md b/website/docs/reference/spicepod/workers.md
@@ -0,0 +1,96 @@
+---
+title: 'Workers'
+sidebar_label: 'Workers'
+description: 'Workers YAML reference'
+---
+
+Workers in the Spice runtime represent configurable units of compute that help coordinate and manage interactions between models and tools. Currently, workers define how one or more [LLMs](../models.md) can be combined into a single logical model.
+
+## `workers`
+
+The `workers` section in your configuration specifies one or more workers.
+
+Example:
+
+```yaml
+workers:
+  - name: round-robin
+    description: |
+      Distributes requests between 'foo' and 'bar' models in a round-robin fashion.
+    models:
+      - from: foo
+      - from: bar
+  - name: fallback
+    description: |
+      Attempts 'bar' first, then 'foo', then 'baz' if previous models fail.
+    models:
+      - from: foo
+        order: 2
+      - from: bar
+        order: 1
+      - from: baz
+        order: 3
+  - name: weighted
+    description: |
+      Routes 80% of traffic to 'foo'.
+    models:
+      - from: foo
+        order: 4
+      - from: bar
+        order: 1
+```
+
+### `name`
+
+A unique identifier for this worker component.
+
+### `description`
+
+Additional details about the worker, useful for displaying to users and for providing context to LLMs.
+
+### `models` {#models}
+
+A list of model configurations that define how the model worker behaves.
+
+The structure of the elements uniquely determines the model worker algorithm. List elements should be of a consistent type.
+
+| Key name | Key type          | Description                                                                                                                                               |
+| -------- | ----------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------- |
+| from     | String            | The `model.name` of a defined `model` spicepod component.                                                                                                 |
+| order    | Integer, positive | The priority of the model. Models are used in increasing `order`, lowest value first. The ordering of models with equal `order` is undefined.             |
+
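+As an illustrative sketch, each `from` value in a worker refers to the `name` of a model defined under `models` in the same spicepod. The two Databricks serving endpoint names, the workspace endpoint, and the secret name below are placeholders.
+
+```yaml
+models:
+  - from: databricks:primary-llm # Placeholder serving endpoint name
+    name: foo
+    params:
+      databricks_endpoint: dbc-a12cd3e4-56f7.cloud.databricks.com
+      databricks_token: ${secrets:my_databricks_token}
+  - from: databricks:backup-llm # Placeholder serving endpoint name
+    name: bar
+    params:
+      databricks_endpoint: dbc-a12cd3e4-56f7.cloud.databricks.com
+      databricks_token: ${secrets:my_databricks_token}
+
+workers:
+  - name: fallback
+    description: |
+      Call 'foo'. On error, call 'bar'.
+    models:
+      - from: foo # Matches models[0].name
+        order: 1
+      - from: bar # Matches models[1].name
+        order: 2
+```
+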
+#### Worker with round-robin routing across models
+
+Example
+
+```yaml
+workers:
+  - name: round-robin
+    description: |
+      Call models 'foo' & 'bar' in round robin.
+    models:
+      - from: foo
+      - from: bar
+```
+
+The worker selects each model in turn for subsequent requests.
+
+#### Worker with fallback model routing
+
+Example
+
+```yaml
+workers:
+  - name: fallback
+    description: |
+      Call 'bar'. On error, call 'foo'. Failing that, call 'baz'.
+    models:
+      - from: foo
+        order: 2
+      - from: bar
+        order: 1
+      - from: baz
+        order: 3
+```
+
+The worker uses the models in increasing order, returning the first result that is not an error.
diff --git a/website/src/pages/cookbook/_ignored.tsx b/website/src/pages/cookbook/_ignored.tsx