Commit fae84a9

Merge branch 'trunk' into lukim/scalar-functions
2 parents 750c35b + b081b34 commit fae84a9

21 files changed

Lines changed: 1178 additions & 407 deletions

File tree

.github/workflows/build_and_publish.yml

Lines changed: 12 additions & 1 deletion
@@ -39,8 +39,19 @@ jobs:
         working-directory: website
         run: npm run build
 
+      - name: Can Deploy?
+        id: can_deploy
+        run: |
+          if [[ "${{ secrets.CLOUDFLARE_API_TOKEN }}" == "" || "${{ secrets.CLOUDFLARE_ACCOUNT_ID }}" == "" ]]; then
+            echo "CLOUDFLARE_API_TOKEN or CLOUDFLARE_ACCOUNT_ID is not set. Skipping deployment."
+            echo "can_deploy=false" >> $GITHUB_OUTPUT
+          else
+            echo "can_deploy=true" >> $GITHUB_OUTPUT
+          fi
+
       - name: Deploy
         id: deploy
+        if: ${{ steps.can_deploy.outputs.can_deploy == 'true' }}
         uses: cloudflare/wrangler-action@v3
         with:
           workingDirectory: website
@@ -50,7 +61,7 @@ jobs:
           gitHubToken: ${{ secrets.GITHUB_TOKEN }}
 
       - name: Add deploy comment to PR
-        if: ${{ github.event_name == 'pull_request' }}
+        if: ${{ github.event_name == 'pull_request' && steps.can_deploy.outputs.can_deploy == 'true' }}
         uses: actions/github-script@v7
         with:
           script: |
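
The gate added in the `Can Deploy?` step can be tried outside GitHub Actions with a small sketch like the one below. It is an illustration only, not part of the commit: the function name `can_deploy` and its two positional arguments are hypothetical stand-ins for the two secrets, and the result is printed to stdout instead of being appended to `$GITHUB_OUTPUT`.

```shell
# Sketch of the "Can Deploy?" gate in POSIX sh (assumption: arguments
# replace the secrets that the workflow interpolates into the script).
can_deploy() {
  # $1: Cloudflare API token, $2: Cloudflare account ID
  if [ -z "$1" ] || [ -z "$2" ]; then
    # Either value missing: signal that deployment should be skipped.
    echo "can_deploy=false"
  else
    echo "can_deploy=true"
  fi
}

# Read the values from the environment, defaulting to empty strings.
can_deploy "${CLOUDFLARE_API_TOKEN:-}" "${CLOUDFLARE_ACCOUNT_ID:-}"
```

GitHub Actions does not expose the `secrets` context to step-level `if:` expressions, which is presumably why the check runs in its own step and later steps read `steps.can_deploy.outputs.can_deploy` instead.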

website/blog/2024/announcing-1.0-stable.mdx

Lines changed: 12 additions & 12 deletions
@@ -108,7 +108,7 @@ Here are some examples of how Spice.ai OSS solves real problems for these teams.
 
 A core requirement for many applications is consistently fast data access, with or without AI. Twilio uses Spice.ai OSS as a data acceleration framework or [Database CDN](https://materializedview.io/p/building-a-cdn-for-databases-spice-ai), staging data in object-storage that's accelerated with Spice for sub-second query to improve the reliability of critical services in its messaging pipelines. Before Spice, a database outage could result in a service outage.
 
-<Quote name="Peter Janovsky" title="Software Architect" company="Twilio" imageUrl="https://avatars.githubusercontent.com/u/46338034?v=4">
+<Quote name="Peter Janovsky" title="Software Architect" company="Twilio" imageUrl="/img/blog/2024/announcing-1.0-stable/peter-janovsky.jpeg">
 "Spice opened the door to take these critical control-plane datasets and move them next to our services in the runtime path."
 </Quote>
 
@@ -120,7 +120,7 @@ With Spice, Twilio has achieved:
 
 - **Mission-Critical Reliability**: Reduced reliance on queries to databases by using Spice to accelerate data in-memory locally, with automatic failover to query data directly from S3, ensuring uninterrupted service even during database downtime.
 
-<Quote name="David Blum" title="Principal Software Engineer" company="Twilio" imageUrl="https://media.licdn.com/dms/image/v2/C4E03AQFwYGa02RBgSQ/profile-displayphoto-shrink_800_800/profile-displayphoto-shrink_800_800/0/1604159257728?e=1743033600&v=beta&t=ts80CKZlcSuBaTUaKkWvjNIFr1FwbewxUVFzynM5HVY">
+<Quote name="David Blum" title="Principal Software Engineer" company="Twilio" imageUrl="/img/blog/2024/announcing-1.0-stable/david-blum.jpeg">
 "With a simple drop in container, we are able to double our data redundancy by using Spice."
 </Quote>
 
@@ -136,7 +136,7 @@ By adopting Spice.ai OSS, Twilio strengthened its infrastructure, ensuring relia
 
 Barracuda uses Spice.ai OSS to modernize data access for their email archiving and audit log systems, solving two big problems: slow query performance and costly queries. Before Spice, customers experienced frustrating delays of up to two minutes when searching email archives, due to the data volume being queried.
 
-<Quote name="David Stancu" title="Senior Principal Software Engineer" company="Barracuda" imageUrl="https://media.licdn.com/dms/image/v2/C4D03AQHaYpFkb8Ef7g/profile-displayphoto-shrink_400_400/profile-displayphoto-shrink_400_400/0/1544111921005?e=1743033600&v=beta&t=9SNh2gijp_uK1Yslo3VwVo1PZzG8GAR_IQC6EOrEYLM">
+<Quote name="David Stancu" title="Senior Principal Software Engineer" company="Barracuda" imageUrl="/img/blog/2024/announcing-1.0-stable/david-stancu.jpeg">
 "It's just a huge gain in responsiveness for the customer."
 </Quote>
 
@@ -150,7 +150,7 @@ With Spice, Barracuda has achieved:
 
 - **Significant Cost Reduction**: Replaced expensive Databricks Spark queries, significantly cutting expenses while improving performance.
 
-<Quote name="Darin Douglass" title="Principal Software Engineer" company="Barracuda" imageUrl="https://media.licdn.com/dms/image/v2/D5603AQHHeajJta0mRQ/profile-displayphoto-shrink_800_800/profile-displayphoto-shrink_800_800/0/1670162582353?e=1743033600&v=beta&t=EobdjhLyYk7hetWXEtAEghE1qjccYlUmgysvgy8kdiw">
+<Quote name="Darin Douglass" title="Principal Software Engineer" company="Barracuda" imageUrl="/img/blog/2024/announcing-1.0-stable/darin-douglass.jpeg">
 It just kinda spins up and it just works, which is really nice.
 </Quote>
 
@@ -164,7 +164,7 @@ With Spice, Barracuda has achieved:
 
 NRC Health uses Spice.ai OSS to simplify and accelerate the development of data-grounded AI features, unifying data from multiple platforms including MySQL, SharePoint, and Salesforce, into secure, AI-ready data. Before Spice, scaling AI expertise across the organization to build complex RAG-based scenarios was a challenge.
 
-<Quote name="Dustin Warner" title="Director of Software Engineering" company="NRC Health" imageUrl="https://media.licdn.com/dms/image/v2/D5603AQHWZwH7esT5IQ/profile-displayphoto-shrink_400_400/B56ZQZBMHgHQAg-/0/1735586530652?e=1743033600&v=beta&t=u1KHvUE_B-WJrpQp9VCDScXZ_faK9-FVruz1a06_OeU">
+<Quote name="Dustin Warner" title="Director of Software Engineering" company="NRC Health" imageUrl="/img/blog/2024/announcing-1.0-stable/dustin-warner.jpeg">
 "What I like the most about Spice, it's very easy to collect data from different data sources and I am able to chat with this data and do everything in one place."
 </Quote>
 
@@ -174,7 +174,7 @@ With Spice OSS, NRC Health has achieved:
 
 - **Accelerated Time-to-Market**: Centralized data integration and AI model serving an enterprise-ready service, accelerating time to market.
 
-<Quote name="Taher Ahmed" title="Software Engineering Manager" company="NRC Health" imageUrl="https://media.licdn.com/dms/image/v2/C5603AQEc2C1-eknKQA/profile-displayphoto-shrink_400_400/profile-displayphoto-shrink_400_400/0/1653330557976?e=1743033600&v=beta&t=maGgXxLXrGp9VSErAQoxvVcwoMBK0-0neNoIIhOhUf0">
+<Quote name="Taher Ahmed" title="Software Engineering Manager" company="NRC Health" imageUrl="/img/blog/2024/announcing-1.0-stable/taher-ahmed.jpeg">
 "I explored AI, embeddings, search algorithms, and features with our own database. I read a lot about this, but it was so much easier to use Spice than doing it from scratch."
 </Quote>
 
@@ -184,12 +184,12 @@ With Spice OSS, NRC Health has achieved:
 
 When using tools like GitHub Copilot, developers often face the hassle of switching between multiple environments to get the data they need.
 
-<div style={{display: 'flex', justifyContent: 'center', marginBottom: '15px'}}>
-<ReactPlayer
-controls
-url='https://www.youtube.com/watch?v=A0QdHVUKfAk'
-/>
-</div>
+<div style={{display: 'flex', justifyContent: 'center', marginBottom: '15px'}}>
+<ReactPlayer
+controls
+url='https://www.youtube.com/watch?v=A0QdHVUKfAk'
+/>
+</div>
 
 The [Spice.ai for GitHub Copilot Extension](https://github.com/marketplace/spice-ai-for-github-copilot) built on Spice.ai OSS, gives developers the ability to connect data from external sources to Copilot, grounding Copilot in relevant data not generally available in GitHub, like test data stored in a development database.

website/blog/releases/v1.2.2.md

Lines changed: 137 additions & 0 deletions
@@ -0,0 +1,137 @@
+---
+date: 2025-05-13
+title: 'Spice v1.2.2 (May 13, 2025)'
+type: blog
+authors: [jeadie]
+tags: [release,databricks,embeddings,acceleration,helm,mcp]
+---
+
+Announcing the release of Spice v1.2.2! 🌟
+
+Spice v1.2.2 introduces support for Databricks Mosaic AI model serving and embeddings, alongside the existing Databricks catalog and dataset integrations. It adds configurable service ports in the Helm chart and resolves several bugs to improve stability and performance.
+
+## Highlights in v1.2.2
+
+- **Databricks Model & Embedding Provider**: Spice integrates with [Databricks Model Serving](https://www.databricks.com/product/model-serving) for models and embeddings, enabling secure access via machine-to-machine (M2M) OAuth authentication with service principal credentials. The runtime automatically refreshes tokens using `databricks_client_id` and `databricks_client_secret`, ensuring uninterrupted operation. This feature supports Databricks-hosted large language models and embedding models.
+
+```yaml
+models:
+  - from: databricks:databricks-llama-4-maverick
+    name: llama-4-maverick
+    params:
+      databricks_endpoint: dbc-46470731-42e5.cloud.databricks.com
+      databricks_client_id: ${secrets:DATABRICKS_CLIENT_ID}
+      databricks_client_secret: ${secrets:DATABRICKS_CLIENT_SECRET}
+
+embeddings:
+  - from: databricks:databricks-gte-large-en
+    name: gte-large-en
+    params:
+      databricks_endpoint: dbc-42424242-4242.cloud.databricks.com
+      databricks_client_id: ${secrets:DATABRICKS_CLIENT_ID}
+      databricks_client_secret: ${secrets:DATABRICKS_CLIENT_SECRET}
+```
+
+For detailed setup instructions, refer to the [Databricks Model Provider documentation](https://spiceai.org/docs/components/models/databricks).
+
+- **Configurable Helm Chart Service Ports**: The Helm chart now supports custom ports for flexible network configurations for deployments. Specify non-default ports in your Helm values file.
+
+- **Resolved Issues**:
+
+  - **MCP Nested Tool Calling**: Fixed a bug preventing nested tool invocation when Spice operates as the MCP server federating to MCP clients.
+
+  - **Dataset Load Concurrency**: Corrected a failure to respect the `dataset_load_parallelism` setting during dataset loading.
+
+  - **Acceleration Hot-Reload**: Addressed an issue where changes to acceleration enable/disable settings were not detected during hot reload of Spicepod.yaml.
+
+## Contributors
+
+- [@peasee](https://github.com/peasee)
+- [@ewgenius](https://github.com/ewgenius)
+- [@phillipleblanc](https://github.com/phillipleblanc)
+- [@kczimm](https://github.com/kczimm)
+- [@Jeadie](https://github.com/Jeadie)
+- [@sgrebnov](https://github.com/sgrebnov)
+- [@Sevenannn](https://github.com/Sevenannn)
+
+## Breaking Changes
+
+No breaking changes.
+
+## Cookbook Updates
+
+Updated cookbooks:
+
+- [Databricks Catalogs](https://github.com/spiceai/cookbook/tree/trunk/catalogs/databricks/README.md): Includes using Databricks Service Principal
+- [Databricks](https://github.com/spiceai/cookbook/tree/trunk/databricks/README.md): Includes using M2M auth
+- [Python ADBC](https://github.com/spiceai/cookbook/tree/trunk/clients/adbc/README.md): Adds a dataset to be queried over ADBC.
+
+The [Spice Cookbook](https://spiceai.org/cookbook) now includes 68 recipes to help you get started with Spice quickly and easily.
+
+## Upgrading
+
+To upgrade to v1.2.2, use one of the following methods:
+
+**CLI**:
+
+```console
+spice upgrade
+```
+
+**Homebrew**:
+
+```console
+brew upgrade spiceai/spiceai/spice
+```
+
+**Docker**:
+
+Pull the `spiceai/spiceai:1.2.2` image:
+
+```console
+docker pull spiceai/spiceai:1.2.2
+```
+
+For available tags, see [DockerHub](https://hub.docker.com/r/spiceai/spiceai/tags).
+
+**Helm**:
+
+```console
+helm repo update
+helm upgrade spiceai spiceai/spiceai
+```
+
+## What's Changed
+
+## Dependencies
+
+- No major dependency changes.
+
+## Changelog
+
+```text
+- Update spark-connect-rs to override user agent string by @ewgenius in https://github.com/spiceai/spice/pull/5798
+- Merge pull request by @ewgenius in https://github.com/spiceai/spice/pull/5796
+- Pass the default user agent string to the Databricks Spark, Delta, and Unity clients by @ewgenius in https://github.com/spiceai/spice/pull/5717
+- bump to 1.2.2 by @Jeadie in https://github.com/spiceai/spice/pull/none
+- Helm chart: support for service ports overrides by @sgrebnov in https://github.com/spiceai/spice/pull/5774
+- Update spice cli login command with client-id and client-secret flags for Databricks by @ewgenius in https://github.com/spiceai/spice/pull/5788
+- Fix bug where setting Cache-Control: no-cache doesn't compute the cache key by @phillipleblanc in https://github.com/spiceai/spice/pull/5779
+- Update to datafusion-contrib/datafusion-table-providers#336 by @phillipleblanc in https://github.com/spiceai/spice/pull/5778
+- Lru cache: limit single cached record size to u32::MAX (4GB) by @sgrebnov in https://github.com/spiceai/spice/pull/5772
+- Fix LLMs calling nested MCP tools by @Jeadie in https://github.com/spiceai/spice/pull/5771
+- MySQL: Set the character_set_results/character_set_client/character_set_connection session variables on connection setup by @Sevenannn in https://github.com/spiceai/spice/pull/5770
+- Control the parallelism of acceleration refresh datasets with runtime.dataset_load_parallelism by @phillipleblanc in https://github.com/spiceai/spice/pull/5763
+- Fix Iceberg predicates not matching the Arrow type of columns read from parquet files by @phillipleblanc in https://github.com/spiceai/spice/pull/5761
+- fix: Use decimal_cmp for numerical BETWEEN in SQLite by @peasee in https://github.com/spiceai/spice/pull/5760
+- Support product name override in databricks user agent string by @ewgenius in https://github.com/spiceai/spice/pull/5749
+- Databricks U2M Token Provider support by @ewgenius in https://github.com/spiceai/spice/pull/5747
+- Remove HTTP auth from LLM config and simplify Databricks models logic by using static headers by @Jeadie in https://github.com/spiceai/spice/pull/5742
+- clear plan cache when dataset updates by @kczimm in https://github.com/spiceai/spice/pull/5741
+- Support Databricks M2M auth in LLMs + Embeddings by @Jeadie in https://github.com/spiceai/spice/pull/5720
+- Retrieve Github App tokens in background; make TokenProvider not async by @Jeadie in https://github.com/spiceai/spice/pull/5718
+- Make 'token_providers' crate by @Jeadie in https://github.com/spiceai/spice/pull/5716
+- Databricks AI: Embedding models & LLM streaming by @Jeadie in https://github.com/spiceai/spice/pull/5715
+```
+
+See the full list of changes at: [v1.2.1...v1.2.2](https://github.com/spiceai/spiceai/compare/v1.2.1...v1.2.2)

website/blog/tags.yml

Lines changed: 7 additions & 3 deletions
@@ -94,10 +94,10 @@ duckdb:
   label: 'duckdb'
   permalink: '/duckdb'
   description: 'DuckDB database topics and usage'
-embedding:
-  label: 'embedding'
+embeddings:
+  label: 'embeddings'
   permalink: '/embeddings'
-  description: 'Embedding techniques and tools'
+  description: 'Embeddings related topics and usage'
 evaluation:
   label: 'evaluation'
   permalink: '/evaluations'
@@ -114,6 +114,10 @@ gpu:
   label: 'gpu'
   permalink: '/gpu'
   description: 'Graphics Processing Unit related topics and usage'
+helm:
+  label: 'helm'
+  permalink: '/helm'
+  description: 'Helm Chart related topics and usage'
 huggingface:
   label: 'huggingface'
   permalink: '/huggingface'

website/docs/components/catalogs/databricks.md

Lines changed: 1 addition & 1 deletion
@@ -61,7 +61,7 @@ The `params` field is used to configure the connection to the Databricks Unity C
 
 ### Personal access token
 
-To Learn more about how to set up personal access tokens, see [Databricks PAT docs](https://docs.databricks.com/aws/en/dev-tools/auth/pat).
+To learn more about how to set up personal access tokens, see [Databricks PAT docs](https://docs.databricks.com/aws/en/dev-tools/auth/pat).
 
 ```yaml
 catalogs:

website/docs/components/data-connectors/databricks.md

Lines changed: 1 addition & 1 deletion
@@ -73,7 +73,7 @@ Use the [secret replacement syntax](../secret-stores/index.md) to reference a se
 
 ### Personal access token
 
-To Learn more about how to set up personal access tokens, see [Databricks PAT docs](https://docs.databricks.com/aws/en/dev-tools/auth/pat).
+To learn more about how to set up personal access tokens, see [Databricks PAT docs](https://docs.databricks.com/aws/en/dev-tools/auth/pat).
 
 ```yaml
 datasets:
Lines changed: 52 additions & 0 deletions
@@ -0,0 +1,52 @@
+---
+title: 'Databricks Model Provider'
+description: 'Instructions for using Databricks Mosaic AI Models'
+sidebar_label: 'Databricks'
+sidebar_position: 8
+---
+
+To use an embedding model deployed to [Databricks Mosaic AI Model Serving](https://docs.databricks.com/aws/en/machine-learning/model-serving/), specify the model endpoint name prefixed with `databricks:` in the `from` field and include the required parameters in the `params` section.
+
+### Parameters
+
+| Parameter                  | Description                                                                                                                                                                                            |
+| -------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
+| `databricks_endpoint`      | The Databricks workspace endpoint, e.g., `dbc-a12cd3e4-56f7.cloud.databricks.com`.                                                                                                                     |
+| `databricks_token`         | The Databricks API token to authenticate with the Databricks Models API. Use the [secret replacement syntax](../secret-stores/index.md) to reference a secret, e.g., `${secrets:my_databricks_token}`. |
+| `databricks_client_id`     | The Databricks Service Principal Client ID. Can't be used with `databricks_token`.                                                                                                                     |
+| `databricks_client_secret` | The Databricks Service Principal Client Secret. Can't be used with `databricks_token`.                                                                                                                 |
+
+### Example `spicepod.yaml` configuration, using personal access token
+
+To learn more about how to set up personal access tokens, see [Databricks PAT docs](https://docs.databricks.com/aws/en/dev-tools/auth/pat).
+
+```yaml
+embeddings:
+  - from: databricks:databricks-gte-large-en
+    name: gte-large-en
+    params:
+      databricks_endpoint: dbc-46470731-42e5.cloud.databricks.com
+      databricks_token: ${ secrets:SPICE_DATABRICKS_TOKEN }
+```
+
+### Example `spicepod.yaml` configuration, using Databricks service principal
+
+Spice supports the M2M OAuth flow with service principal credentials by utilizing the `databricks_client_id` and `databricks_client_secret` parameters. The runtime will automatically refresh the token.
+
+The service principal must be granted the "Can Query" permission for model serving.
+
+To learn more about how to set up the service principal, see [Databricks M2M OAuth docs](https://docs.databricks.com/aws/en/dev-tools/auth/oauth-m2m).
+
+```yaml
+embeddings:
+  - from: databricks:databricks-gte-large-en
+    name: gte-large-en
+    params:
+      databricks_endpoint: dbc-42424242-4242.cloud.databricks.com
+      databricks_client_id: ${secrets:DATABRICKS_CLIENT_ID}
+      databricks_client_secret: ${secrets:DATABRICKS_CLIENT_SECRET}
+```
+
+### Additional Information
+
+Refer to the [Mosaic AI Model Serving documentation](https://docs.databricks.com/aws/en/machine-learning/model-serving/) for more details on available models and configurations.
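
For context on the service principal flow described in the new doc, the sketch below shows roughly what an M2M OAuth token request against a Databricks workspace looks like when done by hand. It is not taken from the commit and does not describe how the Spice runtime implements token refresh: the helper names `token_url` and `request_token` are hypothetical, the workspace host is the placeholder from the example above, and the request shape (client-credentials grant with the `all-apis` scope against the workspace `/oidc/v1/token` endpoint) follows the Databricks M2M OAuth documentation.

```shell
# Build the workspace-level OAuth token URL for a Databricks host.
token_url() {
  # $1: workspace host, e.g. dbc-42424242-4242.cloud.databricks.com
  printf 'https://%s/oidc/v1/token\n' "$1"
}

# Request a short-lived access token with the client-credentials grant.
# Assumption: DATABRICKS_CLIENT_ID and DATABRICKS_CLIENT_SECRET are
# exported in the environment with the service principal credentials.
request_token() {
  curl --silent --fail --request POST \
    --url "$(token_url "$1")" \
    --user "${DATABRICKS_CLIENT_ID}:${DATABRICKS_CLIENT_SECRET}" \
    --data 'grant_type=client_credentials&scope=all-apis'
}

# Usage (makes a network call, so it is left commented out):
# request_token dbc-42424242-4242.cloud.databricks.com
```

Tokens obtained this way expire after a limited time, which is why a long-running process needs the kind of automatic refresh the doc attributes to the runtime.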
