Merged
5 changes: 3 additions & 2 deletions .github/actions/setup-chbench-postgres/action.yml
@@ -10,11 +10,12 @@ runs:
docker run -d --name chbench-pg \
-e POSTGRES_USER=bench -e POSTGRES_PASSWORD=bench -e POSTGRES_DB=chbench \
-p 5432:5432 postgres:16 \
-c max_connections=150 \
-c max_connections=200 \
-c wal_level=logical \
-c synchronous_commit=off \
-c max_replication_slots=20 \
-c max_wal_senders=20
-c max_wal_senders=20 \
-c shared_buffers=128MB

- name: Wait for CH-benCH PostgreSQL
shell: bash
16 changes: 12 additions & 4 deletions .github/copilot-instructions.md
@@ -25,11 +25,11 @@ Spice is a SQL query, search, and LLM-inference engine in Rust for data apps and

**Never force push** — not on `trunk`, not on feature branches, not even with `--force-with-lease`. Always merge or rebase normally, then push without force.

- **Why force-push is banned**: It silently destroys collaborator commits and orphans PR review history (comments lose their anchor, reviewers re-read the entire diff, CI re-runs). `--force-with-lease` only protects against the *latest fetch* — it cannot see commits a collaborator pushed since you fetched, so it does not make force-push safe on shared branches.
- **Why force-push is banned**: It silently destroys collaborator commits and orphans PR review history (comments lose their anchor, reviewers re-read the entire diff, CI re-runs). `--force-with-lease` only protects against the _latest fetch_ — it cannot see commits a collaborator pushed since you fetched, so it does not make force-push safe on shared branches.
- **What to do instead of force-push**:
- Branch out of date with `trunk`? `git pull --rebase` or `git merge trunk`, then `git push` normally.
- Want to fix history on a branch with open review? Add a follow-up commit and squash on merge — don't rewrite published commits.
- Pre-commit hook failed and the commit didn't actually land? Re-stage, fix, create a *new* commit. Never `--amend` after pushing.
- Pre-commit hook failed and the commit didn't actually land? Re-stage, fix, create a _new_ commit. Never `--amend` after pushing.
- **Never bypass hooks** (`--no-verify`, etc.). If a hook fails, fix the underlying issue — these checks exist because earlier failures escaped review. Likewise, don't bypass required commit signing (e.g. `--no-gpg-sign`) just to get a commit through.
- **Investigate before destroying**: unfamiliar files, branches, or lock files may represent in-progress work. Don't `git reset --hard`, `checkout --`, or `clean -f` to "make it go away" — find the root cause first.

@@ -63,6 +63,14 @@ cargo run -p testoperator -- run bench -p ./test/spicepods/tpch/sf1/federated/du

## Rust Coding Standards

### Configuration & User-Facing Parameters

- **Avoid boolean parameters in user-facing configuration** (Spicepod fields, connector `params`, CLI flags, public API options). Booleans paint you into a corner the moment a third state is needed and force readers to know which value means "on": `on_schema_change: append_new_columns` reads better than `schema_evolution_enabled: true`, and leaves room for `block`, `fail`, `sync_all_columns`, … without breaking changes.
- **Prefer named enum variants** (`#[serde(rename_all = "snake_case")]`). Pick verbs/states that describe behavior, not capability — `block` / `fail` / `append_new_columns` / `sync_all_columns`, not `disabled` / `enabled` / `auto`. Default to the conservative or back-compat-preserving variant via `#[derive(Default)]` + `#[default]` so the safe behavior is what users get when they omit the field.
- **Mirror precedent** already in the codebase: `on_zero_results: return_empty|use_source`, `unsupported_type_action: error|warn|ignore|string`, `ready_state: on_load|on_registration|on_schema_resolved`, `check_availability: auto|disabled`, `on_schema_change: block|fail|append_new_columns|sync_all_columns`. Reach for an existing enum-shaped pattern before inventing a new boolean.
- **Make each connector or engine explicitly opt in** when a setting depends on implementation behavior. Add a trait or capability method whose default returns only the modes that work universally, validate user configuration against that list, and surface a structured configuration error instead of silently ignoring unsupported modes. Audit every wrapper/decorator impl (`EmbeddingConnector`, `FullTextConnector`, `DeferredConnector`, …) to forward new trait methods to the inner connector; see "Trait Evolution & Wrapper Delegation" below.
- **Booleans are still fine in internal, non-config code paths** (struct fields, function arguments, in-memory flags, test helpers). This rule is about _what users type into YAML / pass on the CLI_, not about Rust's primitive types.
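
As a rough illustration of the enum-over-boolean guidance above, the following std-only sketch models `on_schema_change` as a named enum with a conservative default and validates it against a per-connector capability list. The names `OnSchemaChange`, `SchemaChangeSupport`, `BasicConnector`, `AppendOnlyConnector`, and `validate` are hypothetical and not taken from the Spice codebase. Real code would presumably derive `serde::Deserialize` with `#[serde(rename_all = "snake_case")]` instead of hand-writing `FromStr`, and return a SNAFU error instead of `String`. Treating `block` as the default variant is also an assumption.

```rust
use std::str::FromStr;

/// Hypothetical user-facing setting: a named enum instead of a boolean.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Default)]
pub enum OnSchemaChange {
    /// Assumed to be the conservative variant; used when the field is omitted.
    #[default]
    Block,
    Fail,
    AppendNewColumns,
    SyncAllColumns,
}

impl FromStr for OnSchemaChange {
    type Err = String;

    // Stand-in for what serde's snake_case renaming would do in real code.
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "block" => Ok(Self::Block),
            "fail" => Ok(Self::Fail),
            "append_new_columns" => Ok(Self::AppendNewColumns),
            "sync_all_columns" => Ok(Self::SyncAllColumns),
            other => Err(format!("unknown on_schema_change value: {other}")),
        }
    }
}

/// Hypothetical capability trait: the default lists only universally safe modes.
pub trait SchemaChangeSupport {
    fn supported_modes(&self) -> &'static [OnSchemaChange] {
        &[OnSchemaChange::Block, OnSchemaChange::Fail]
    }
}

/// A connector that relies on the default capability list.
pub struct BasicConnector;
impl SchemaChangeSupport for BasicConnector {}

/// A connector that explicitly opts in to one extra mode.
pub struct AppendOnlyConnector;
impl SchemaChangeSupport for AppendOnlyConnector {
    fn supported_modes(&self) -> &'static [OnSchemaChange] {
        &[
            OnSchemaChange::Block,
            OnSchemaChange::Fail,
            OnSchemaChange::AppendNewColumns,
        ]
    }
}

/// Rejects unsupported modes with an error instead of silently ignoring them.
pub fn validate(
    connector: &dyn SchemaChangeSupport,
    mode: OnSchemaChange,
) -> Result<OnSchemaChange, String> {
    if connector.supported_modes().contains(&mode) {
        Ok(mode)
    } else {
        Err(format!("on_schema_change mode {mode:?} is not supported by this connector"))
    }
}
```

Because the trait default returns only the universally safe modes, a new connector rejects `append_new_columns` until it overrides `supported_modes`, which is the explicit opt-in the guidance asks for.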

### Rust Version Baseline

- **Workspace Rust version is 1.94.1**: Treat Rust 1.94.1 as the minimum supported compiler version for workspace code unless a specific crate or integration explicitly documents a different constraint.
@@ -75,7 +83,7 @@ cargo run -p testoperator -- run bench -p ./test/spicepods/tpch/sf1/federated/du
- **Use SNAFU**: Derive `Snafu` and `Debug` on error enums
- **NO `.unwrap()`/`.expect()` in non-test code**: Use `?` operator or `match`
- **In tests**: Use `.expect("descriptive message")` instead of `.unwrap()`
- **`unreachable!()` / `unimplemented!()` / `todo!()`**: Only for *provably unreachable* code. Never for unfinished-but-callable code — they panic at runtime, which violates the data-correctness rule of failing safely with a structured error. For not-yet-implemented method bodies, return a typed error (`DataFusionError::NotImplemented("...")`, an `Err(NotImplementedSnafu { ... })` variant, etc.) so callers can degrade gracefully or surface a useful message
- **`unreachable!()` / `unimplemented!()` / `todo!()`**: Only for _provably unreachable_ code. Never for unfinished-but-callable code — they panic at runtime, which violates the data-correctness rule of failing safely with a structured error. For not-yet-implemented method bodies, return a typed error (`DataFusionError::NotImplemented("...")`, an `Err(NotImplementedSnafu { ... })` variant, etc.) so callers can degrade gracefully or surface a useful message
- **Use `ensure!` macro**: Preferred over `if` + `return Err`
- **Define `Result` type alias**: `pub type Result<T, E = Error> = std::result::Result<T, E>;`
- **Don't use `assert!()` (or related) macros in non-test code**: Prefer proper error handling, or marking with `unreachable!()` if the assertion is truly unreachable. Alternatively, make the assertion a `debug_assert!()` assertion to only fire in debug builds instead of release builds. `assert!()` macros can have case-by-case exceptions, for example for compile-time assertions that would prevent a build from being released to begin with.
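
The error-handling rules above can be sketched without external crates as follows. This is a minimal illustration, not the project's actual error type: `Error`, `write_rows`, and `plan_batches` are hypothetical names, and real code would derive `Snafu` and use `ensure!` where this sketch implements `Display` by hand and uses a plain `if`.

```rust
use std::fmt;

/// Hypothetical error enum; real code would derive `Snafu` instead of
/// implementing `Display` and `Error` by hand.
#[derive(Debug, PartialEq, Eq)]
pub enum Error {
    NotImplemented { feature: &'static str },
    InvalidBatchSize { size: usize },
}

impl fmt::Display for Error {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            Self::NotImplemented { feature } => write!(f, "{feature} is not implemented"),
            Self::InvalidBatchSize { size } => write!(f, "invalid batch size: {size}"),
        }
    }
}

impl std::error::Error for Error {}

/// The crate-level alias recommended above.
pub type Result<T, E = Error> = std::result::Result<T, E>;

/// Unfinished-but-callable code returns a typed error instead of `todo!()`,
/// so callers can degrade gracefully rather than panic.
pub fn write_rows(_rows: &[u64]) -> Result<usize> {
    Err(Error::NotImplemented { feature: "write_rows" })
}

/// Input validation returns an error instead of panicking via `assert!`.
/// With SNAFU this check would be written with `ensure!`.
pub fn plan_batches(total_rows: usize, batch_size: usize) -> Result<usize> {
    if batch_size == 0 {
        return Err(Error::InvalidBatchSize { size: batch_size });
    }
    Ok(total_rows.div_ceil(batch_size))
}
```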
@@ -477,7 +485,7 @@ export PATH="$PATH:$HOME/.spice/bin"
### Async Patterns

- Use `tokio` runtime (see `bin/spiced/src/main.rs`).
- **Trait async methods**: prefer `#[async_trait]`. Native `async fn` in traits has been stable since Rust 1.75 and is fine for traits that *don't* need to be `dyn`-compatible — but most internal traits in this codebase (`DataConnector`, `Chat`, `Embed`, `SecretStore`, `Index`, `CacheProvider`, etc.) are stored as `Arc<dyn Trait>`, and native AFIT isn't `dyn`-safe without manual workarounds. Default to `async_trait` to keep the dyn path consistent; reach for native AFIT only on non-dyn helper traits.
- **Trait async methods**: prefer `#[async_trait]`. Native `async fn` in traits has been stable since Rust 1.75 and is fine for traits that _don't_ need to be `dyn`-compatible — but most internal traits in this codebase (`DataConnector`, `Chat`, `Embed`, `SecretStore`, `Index`, `CacheProvider`, etc.) are stored as `Arc<dyn Trait>`, and native AFIT isn't `dyn`-safe without manual workarounds. Default to `async_trait` to keep the dyn path consistent; reach for native AFIT only on non-dyn helper traits.
- **Lazy globals**: prefer `std::sync::LazyLock` / `OnceLock` (modern stable Rust) over `lazy_static!` and `once_cell::sync::Lazy` for new code. Existing `once_cell` callsites are fine to leave.
- Use `CancellationToken` for shutdown (see `runtime/src/cancellable_task.rs`).
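
For the lazy-globals bullet above, a small std-only sketch shows the two standard-library types side by side. The statics `DEFAULT_PORTS` and `INSTANCE_NAME`, the helper functions, and the values stored in them are invented for illustration and do not come from the Spice codebase.

```rust
use std::collections::HashMap;
use std::sync::{LazyLock, OnceLock};

/// Built on first access and then shared; this is the shape that would
/// otherwise be written with `lazy_static!` or `once_cell::sync::Lazy`.
static DEFAULT_PORTS: LazyLock<HashMap<&'static str, u16>> =
    LazyLock::new(|| HashMap::from([("postgres", 5432), ("mysql", 3306)]));

/// Set at most once at runtime, for example from parsed startup configuration.
static INSTANCE_NAME: OnceLock<String> = OnceLock::new();

/// Looks up a hypothetical default port for an engine name.
pub fn default_port(engine: &str) -> Option<u16> {
    DEFAULT_PORTS.get(engine).copied()
}

/// Returns the instance name, initializing it with a fallback on first use.
pub fn instance_name() -> &'static str {
    INSTANCE_NAME.get_or_init(|| "spiced".to_string()).as_str()
}
```

`LazyLock` fits values whose initializer is known at the definition site, while `OnceLock` fits values supplied later at runtime. Once `instance_name()` has run, a subsequent `INSTANCE_NAME.set(...)` returns `Err` and leaves the stored value unchanged.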

129 changes: 95 additions & 34 deletions .github/prompts/writeReleaseNotes.prompt.md
@@ -1,44 +1,105 @@
---
name: writeReleaseNotes
description: Write release notes for a new version based on git history and previous release notes style.
argument-hint: The new version tag and the previous version tag to diff against (e.g. "v2.0.0-rc.1 since v1.11.2")
description: Write or update release notes for a new version based on git history and previous release notes style.
argument-hint: The new version tag and the previous version tag to diff against, or the literal word "update" to refresh an existing in-progress release notes file with new commits since it was last edited (e.g. "v2.0.0-rc.5 since v2.0.0-rc.4", or "update").
---

Write release notes for the specified version based on all changes since the specified previous release.
Write or update release notes for the specified version based on all changes since the specified previous release.

Follow these steps:
## Modes

1. **Study previous release notes** in the `docs/release_notes/` directory to understand the structure, tone, and style used by the project. Pay attention to:
- Header format and date conventions
- How features are grouped and described
- Level of technical detail per feature
- Breaking changes section format
- Contributors section format
- **Create**: No release notes file exists yet for the version. Build the file from scratch following all steps below.
- **Update**: Argument is `update` (or the release notes file already exists). Treat the existing file as the source of truth for tone, ordering and editorial decisions, and only ADD entries for commits landed on `origin/trunk` since the file was last edited. Do not rewrite or reorder existing sections unless the user asks explicitly.

To detect update mode and find the relevant commit range:

- Check whether `docs/release_notes/<version>.md` exists.
- Find when it was last edited: `git log -1 --format='%H' docs/release_notes/<version>.md`
- New commits to consider: `git fetch origin && git log <last-edit-sha>..origin/trunk --no-merges --format='%h | %an | %s'`

## Steps

1. **Study previous release notes** in `docs/release_notes/` to understand the structure, tone, and style used by the project. Pay attention to:
- Header format and date conventions (e.g. `# Spice vX.Y.Z (Month D, YYYY)`)
- Opening summary paragraph that names the release-defining themes
- Bulleted "Highlights in this release candidate include:" list right under the summary
- How features are grouped and described (subsections under `## What's New`)
- Level of technical detail per feature; preferred prose vs. bullets; example YAML snippets
- Enterprise feature callout style (`> [Spice.ai Enterprise]...` blockquote at the top of the subsection)
- Breaking Changes section format with migration before/after YAML
- Contributors section format (GitHub profile links, alphabetised)
- Upgrading instructions format
- Changelog format with PR links and author attribution

2. **Gather changes** using `git log` between the two version tags:
- Get all non-merge commits with `git log <prev-tag>..<new-tag> --oneline --no-merges` (or `git log <prev-tag>..HEAD` if the new tag does not exist yet)
- Identify contributors with `git log <prev-tag>..<new-tag> --format="%an <%ae>" --no-merges | sort -u` (or `git log <prev-tag>..HEAD` if the new tag does not exist yet)
- Map contributor names to GitHub usernames by cross-referencing previous release notes and email addresses
- Read commit messages for major PRs to understand feature scope and details

3. **Categorize changes** into:
- Major new features (deserve their own subsection with description, key points, and examples)
- Dependency upgrades (presented in a table)
- Other improvements (bullet list of smaller features and fixes)
- Breaking changes (with migration guidance)
- Bug fixes (grouped by area)

4. **Write the release notes** matching the established style:
- Opening summary paragraph highlighting the most important features
- "What's New" section with subsections for each major feature
- Contributors list with GitHub profile links
- Breaking Changes section
- Cookbook Updates section
- Upgrading section with CLI, Homebrew, Docker, Helm, and marketplace instructions
- Changelog section with PR links and author attribution

5. **Filter noise** from the changelog: exclude CI fixes, test snapshot updates, dependabot bumps, internal refactors, and other non-user-facing changes from the summary sections (but include significant ones in the detailed changelog).

Save the release notes as a new markdown file in the release notes directory.
- All non-merge commits: `git log <prev-tag>..<new-tag> --oneline --no-merges` (or `..HEAD` if the new tag does not exist yet; or `<last-edit-sha>..origin/trunk` in update mode).
- For each non-trivial PR, look up the PR title, body and author on GitHub. Commit subjects often lack the user-facing framing the release notes need.
- PR metadata: `gh pr view <num> --json title,body,author,labels`
- Author handle (use this when the commit author name is ambiguous, bot-mangled, or differs from GitHub login): `gh pr view <num> --json author -q '.author.login'`
- Identify contributors: `git log <range> --format='%an <%ae>' --no-merges | sort -u`, then map each to a GitHub username using `gh pr view` on one of their PRs and by cross-referencing previous release notes.

3. **Filter noise**. Exclude entirely from both the narrative and the changelog:
- `dependabot[bot]` and `github-actions[bot]` commits unless they update a user-visible dependency (e.g. DuckDB, Iceberg, Turso) — those go in the dependency table.
- Test/snapshot updates (`fix(tests): ...`, `chore(benchmarks): ...`, `Update snapshots`, `Disable failing ... test in CI`).
- Internal refactors with no user-visible behaviour change (e.g. lint deny attributes, internal trait reshuffles).
- Reverts of changes that never shipped in a prior release.
- `chore: Clean up Cargo.lock`-style housekeeping.
- Significant internal changes (e.g. CI infrastructure rewrites) MAY be included in the detailed changelog at the bottom but never in the highlights or `## What's New`.

4. **Categorize changes** into:
- Major new features (deserve their own `### Subsection` with description, key points, and YAML examples when configuration changes).
- Dependency upgrades (presented in a table at the end of `## What's New`).
- Smaller improvements (bullet list under broader subsections such as `### SQL, Query, and Developer Experience` or `### Caching & Search`).
- Breaking changes (with before/after migration guidance).
- Bug fixes (grouped by area, e.g. `### Connector Bug Fixes`).

5. **Write the release notes** matching the established style:
- `# Spice v<version> (<Month D, YYYY>)` header followed by a one-paragraph summary naming the headline themes.
- `Highlights in this release candidate include:` bullet list.
- `## What's New in v<version>` with subsections for each major feature.
- `## Contributors` with GitHub profile links, alphabetised case-insensitively.
- `## Breaking Changes` (omit if none).
- `## Cookbook Updates` (state "No new cookbook recipes." if none).
- `## Upgrading` with CLI, Homebrew, Docker, Helm, and AWS Marketplace instructions.
- `## What's Changed` → `### Changelog` with one bullet per included PR in the form
`- <title> by [@handle](https://github.com/<handle>) in [#<num>](https://github.com/spiceai/spiceai/pull/<num>)`
- `**Full Changelog**: <https://github.com/spiceai/spiceai/compare/<prev-tag>...<new-tag>>`
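
The changelog bullet and `Full Changelog` line formats in step 5 can be made concrete with a short sketch. The `PullRequest` struct and both function names are hypothetical helpers introduced only to show the string shapes; they are not part of any existing tooling in the repository.

```rust
/// Hypothetical representation of a merged PR; the fields mirror the
/// placeholders in the changelog bullet template.
pub struct PullRequest<'a> {
    pub title: &'a str,
    pub handle: &'a str,
    pub number: u32,
}

/// Renders one changelog bullet in the
/// `- <title> by [@handle](...) in [#<num>](...)` form.
pub fn changelog_entry(pr: &PullRequest<'_>) -> String {
    format!(
        "- {title} by [@{handle}](https://github.com/{handle}) in [#{num}](https://github.com/spiceai/spiceai/pull/{num})",
        title = pr.title,
        handle = pr.handle,
        num = pr.number,
    )
}

/// Renders the trailing compare link between the previous and new tags.
pub fn full_changelog_line(prev_tag: &str, new_tag: &str) -> String {
    format!("**Full Changelog**: <https://github.com/spiceai/spiceai/compare/{prev_tag}...{new_tag}>")
}
```

Keeping the handle and PR number as separate fields makes it harder to mismatch the link text and the link target, which is an easy mistake when writing these bullets by hand.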

## Ordering

Within both the Highlights bullets and the `## What's New` subsections, **keep the two lists in the same relative order** so the reader can move between them without surprise.

Default thematic order for highlights/subsections, top to bottom:

1. **Spice Cayenne** — always first when there is meaningful Cayenne news.
2. Security & TLS (mTLS, auth)
3. CDC sources (MongoDB Change Streams, Kafka offsets, Debezium fixes)
4. DML / write-back (PostgreSQL, Snowflake, Arrow upserts, DuckLake Beta)
5. SQL & UDFs (User-Defined Functions, Spatial SQL UDFs)
6. Runtime features (On-Demand Dataset Loading, SMB client, Unified Cancellation)
7. HTTP / connector improvements (Dynamic HTTP Connector, HTTP rate-control persistence)
8. Acceleration (`refresh_mode: snapshot`, new accelerator features)
9. AI / LLM (Prompt caching, Responses API)
10. Cross-cutting trailing sections inside `## What's New`: Distributed Cluster Improvements → Caching & Search → Security Improvements → SQL/Developer Experience → Connector Bug Fixes → Dependency Updates.

## Project-specific conventions

- The product surface name is **Spice Cayenne** in narrative prose (highlights, opening paragraph). Inside subsections about Cayenne internals, plain "Cayenne" is fine after the first mention.
- [@claudespice](https://github.com/claudespice) is a bot and **must not** appear in the `## Contributors` section. It may appear in the `### Changelog` author attribution because that follows the PR's actual author.
- Use `## What's Changed` then `### Changelog` (not `## Changelog`) to match the GitHub auto-generated layout that prior releases mirror.
- Verify each PR is referenced at most once in the changelog and at most once in the narrative `## What's New`. Use `grep -c '#<num>' <file>` to spot-check.
- When updating an existing file, also update the **Contributors** list if the new commits introduce a new author. Skip bots and `claudespice`.
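
The contributor rules above (skip bots and `claudespice`, list each author once, alphabetise case-insensitively) can be sketched as a single pass over the collected handles. The function name `contributors_section` and the exact bullet format are assumptions, and treating any handle ending in `[bot]` as a bot is a heuristic inferred from the `dependabot[bot]` and `github-actions[bot]` examples.

```rust
use std::collections::BTreeMap;

/// Builds the bullet lines for a contributors list from raw GitHub handles.
pub fn contributors_section(handles: &[&str]) -> Vec<String> {
    // Keying by the lowercased handle both deduplicates entries and yields
    // case-insensitive alphabetical order when the map is iterated.
    let mut by_key: BTreeMap<String, &str> = BTreeMap::new();
    for &handle in handles {
        let key = handle.to_lowercase();
        // Heuristic: `[bot]` suffix marks automation accounts; `claudespice`
        // is excluded explicitly per the conventions above.
        let is_bot = key.ends_with("[bot]") || key == "claudespice";
        if !is_bot {
            // Keep the first spelling seen for a given handle.
            by_key.entry(key).or_insert(handle);
        }
    }
    by_key
        .into_values()
        .map(|h| format!("- [@{h}](https://github.com/{h})"))
        .collect()
}
```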

## Documentation links

Link feature names in subsections to the appropriate documentation host. Pick the host based on where the feature is documented, not by audience:

- **OSS / runtime features** → `https://spiceai.org/docs` (e.g. `https://spiceai.org/docs/components/data-connectors/postgres`).
- **Spice.ai Cloud** → `https://docs.spice.ai/docs` (e.g. `https://docs.spice.ai/docs/api/sql`).
- **Spice.ai Enterprise** → `https://docs.spice.ai/docs/enterprise` (e.g. `https://docs.spice.ai/docs/enterprise/features/distributed-accelerations`). Enterprise subsections also get the `> [Spice.ai Enterprise](https://docs.spice.ai/docs/enterprise) feature. See [...](<deep-link>).` blockquote at the top.

Verify any new doc link you introduce actually resolves. If a deep link cannot be confirmed, link to the section root instead.

## Output

Save the release notes as `docs/release_notes/v<version>.md`. In update mode, edit the existing file in place and commit with a `docs(release): update v<version> notes with latest trunk PRs` style message.