[pull] trunk from spiceai:trunk by pull[bot] · Pull Request #776 · TheRakeshPurohit/spiceai

pull · 2026-04-25T21:06:14Z

See Commits and Changes for more details.

Created by pull[bot] (v2.0.0-alpha.4)

Can you help keep this open source service alive? 💖 Please sponsor : )

…ddingTable (#10520) * fix: delegate update and delete_from in IndexedTableProvider and EmbeddingTable Both wrappers delegated insert_into to the underlying provider but omitted update and delete_from, causing those operations to fall through to DataFusion's default "not supported" error even when the underlying table (e.g. DuckDB, Cayenne) supports them. Since these wrappers form the outermost layer of the provider chain (IndexedTableProvider → EmbeddingTable → AcceleratedTable), the missing delegations blocked UPDATE and DELETE for all indexed or embedding-augmented tables. * style: apply cargo fmt

…10489) * feat(devx): make config errors, CLI, and REPL lead users to success Plumbs existing infrastructure (ParameterSpec docs fields, util::levenshtein, the generated spicepod JSON schema) into the user-facing surfaces so a typo in a param, a missing secret reference, or an unknown engine produces an actionable error instead of a silent drop or a generic "missing required parameter". CLI & init - `spice init` writes a `yaml-language-server: $schema=...` directive and prints next-step hints; creation message now prints regardless of log level. - Version banner moves to stderr and only prints on an interactive stderr. - Log filter honours RUST_LOG when `-v` is not passed. - 403 responses route to a new `PermissionDenied` variant; Unauthorized message points at `spice login` / `SPICE_API_KEY`. New command: `spice validate [path]` - Loads a spicepod without starting the runtime; prints component counts on success and a structured failure on parse/reference errors. Same codepath as the existing `tools/spicepod-validator`. Parameter + connector errors - Unknown parameter names now emit a Levenshtein-based "did you mean?" hint using the in-tree `util::levenshtein` helper. - "Missing required parameter" errors include the spec's description, first example, and help_link (docs URL) when provided. - UnknownEngine / UnknownDataConnector / UnknownCatalogConnector / UnknownSecretStore messages list the valid values (and for connectors, include a Levenshtein suggestion) plus a docs URL. - Added help_link / examples / one_of to high-value connector params: postgres, mysql, s3, kafka, duckdb, snowflake, databricks. - Improved wording on Snowflake/Postgres/Kafka bare driver errors. REPL - Welcome banner shows concrete examples; `help` / `?` lists meta-commands, SQL shortcuts, and nql usage. - New shims for `show schemas` / `show databases` / `describe <table>` (both bare and schema-qualified names). - `.clear` screen-wipe skipped when not on a TTY. Colour output - `ansi_colors::Painted` now honours NO_COLOR / FORCE_COLOR and falls back to `stdout.is_terminal()`; internal tests force it on via cfg(test). - Replaced raw ANSI escapes in `dataset configure` with Color helpers so piped output and CI logs stay clean. Secret resolution - Log messages on secret-ref lookup failure now include the full `${ store:KEY }` form, the owning param string, the list of configured stores, and a docs URL. Store-init failure gets the same treatment. Startup summary - After dispatching dataset loads, runtime logs a count of dispatched/failed. A detached task emits a follow-up one-liner at 30s with ready/errored/ initializing counts (aborts on shutdown). * fix(devx): address Copilot review comments and cargo fmt - REPL `.clear`: gate on `stdout().is_terminal()` directly instead of the colour heuristic. NO_COLOR should not suppress clear-screen on a TTY, and FORCE_COLOR should not push it into a pipe. - REPL `describe <table>`: match the keyword case-insensitively but keep the identifier token in its original case so quoted/mixed-case table names work, and escape single quotes (SQL `''`) before interpolating into the information_schema lookup. - `UnknownSecretStore`: build the available-stores list from cfg (adds `scheduler_rpc`, drops `keyring` / `aws_secrets_manager` when the feature is not enabled) via a new `known_secret_stores()` helper. - Apply `cargo fmt --all` (was failing the Rust Lint check). * fix(devx): avoid "failed"/"error" in healthy-startup INFO logs Quickstart CI asserts that spice.log from a clean start contains no `(failed|error)` tokens (case-insensitive) — that's its smoke-test for "nothing went wrong". The new startup summary broke this: INFO Loading datasets: 2 dispatched, 0 failed accelerator init (of 2 total). INFO Dataset load summary (after 30s): 3/3 ready, 0 errored, 0 still initializing. Both lines fire on every healthy start, and both contain banned tokens. Reword both so a healthy run is clean: - `N failed accelerator init` → `N skipped at accelerator init` - `M errored` → `M unhealthy` Per-dataset failures are still logged at WARN by `load_dataset`, so the smoke-test still catches real problems. * fix(devx): address second round of Copilot review comments Code fixes: - `runtime/src/init/dataset.rs`: skip the 30s follow-up sampler task entirely when there are no datasets — no point waiting just to log nothing. - `crates/repl/src/lib.rs`: `describe schema."MyTable";` now strips quotes from each identifier segment independently, so the generated `information_schema.columns` filter uses the unquoted table name. Testability refactor: - Extract the pure Levenshtein-scoring logic from `suggest_connector` into `dataconnector::closest_name(typo, candidates)`, reused by `catalogconnector::suggest_catalog_connector`. This keeps the async registry lookup and the scoring concerns separate so the scoring can be unit-tested directly. Tests added: - `runtime-parameters`: `closest_param_suggestion` close-typo match, distant-typo miss, and deprecated-spec skip; `describe_missing_parameter` empty/description-only/full-suffix/example-only. - `runtime/dataconnector::tests`: `closest_name` one-char typo, case insensitivity, distant returns None, empty candidates, and short-name threshold boundary. - `bin/spice/src/commands/validate.rs`: file path, directory containing spicepod, missing path, and invalid YAML — using `tempfile`. * fix(devx): cargo fmt validate.rs * fix(devx): address third round of Copilot review comments - `ansi_colors`: `Painted` now carries a `Target` (Stdout/Stderr). The TTY check is cached per stream, so a redirected stderr stays clean when stdout is a TTY (and vice versa). New `Color::paint_err()` returns a Painted tagged for stderr; `Color::paint()` stays stdout-targeted as before. Added 2 unit tests. - `spice validate`: the "Invalid:" prefix on the stderr failure path now uses `paint_err()` so it doesn't inject escapes into a redirected stderr just because stdout is a TTY. - `runtime-secrets::get_store_secret`: allocate the configured-stores list lazily inside the two error branches that actually consume it, instead of on every substitution. `inject_secrets` calls this in a tight loop, and the success path no longer pays for an allocation it discards. - `runtime::dataconnector::closest_name` doc comment: corrected the example from the parameter-typo case (`postgress_host` → `postgres_host`) to the connector-name case this helper actually handles (`postgress` → `postgres`). - `connector-mysql::sslmode`: dropped the overly-restrictive `one_of` that rejected modes supported by the underlying driver, and rewrote the `sslrootcert` description to stop referencing the Postgres-only verify-ca/verify-full mode names. * fix(ansi_colors): add #[must_use] to colors_enabled[_for] Workspace lint level is pedantic, which treats must_use_candidate as an error. Adding the attribute satisfies clippy without suppressing the lint. * fix(devx): address fourth round of Copilot review comments - `runtime-parameters::describe_missing_parameter`: rewrite as a `Vec<String>` joined with `. ` separators + a `. ` prefix. Previously an example-only spec produced a dangling leading space (`Missing required parameter: pg_host Example: ...`). Now every non-empty subset reads as a proper trailing sentence. Also strips an author-supplied trailing period from `description` so we don't concatenate `..`, and adds `description_without_trailing_period` / `help_link_only` tests. - `bin/spice/src/commands/validate.rs`: doc comment on `load_pod` now accurately describes the `Err(_)` branch as "metadata fails (not-found, permission-denied, …)", matching what the code actually does — falling through to `Spicepod::load_exact` lets the open-file step surface the real underlying I/O error with the path attached. * refactor: improve comments and logging for clarity and consistency * refactor: streamline warning log for unsupported parameters * fix(runtime): improve logging message for dataset loading summary * fix(validate): improve error handling for missing and invalid YAML files * feat(parameters): add support for case-insensitive option in parameter specifications * test(kafka): add unit test for uppercase values in security protocol

…tion pushdown (#10514) * fix(rerank): defer execution to RerankExec, add filter/projection pushdown * Add tests * Make tests more robust * feat(rerank): add task_history tracing to rerank UDTF (#10515) * feat(rerank): add task_history tracing to rerank UDTF * Update * Update snapshot * fix: use NEG_INFINITY for NULL document rows in rerank UDTF (#10512) * fix: use NEG_INFINITY for NULL document rows in rerank UDTF NULL document rows were assigned rerank_score = 0.0 when bypassed by the reranker. This causes NULL rows to sort above valid documents when a reranker (e.g. Cohere cross-encoder, BYO HTTP endpoint) returns negative scores, corrupting the ranking output. Use f32::NEG_INFINITY instead so NULL rows always sort last in DESC order regardless of the reranker's score range. * style: rustfmt the new null_rows_sort_below_negative_scores test --------- Co-authored-by: claudespice <claude@spice.ai> Co-authored-by: Luke Kim <80174+lukekim@users.noreply.github.com>

@lukekim

…eld (#10523) * fix(llms): bump mistralrs to include gemma attention_bias fix Bumps mistralrs/mistralrs-core to the lukim/gemma-attention-bias-optional branch (spiceai/mistral.rs#36) which makes attention_bias optional in gemma and gemma2 model configs. Without this, loading Gemma models from HuggingFace whose config.json omits attention_bias fails with: missing field `attention_bias` at line 197 column 1 Also adds a live integration test (huggingface_test_gemma_chat_completion) mirroring the existing Llama HF test, exercising google/gemma-2-2b-it. * Apply suggestion from @lukekim * fix(mistralrs): update git source references to latest commit * fix(hf): update Gemma model and type to latest version * fix(hf): refactor huggingface test for gemma chat completion to improve readability and maintainability * Update crates/runtime/tests/models/hf.rs Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> * fix(hf): update Gemma model and snapshot for chat completion test * fix(hf): update dependencies and add integration test for Gemma 4 model * fix(hf): enhance loader configuration for multimodal support in MistralLlama * fix(hf): refactor GGUF loading logic for improved clarity and maintainability * test(hf): add snapshot for Gemma 4 chat completion response * fix(hf): add backticks to HuggingFace in doc comments to satisfy clippy::doc_markdown * docs(hf): document supported model_type architectures including gemma4 and multimodal --------- Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

…args (#10527) * Fix vector_search silently ignoring named limit/column/include_score args * Fix lint

claudespice and others added 5 commits April 25, 2026 17:35

Fix vector_search silently ignoring named limit/column/include_score …

c93fb72

…args (#10527) * Fix vector_search silently ignoring named limit/column/include_score args * Fix lint

pull Bot locked and limited conversation to collaborators Apr 25, 2026

pull Bot added the ⤵️ pull label Apr 25, 2026

pull Bot merged commit c93fb72 into TheRakeshPurohit:trunk Apr 25, 2026
2 of 14 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[pull] trunk from spiceai:trunk#776

[pull] trunk from spiceai:trunk#776
pull[bot] merged 5 commits into
TheRakeshPurohit:trunkfrom
spiceai:trunk

pull Bot commented Apr 25, 2026 •

edited

Loading

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

pull Bot commented Apr 25, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

pull Bot commented Apr 25, 2026 •

edited

Loading