Skip to content

[pull] trunk from spiceai:trunk#776

Merged
pull[bot] merged 5 commits into
TheRakeshPurohit:trunkfrom
spiceai:trunk
Apr 25, 2026
Merged

[pull] trunk from spiceai:trunk#776
pull[bot] merged 5 commits into
TheRakeshPurohit:trunkfrom
spiceai:trunk

Conversation

@pull
Copy link
Copy Markdown

@pull pull Bot commented Apr 25, 2026

See Commits and Changes for more details.


Created by pull[bot] (v2.0.0-alpha.4)

Can you help keep this open source service alive? 💖 Please sponsor : )

claudespice and others added 5 commits April 25, 2026 17:35
…ddingTable (#10520)

* fix: delegate update and delete_from in IndexedTableProvider and EmbeddingTable

Both wrappers delegated insert_into to the underlying provider but
omitted update and delete_from, causing those operations to fall
through to DataFusion's default "not supported" error even when the
underlying table (e.g. DuckDB, Cayenne) supports them.

Since these wrappers form the outermost layer of the provider chain
(IndexedTableProvider → EmbeddingTable → AcceleratedTable), the
missing delegations blocked UPDATE and DELETE for all indexed or
embedding-augmented tables.

* style: apply cargo fmt
…10489)

* feat(devx): make config errors, CLI, and REPL lead users to success

Plumbs existing infrastructure (ParameterSpec docs fields, util::levenshtein,
the generated spicepod JSON schema) into the user-facing surfaces so a typo in
a param, a missing secret reference, or an unknown engine produces an actionable
error instead of a silent drop or a generic "missing required parameter".

CLI & init
- `spice init` writes a `yaml-language-server: $schema=...` directive and
  prints next-step hints; creation message now prints regardless of log level.
- Version banner moves to stderr and only prints on an interactive stderr.
- Log filter honours RUST_LOG when `-v` is not passed.
- 403 responses route to a new `PermissionDenied` variant; Unauthorized
  message points at `spice login` / `SPICE_API_KEY`.

New command: `spice validate [path]`
- Loads a spicepod without starting the runtime; prints component counts on
  success and a structured failure on parse/reference errors. Same codepath as
  the existing `tools/spicepod-validator`.

Parameter + connector errors
- Unknown parameter names now emit a Levenshtein-based "did you mean?" hint
  using the in-tree `util::levenshtein` helper.
- "Missing required parameter" errors include the spec's description, first
  example, and help_link (docs URL) when provided.
- UnknownEngine / UnknownDataConnector / UnknownCatalogConnector /
  UnknownSecretStore messages list the valid values (and for connectors,
  include a Levenshtein suggestion) plus a docs URL.
- Added help_link / examples / one_of to high-value connector params:
  postgres, mysql, s3, kafka, duckdb, snowflake, databricks.
- Improved wording on Snowflake/Postgres/Kafka bare driver errors.

REPL
- Welcome banner shows concrete examples; `help` / `?` lists meta-commands,
  SQL shortcuts, and nql usage.
- New shims for `show schemas` / `show databases` / `describe <table>` (both
  bare and schema-qualified names).
- `.clear` screen-wipe skipped when not on a TTY.

Colour output
- `ansi_colors::Painted` now honours NO_COLOR / FORCE_COLOR and falls back to
  `stdout.is_terminal()`; internal tests force it on via cfg(test).
- Replaced raw ANSI escapes in `dataset configure` with Color helpers so piped
  output and CI logs stay clean.

Secret resolution
- Log messages on secret-ref lookup failure now include the full
  `${ store:KEY }` form, the owning param string, the list of configured
  stores, and a docs URL. Store-init failure gets the same treatment.

Startup summary
- After dispatching dataset loads, runtime logs a count of dispatched/failed.
  A detached task emits a follow-up one-liner at 30s with ready/errored/
  initializing counts (aborts on shutdown).

* fix(devx): address Copilot review comments and cargo fmt

- REPL `.clear`: gate on `stdout().is_terminal()` directly instead of the
  colour heuristic. NO_COLOR should not suppress clear-screen on a TTY, and
  FORCE_COLOR should not push it into a pipe.
- REPL `describe <table>`: match the keyword case-insensitively but keep the
  identifier token in its original case so quoted/mixed-case table names work,
  and escape single quotes (SQL `''`) before interpolating into the
  information_schema lookup.
- `UnknownSecretStore`: build the available-stores list from cfg (adds
  `scheduler_rpc`, drops `keyring` / `aws_secrets_manager` when the feature
  is not enabled) via a new `known_secret_stores()` helper.
- Apply `cargo fmt --all` (was failing the Rust Lint check).

* fix(devx): avoid "failed"/"error" in healthy-startup INFO logs

Quickstart CI asserts that spice.log from a clean start contains no
`(failed|error)` tokens (case-insensitive) — that's its smoke-test for
"nothing went wrong". The new startup summary broke this:

  INFO Loading datasets: 2 dispatched, 0 failed accelerator init (of 2 total).
  INFO Dataset load summary (after 30s): 3/3 ready, 0 errored, 0 still initializing.

Both lines fire on every healthy start, and both contain banned tokens.

Reword both so a healthy run is clean:
- `N failed accelerator init` → `N skipped at accelerator init`
- `M errored` → `M unhealthy`

Per-dataset failures are still logged at WARN by `load_dataset`, so the
smoke-test still catches real problems.

* fix(devx): address second round of Copilot review comments

Code fixes:
- `runtime/src/init/dataset.rs`: skip the 30s follow-up sampler task entirely
  when there are no datasets — no point waiting just to log nothing.
- `crates/repl/src/lib.rs`: `describe schema."MyTable";` now strips quotes
  from each identifier segment independently, so the generated
  `information_schema.columns` filter uses the unquoted table name.

Testability refactor:
- Extract the pure Levenshtein-scoring logic from `suggest_connector` into
  `dataconnector::closest_name(typo, candidates)`, reused by
  `catalogconnector::suggest_catalog_connector`. This keeps the async
  registry lookup and the scoring concerns separate so the scoring can be
  unit-tested directly.

Tests added:
- `runtime-parameters`: `closest_param_suggestion` close-typo match,
  distant-typo miss, and deprecated-spec skip; `describe_missing_parameter`
  empty/description-only/full-suffix/example-only.
- `runtime/dataconnector::tests`: `closest_name` one-char typo, case
  insensitivity, distant returns None, empty candidates, and short-name
  threshold boundary.
- `bin/spice/src/commands/validate.rs`: file path, directory containing
  spicepod, missing path, and invalid YAML — using `tempfile`.

* fix(devx): cargo fmt validate.rs

* fix(devx): address third round of Copilot review comments

- `ansi_colors`: `Painted` now carries a `Target` (Stdout/Stderr). The TTY
  check is cached per stream, so a redirected stderr stays clean when stdout
  is a TTY (and vice versa). New `Color::paint_err()` returns a Painted
  tagged for stderr; `Color::paint()` stays stdout-targeted as before.
  Added 2 unit tests.
- `spice validate`: the "Invalid:" prefix on the stderr failure path now uses
  `paint_err()` so it doesn't inject escapes into a redirected stderr just
  because stdout is a TTY.
- `runtime-secrets::get_store_secret`: allocate the configured-stores list
  lazily inside the two error branches that actually consume it, instead of
  on every substitution. `inject_secrets` calls this in a tight loop, and
  the success path no longer pays for an allocation it discards.
- `runtime::dataconnector::closest_name` doc comment: corrected the example
  from the parameter-typo case (`postgress_host` → `postgres_host`) to the
  connector-name case this helper actually handles (`postgress` → `postgres`).
- `connector-mysql::sslmode`: dropped the overly-restrictive `one_of` that
  rejected modes supported by the underlying driver, and rewrote the
  `sslrootcert` description to stop referencing the Postgres-only
  verify-ca/verify-full mode names.

* fix(ansi_colors): add #[must_use] to colors_enabled[_for]

Workspace lint level is pedantic, which treats must_use_candidate as an
error. Adding the attribute satisfies clippy without suppressing the lint.

* fix(devx): address fourth round of Copilot review comments

- `runtime-parameters::describe_missing_parameter`: rewrite as a `Vec<String>`
  joined with `. ` separators + a `. ` prefix. Previously an example-only spec
  produced a dangling leading space (`Missing required parameter: pg_host
  Example: ...`). Now every non-empty subset reads as a proper trailing
  sentence. Also strips an author-supplied trailing period from `description`
  so we don't concatenate `..`, and adds `description_without_trailing_period`
  / `help_link_only` tests.
- `bin/spice/src/commands/validate.rs`: doc comment on `load_pod` now
  accurately describes the `Err(_)` branch as "metadata fails (not-found,
  permission-denied, …)", matching what the code actually does — falling
  through to `Spicepod::load_exact` lets the open-file step surface the real
  underlying I/O error with the path attached.

* refactor: improve comments and logging for clarity and consistency

* refactor: streamline warning log for unsupported parameters

* fix(runtime): improve logging message for dataset loading summary

* fix(validate): improve error handling for missing and invalid YAML files

* feat(parameters): add support for case-insensitive option in parameter specifications

* test(kafka): add unit test for uppercase values in security protocol
…tion pushdown (#10514)

* fix(rerank): defer execution to RerankExec, add filter/projection pushdown

* Add tests

* Make tests more robust

* feat(rerank): add task_history tracing to rerank UDTF (#10515)

* feat(rerank): add task_history tracing to rerank UDTF

* Update

* Update snapshot

* fix: use NEG_INFINITY for NULL document rows in rerank UDTF (#10512)

* fix: use NEG_INFINITY for NULL document rows in rerank UDTF

NULL document rows were assigned rerank_score = 0.0 when bypassed by the
reranker. This causes NULL rows to sort above valid documents when a
reranker (e.g. Cohere cross-encoder, BYO HTTP endpoint) returns negative
scores, corrupting the ranking output.

Use f32::NEG_INFINITY instead so NULL rows always sort last in DESC order
regardless of the reranker's score range.

* style: rustfmt the new null_rows_sort_below_negative_scores test

---------

Co-authored-by: claudespice <claude@spice.ai>
Co-authored-by: Luke Kim <80174+lukekim@users.noreply.github.com>
…eld (#10523)

* fix(llms): bump mistralrs to include gemma attention_bias fix

Bumps mistralrs/mistralrs-core to the lukim/gemma-attention-bias-optional
branch (spiceai/mistral.rs#36) which makes attention_bias optional in
gemma and gemma2 model configs. Without this, loading Gemma models from
HuggingFace whose config.json omits attention_bias fails with:

  missing field `attention_bias` at line 197 column 1

Also adds a live integration test (huggingface_test_gemma_chat_completion)
mirroring the existing Llama HF test, exercising google/gemma-2-2b-it.

* Apply suggestion from @lukekim

* fix(mistralrs): update git source references to latest commit

* fix(hf): update Gemma model and type to latest version

* fix(hf): refactor huggingface test for gemma chat completion to improve readability and maintainability

* Update crates/runtime/tests/models/hf.rs

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* fix(hf): update Gemma model and snapshot for chat completion test

* fix(hf): update dependencies and add integration test for Gemma 4 model

* fix(hf): enhance loader configuration for multimodal support in MistralLlama

* fix(hf): refactor GGUF loading logic for improved clarity and maintainability

* test(hf): add snapshot for Gemma 4 chat completion response

* fix(hf): add backticks to HuggingFace in doc comments to satisfy clippy::doc_markdown

* docs(hf): document supported model_type architectures including gemma4 and multimodal

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
…args (#10527)

* Fix vector_search silently ignoring named limit/column/include_score args

* Fix lint
@pull pull Bot locked and limited conversation to collaborators Apr 25, 2026
@pull pull Bot added the ⤵️ pull label Apr 25, 2026
@pull pull Bot merged commit c93fb72 into TheRakeshPurohit:trunk Apr 25, 2026
2 of 14 checks passed
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants