[pull] trunk from spiceai:trunk by pull[bot] · Pull Request #829 · TheRakeshPurohit/spiceai

pull · 2026-05-13T09:06:20Z

See Commits and Changes for more details.

Created by pull[bot] (v2.0.0-alpha.4)

Can you help keep this open source service alive? 💖 Please sponsor : )

* fix: show pushed-down `limit` in `tree` EXPLAIN of `DataSourceExec` Upstream DataFusion's `FileScanConfig::fmt_as` already emits `limit=N` for the default/indent format but skips it in `TreeRender`, so a scan that received a fetch limit looked identical to one without in `EXPLAIN FORMAT TREE`. Add `LimitDisplayDataSource`, a transparent `DataSource` wrapper that appends `limit=N` to the tree output when the inner source reports a fetch, and register `DataSourceTreeDisplayOptimizer` to apply the wrap after limit pushdown. `as_any` passes through so downcasts to `FileScanConfig` keep working. Fixes #10602 https://claude.ai/code/session_01AXt4ypyW8znHu1bR5KMjUJ * chore: update copyright header to 2026 Match convention used in other files created in 2026. https://claude.ai/code/session_01AXt4ypyW8znHu1bR5KMjUJ * test(retention): poll for retention SQL to apply before snapshotting The retention task ticks on a 200ms background interval, so the row count after `refresh_table().await` is racy — on slow runners the immediate query observes the unfiltered rows and the snapshot fails. Replace the bare query with a `wait_for_retention_applied` helper that polls every 100ms (10 s budget) until the count matches the snapshot's expected row count, mirroring the explicit sleep used in the unit-test version of retention. https://claude.ai/code/session_01AXt4ypyW8znHu1bR5KMjUJ * test(retention): bound retention poll loop by `timeout_at` Address review feedback on the polling helper: - A slow query could exceed the advertised `timeout` since the deadline was only checked between attempts. - The sleep was not capped by the deadline, so the loop could sleep past it before noticing. Wrap the entire loop in `tokio::time::timeout_at(deadline, …)`. The runtime cancels both the in-flight query and the inter-attempt sleep when the deadline elapses, guaranteeing the function returns within `timeout`. https://claude.ai/code/session_01AXt4ypyW8znHu1bR5KMjUJ * ci(graceful-shutdown-append): tolerate transient parquet-rename races `simulate_append_data.sh` republishes `.spice/service_data.parquet` every 3s by writing a `.tmp` file and `mv -f`-ing it into place. A refresh tick (running on a 1s interval) that observes the rename mid-flight reads a half-published file and the DuckDB reader emits a TProtocolException, which the refresh task logs at WARN and then retries successfully on the next tick. The `Check logs` step's `grep -iE "(failed|error)"` is a case-insensitive substring match, so it treats the transient WARN as a hard failure — even though the runtime handled it correctly and the workload ran to completion. Filter that specific WARN (and the chained `Invalid Error: TProtocolException` line) out of the grep so the step only trips on genuine errors. Scoped to `graceful_shutdown_test_append` since it's the only job that runs against the appending-data simulator. https://claude.ai/code/session_01AXt4ypyW8znHu1bR5KMjUJ * test(limit-display): assert `limit=` lands on its own TreeRender line Strengthen the unit test to verify the TreeRender contract — each `key=value` on its own newline-terminated line — by asserting that `limit=2` appears with a leading `\n` and is not glued to a sibling field. This guards against accidentally regressing the format if a future DataSource impl forgets the trailing `writeln!`. https://claude.ai/code/session_01AXt4ypyW8znHu1bR5KMjUJ * bench(limit-display): measure `DataSourceTreeDisplayOptimizer` cost per query Add a criterion microbenchmark sweeping plan width (1, 10, 50, 100 `DataSourceExec` nodes via `UnionExec`) and the cold (no fetch limit on any scan) vs worst-case (every scan has a fetch limit) paths. The former is the hot path on virtually all queries — every node short- circuits via `Transformed::no` after one `downcast_ref` + a single `fetch().is_none()` check. The latter exercises the wrapping path where each node also pays for an `Arc::clone` of the inner source plus a `LimitDisplayDataSource` allocation and a `with_data_source` swap. Lets reviewers and future maintainers verify the always-on cost stays within a small fraction of typical query planning time. https://claude.ai/code/session_01AXt4ypyW8znHu1bR5KMjUJ * Lint * Lint * Fix test * Fix --------- Co-authored-by: Claude <noreply@anthropic.com> Co-authored-by: Viktor Yershov <viktor@spice.ai>

…#10757) * docs: add product security audit report Workspace-wide audit of auth, secrets, HTTP/Flight surface, AI/LLM tooling, data connectors, and the container/build pipeline. Two critical findings (non-constant-time API key compare; SSRF in remote UDF endpoint validation), four high (MySQL preferred-mode TLS bypass, Vault tls_skip_verify on remote hosts, AWS default credential chain unverified at init, no auth-failure rate limiting), and a set of medium/low hardening items. Each finding cites file:line and a fix. https://claude.ai/code/session_01Xu4b9TS1UvGu4H1a7NxHKj * fix(security): address critical findings from security audit (C1, C2) C1 — runtime-auth API key allowlist iteration The per-comparison code was already constant-time: ApiKey's PartialEq<str> uses subtle::ct_eq, so per-byte timing does not leak key bytes. However, HTTP, Flight basic, Flight bearer, and gRPC paths all used find / any to search the configured key list, which short-circuits on the first matching key. That leaks the *position* of the matching key (1 comparison vs N) and the existence of any match. Replace find / any with a lookup helper that iterates over every configured key on every call without short-circuiting; total verify time is O(N) regardless of which key (if any) matched. Add tests covering matches at first / middle / last positions and empty-credential rejection on the Flight basic-auth path. C2 — remote UDF endpoint SSRF parse_endpoint previously validated only the URL scheme; any operator- or LLM-influenced UDF declaration could target loopback, RFC1918, or the cloud metadata service via 169.254.169.254 — a practical IAM credential exfiltration path on any cloud deployment. parse_endpoint now classifies the host: * IPv4 literals are rejected if loopback, RFC1918, link-local (covers AWS / GCP / Azure IMDS), multicast, broadcast, unspecified, or carrier-grade NAT (100.64.0.0/10). * IPv6 literals are rejected if loopback, unspecified, multicast, link-local (fe80::/10), unique-local (fc00::/7, covers fd00:ec2::254), or IPv4-mapped onto any of the above ranges. * A short list of conventional metadata hostnames (localhost, metadata, metadata.google.internal) is also blocked. Arbitrary hostnames are still allowed; resolving DNS at builder time introduces a TOCTOU/DNS-rebinding hole, so connect-time enforcement is tracked as follow-up work. * A new allow_private_endpoints: true param opts back in for trusted on-host services. Documented in the file header. Tests cover the default rejection across 12 representative private addresses (loopback, RFC1918, IMDS, IPv6 ULA, IPv4-mapped IPv6, etc.), the explicit opt-in path, public-address acceptance, and an invalid-opt-in-value rejection. Existing round-trip tests that bind a mock HTTP server on 127.0.0.1 are routed through a new with_loopback_optin helper so they keep working. Also update docs/security_audit.md to reflect what was fixed and correct the earlier claim that the per-comparison code was not constant-time. https://claude.ai/code/session_01Xu4b9TS1UvGu4H1a7NxHKj * docs: remove security audit report The audit report was a working artifact; the actionable items live in the code changes (constant-time-position API key lookup, remote-UDF SSRF allowlist) plus their tests. Remaining findings can be tracked as issues rather than as in-tree documentation. https://claude.ai/code/session_01Xu4b9TS1UvGu4H1a7NxHKj * fix: address security PR review feedback * fix: improve security checks for remote UDF endpoints and refactor IP validation logic * fix: resolve ownership issues in IP validation logic for disallowed addresses * fix: use CIDR allowlist for remote UDF endpoints * fix: address final security review comments * fix(security): reject pickle model weights by default; enforce RUSTSEC advisories in CI Two follow-ups motivated by CVE-2026-7482 (Ollama heap OOB read in GGUF tensor parsing). The CVE itself does not affect this codebase — Ollama is not a dependency — but the bug class (malformed binary blob handed to a complex parser) does apply to several places spiceai consumes. 1. Pickle weight rejection -------------------------------------------------------------------- The .pt / .pth / .ckpt / .bin extensions are conventionally Python pickle in PyTorch's ecosystem, and pickle deserialization is RCE by design. The local-LLM file pipeline previously accepted whatever the operator pointed at and handed it straight to mistralrs's loader. Operators commonly download model weights from third-party Hugging Face repos, so a malicious .bin in such a repo would have run arbitrary Rust process code at load time. Add `llms::chat::reject_unsafe_weight_formats` and call it from `runtime::model::chat::file` before invoking `create_local_model`. The check refuses pickle-class extensions by default and accepts a new `params.trust_pickle: true` opt-in for operators who control the source. The check is case-insensitive and covers the four conventional pickle extensions. Five unit tests cover rejection, acceptance of safe formats, opt-in, case-insensitivity, and extensionless paths. 2. Enforce RUSTSEC advisories in CI -------------------------------------------------------------------- The `[advisories]` block in deny.toml was schema-pre-v2 and CI only ran `cargo deny check licenses`. Upgrade the block to `version = 2` (cargo-deny v2 default semantics: vulnerability / unmaintained / unsound / notice all fail; yanked already explicitly denied), and add a second step in `license_check.yml` that runs `cargo deny check advisories`. This intentionally turns on enforcement without pre-populating `ignore` — the workspace currently has known findings (yanked crates, unmaintained warnings, and a small number of actual CVEs in transitive deps) that the team will want to triage explicitly rather than silently bypass. https://claude.ai/code/session_01Xu4b9TS1UvGu4H1a7NxHKj * fix: tighten remote UDF endpoint allowlist handling * fix: correct typo in comment regarding PyTorch pickle file extensions * fix: update dependencies in Cargo.lock and enhance ApiKeyAuth implementation * fix: ungate Path import so reject_unsafe_weight_formats builds without local_llm reject_unsafe_weight_formats and the in-module tests use Path / PathBuf unconditionally, but the imports were gated behind feature = "local_llm". The runtime integration test build (no local_llm feature) failed with E0425 "cannot find type `Path` in this scope". Ungate Path, gate PathBuf to local_llm or test so it isn't unused in lib builds. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix: downgrade getrandom dependency from 0.4.1 to 0.3.4 * fix: filter remote UDF DNS results * fix: update cargo-deny installation command and improve IP address checks for non-public endpoints * Potential fix for pull request finding Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com> * fix: specify cargo-deny version and improve error messages in remote functions * fix: update global IP address check to allow 192.0.0.0/24 range --------- Co-authored-by: Claude <noreply@anthropic.com> Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>

* Remote permanent errors from DynamoDB Streams * Address feedback * Lint * Fix max backoff * Apply suggestions from code review Co-authored-by: Phillip LeBlanc <879445+phillipleblanc@users.noreply.github.com> --------- Co-authored-by: Phillip LeBlanc <879445+phillipleblanc@users.noreply.github.com>

lukekim and others added 3 commits May 13, 2026 04:01

pull Bot locked and limited conversation to collaborators May 13, 2026

pull Bot added the ⤵️ pull label May 13, 2026

pull Bot merged commit de81ad3 into TheRakeshPurohit:trunk May 13, 2026
3 of 29 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[pull] trunk from spiceai:trunk#829

[pull] trunk from spiceai:trunk#829
pull[bot] merged 3 commits into
TheRakeshPurohit:trunkfrom
spiceai:trunk

pull Bot commented May 13, 2026 •

edited

Loading

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

pull Bot commented May 13, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

pull Bot commented May 13, 2026 •

edited

Loading