
[pull] trunk from spiceai:trunk #50

Merged: pull[bot] merged 3 commits into TheRakeshPurohit:trunk from spiceai:trunk on Apr 24, 2025

@pull pull Bot commented Apr 24, 2025

See Commits and Changes for more details.


Created by pull[bot] (v2.0.0-alpha.1)


github-actions Bot and others added 3 commits April 24, 2025 12:35
* fix: Update the tpch benchmark snapshots for: ./test/spicepods/tpch/sf1/federated/mysql.yaml

* trigger ci to run

* fix: Update the tpch benchmark snapshots for: ./test/spicepods/tpch/sf1/federated/file[parquet].yaml

* fix: Update the tpch benchmark snapshots for: ./test/spicepods/tpch/sf1/federated/databricks[delta_lake].yaml

* fix: Update the tpcds benchmark snapshots for: ./test/spicepods/tpcds/sf1/federated/s3[parquet].yaml

* fix: Update the tpcds benchmark snapshots for: ./test/spicepods/tpcds/sf1/federated/abfs[parquet].yaml

* fix: Update the tpcds benchmark snapshots for: ./test/spicepods/tpcds/sf1/federated/dremio.yaml

* fix: Update the tpcds benchmark snapshots for: ./test/spicepods/tpcds/sf1/accelerated/file[parquet]-duckdb[file].yaml

* fix: Update the tpcds benchmark snapshots for: ./test/spicepods/tpcds/sf1/accelerated/file[parquet]-sqlite[memory].yaml

* fix: Update the tpcds benchmark snapshots for: ./test/spicepods/tpcds/sf1/accelerated/file[parquet]-duckdb[memory].yaml

* fix: Update the clickbench benchmark snapshots for: ./test/spicepods/clickbench/sf1/accelerated/s3[parquet]-sqlite[memory].yaml

* Update test_framework__snapshot__databricks[delta_lake]-federated_tpch_q10_explain.snap

---------

Co-authored-by: Spice Benchmark Snapshot Update Bot <spiceaibot@spice.ai>
Co-authored-by: Sevenannn <qianqliu@uw.edu>
Co-authored-by: peasee <98815791+peasee@users.noreply.github.com>
…res. (#5528)

* Use mimalloc by default

* REVERT: debug

* Use Jemalloc

* fix

* system alloc

* snmalloc

* debug

* Support overriding the snmalloc memory allocator that Spice uses by default

* Improve spiced_docker pipeline to build memory allocator flavors

* Generate spiced_docker.yml from YAML anchor version.

* Also update spiced_docker_nightly.yml

* Fix alloc-system

* Fix default build

* Add test spicepod for tpch mysql-duckdb[file acceleration]

* fix test spicepod name

* fix: Update the tpch benchmark snapshots for: ./test/spicepods/tpch/sf1/accelerated/mysql-duckdb[file].yaml

* trigger ci to run

---------

Co-authored-by: Sevenannn <qianqliu@uw.edu>
Co-authored-by: Spice Benchmark Snapshot Update Bot <spiceaibot@spice.ai>
@pull pull Bot added the ⤵️ pull label Apr 24, 2025
@pull pull Bot merged commit e954475 into TheRakeshPurohit:trunk Apr 24, 2025
pull Bot pushed a commit that referenced this pull request Apr 21, 2026
spiceai#10365)

* Harden /v1/tools and /v1/nsql against unauthenticated / LLM-driven SQL

Addresses threat model items #50 and #51 (docs/threat_models/v2.0.0.md):

- Add strict read-only SQL validator (validate_sql_query_read_only) that
  rejects every DDL/DML/COPY/non-prepared Statement node regardless of
  per-catalog writability.
- Plumb a read_only flag through QueryBuilder/Query and apply the
  validator at all three plan execution sites (local, Ballista, async).
- Default the built-in `sql` tool to read-only; operators may opt in via
  SqlTool::allow_writes(). LLM tool-use can no longer mutate data through
  the sql tool.
- Run LLM-generated SQL from /v1/nsql under the read-only validator so
  prompt-injection-driven writes cannot reach writable catalogs.
- Gate /v1/tools/* behind a require_auth_configured middleware: when
  runtime.auth is not set, these routes return 401 rather than invoking
  tool.call anonymously with attacker-controlled bodies.
- Record the new mitigations in the v2.0.0 threat model.
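
A minimal sketch of the read-only gate described above, assuming a simplified stand-in for the query plan. The `Plan` enum, `ReadOnlyViolation`, and `execute_sketch` are hypothetical names for illustration; the actual validator operates on DataFusion `LogicalPlan` nodes and its signature is not shown in this commit message. The sketch also assumes every Statement node is rejected.

```rust
// Hypothetical, simplified plan shape. The real validator walks
// DataFusion's LogicalPlan; only the node categories named in the
// commit message are modeled here.
#[derive(Debug)]
enum Plan {
    Scan(String),
    Filter(Box<Plan>),
    Dml(String, Box<Plan>),
    Ddl(String),
    Copy(Box<Plan>),
    Statement(String),
}

#[derive(Debug, PartialEq)]
struct ReadOnlyViolation(&'static str);

// Rejects write-capable nodes regardless of whether the target catalog
// is writable. Assumption: all Statement nodes are denied.
fn validate_read_only_sketch(plan: &Plan) -> Result<(), ReadOnlyViolation> {
    match plan {
        Plan::Scan(_) => Ok(()),
        Plan::Filter(input) => validate_read_only_sketch(input),
        Plan::Dml(..) => Err(ReadOnlyViolation("DML")),
        Plan::Ddl(_) => Err(ReadOnlyViolation("DDL")),
        Plan::Copy(_) => Err(ReadOnlyViolation("COPY")),
        Plan::Statement(_) => Err(ReadOnlyViolation("Statement")),
    }
}

// Stand-in for one plan execution site: the validator runs only when
// the read_only flag was plumbed through from the caller.
fn execute_sketch(plan: &Plan, read_only: bool) -> Result<&'static str, ReadOnlyViolation> {
    if read_only {
        validate_read_only_sketch(plan)?;
    }
    Ok("executed")
}
```

In this sketch the same check would be repeated at each execution site (local, Ballista, async), so a plan cannot skip validation by taking a different path.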

* refactor: clarify read-only SQL validation comments and enhance documentation for DDL/DML restrictions

* Refactor authentication error response to use JSON format and add SQL tool descriptions for read-only and writable modes
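
A rough sketch of the auth gate with a JSON error body, under stated assumptions: `RuntimeConfigSketch`, `HttpErrorSketch`, and the error message text are hypothetical, and the real middleware is wired into the HTTP router rather than called as a plain function.

```rust
// Hypothetical config: `auth` is None when runtime.auth is not set.
struct RuntimeConfigSketch {
    auth: Option<String>,
}

// Hypothetical response type carrying a status code and a JSON body.
#[derive(Debug, PartialEq)]
struct HttpErrorSketch {
    status: u16,
    body: String,
}

// Returns 401 before any tool is invoked when no auth is configured.
// The JSON body text is an illustrative assumption, not the actual
// response emitted by Spice.
fn require_auth_configured_sketch(cfg: &RuntimeConfigSketch) -> Result<(), HttpErrorSketch> {
    if cfg.auth.is_none() {
        return Err(HttpErrorSketch {
            status: 401,
            body: r#"{"error":"authentication is not configured"}"#.to_string(),
        });
    }
    Ok(())
}
```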

* Fix collapsible_if clippy lint in read-only validation path

* Reject write-capable extension nodes in read-only validator

Spice's planner can represent DDL/DML as LogicalPlan::Extension nodes
(DdlExtensionNode, DmlExtensionNode, DistributedCayenne{Insert,Update,
Delete,Merge}Node, CayenneMergeNode). The previous read-only validator
only matched Ddl/Dml/Copy/Statement and would have let those plan shapes
through, defeating the read-only guarantee on /v1/tools/sql and /v1/nsql.

- Add Extension arm to validate_sql_query_read_only that denies any node
  whose UserDefinedLogicalNodeCore::name matches a curated list of
  write-capable extension names.
- Test the deny mechanism with a stub UserDefinedLogicalNode and verify
  a non-write extension name is still allowed.
- Add an integration test that exercises Spice's create_logical_plan
  wrapper end-to-end (cfg(not(windows))).
- Reflect the PREPARE/EXECUTE/DEALLOCATE rejection in the SqlTool
  read-only description so LLM/tool-selection logic knows the posture.
- Replace the PR-contextual 'Unverified in this review' phrasing in the
  threat model with the durable 'Unverified mitigation'.
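
The name-based deny mechanism can be sketched as follows. This is an illustration only: `ExtensionNodeSketch` stands in for DataFusion's `UserDefinedLogicalNodeCore`, and the strings in the list are derived from the node type names in this message, so the names the real nodes report may differ.

```rust
// Illustrative deny list. Assumption: each node reports its type name;
// the curated list in the Spice source is the authority.
const WRITE_CAPABLE_EXTENSION_NAMES_SKETCH: &[&str] = &[
    "DdlExtensionNode",
    "DmlExtensionNode",
    "DistributedCayenneInsertNode",
    "DistributedCayenneUpdateNode",
    "DistributedCayenneDeleteNode",
    "DistributedCayenneMergeNode",
    "CayenneMergeNode",
];

// Hypothetical stand-in for the trait that exposes an extension node's name.
trait ExtensionNodeSketch {
    fn name(&self) -> &str;
}

// Stub node, similar in spirit to the test stub mentioned above.
struct StubNode(&'static str);

impl ExtensionNodeSketch for StubNode {
    fn name(&self) -> &str {
        self.0
    }
}

// Denies any extension node whose name is on the write-capable list;
// every other extension node is allowed through.
fn validate_extension_read_only_sketch(node: &dyn ExtensionNodeSketch) -> Result<(), String> {
    let name = node.name();
    if WRITE_CAPABLE_EXTENSION_NAMES_SKETCH.iter().any(|n| *n == name) {
        return Err(format!("extension node '{name}' is not allowed in read-only mode"));
    }
    Ok(())
}
```

Matching on names keeps the validator decoupled from the crates that define each node type, at the cost of having to keep the list in sync when a new write-capable node is added.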

* Bypass SQL results cache for read-only query paths

When ctx.read_only is set (e.g. the /v1/tools/sql read-only tool and the
/v1/nsql LLM SQL path), both the SQL-keyed and plan-keyed results-cache
lookups are now skipped inside get_plan_or_cached, and the returned
RequestCacheManager is forced to CacheDisabled. Previously, a cache hit
from a prior writable execution could short-circuit
validate_sql_query_read_only, letting a cached result produced by a
write-capable plan (e.g. LogicalPlan::Extension nodes like DmlExtension
or DistributedCayenneInsert) be served on a read-only surface.

Also move WRITE_CAPABLE_EXTENSION_NAMES into the cache crate as the
single source of truth, and extend cache_is_enabled_for_plan to reject
write-capable LogicalPlan::Extension nodes. Defense-in-depth: even on
writable paths, write-capable extension plans are now never cached or
populated in the results cache.
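
A simplified sketch of the cache bypass, assuming a single SQL-keyed map in place of the real results cache. `QueryContextSketch`, `ResultsCacheSketch`, `CacheModeSketch`, and `lookup_sketch` are hypothetical; the actual logic lives in get_plan_or_cached and returns a RequestCacheManager.

```rust
use std::collections::HashMap;

// Hypothetical stand-in for the cache manager state returned to the caller.
#[derive(Debug, PartialEq, Clone, Copy)]
enum CacheModeSketch {
    Enabled,
    CacheDisabled,
}

// Hypothetical query context carrying the read_only flag.
struct QueryContextSketch {
    read_only: bool,
}

// Hypothetical SQL-keyed results cache.
struct ResultsCacheSketch {
    by_sql: HashMap<String, String>,
}

// On a read-only context the lookup is skipped entirely and caching is
// disabled, so a result cached by an earlier writable execution can
// never be served without passing through validation.
fn lookup_sketch(
    cache: &ResultsCacheSketch,
    ctx: &QueryContextSketch,
    sql: &str,
) -> (Option<String>, CacheModeSketch) {
    if ctx.read_only {
        return (None, CacheModeSketch::CacheDisabled);
    }
    (cache.by_sql.get(sql).cloned(), CacheModeSketch::Enabled)
}
```

A cache miss on the read-only path means the query always proceeds to planning, which is where the read-only validator runs.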

* fix: flatten write-capable extension check to match guard in cache eligibility

Removes one level of nesting as requested in review.

---------

Co-authored-by: Viktor Yershov <viktor@spice.ai>