Harden /v1/tools and /v1/nsql against unauthenticated / LLM-driven SQL (spiceai#10365)
* Harden /v1/tools and /v1/nsql against unauthenticated / LLM-driven SQL
Addresses threat model items #50 and #51 (docs/threat_models/v2.0.0.md):
- Add strict read-only SQL validator (validate_sql_query_read_only) that
rejects every DDL/DML/COPY/non-prepared Statement node regardless of
per-catalog writability.
- Plumb a read_only flag through QueryBuilder/Query and apply the
validator at all three plan execution sites (local, Ballista, async).
- Default the built-in `sql` tool to read-only; operators may opt in via
SqlTool::allow_writes(). LLM tool-use can no longer mutate data through
the sql tool.
- Run LLM-generated SQL from /v1/nsql under the read-only validator so
prompt-injection-driven writes cannot reach writable catalogs.
- Gate /v1/tools/* behind a require_auth_configured middleware: when
runtime.auth is not set, these routes return 401 rather than invoking
tool.call anonymously with attacker-controlled bodies.
- Record the new mitigations in the v2.0.0 threat model.
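The auth gate described above can be sketched roughly as follows. This is a minimal, std-only illustration, not Spice's actual middleware: the `Response` type, the function signature, and the JSON error text are hypothetical stand-ins, and `auth_configured` models whether runtime.auth is set.

```rust
/// Hypothetical stand-in for an HTTP response; not a Spice or framework type.
#[derive(Debug, PartialEq)]
struct Response {
    status: u16,
    body: String,
}

/// Sketch of a gate applied to the /v1/tools/* routes only.
/// `auth_configured` models "runtime.auth is set"; `next` models the
/// downstream handler that would end up invoking the tool.
fn require_auth_configured(
    auth_configured: bool,
    next: impl FnOnce() -> Response,
) -> Response {
    if !auth_configured {
        // Refuse before the handler runs, so the tool is never invoked
        // anonymously. The JSON body text is a placeholder.
        return Response {
            status: 401,
            body: r#"{"error":"authentication is not configured"}"#.to_string(),
        };
    }
    next()
}
```

Calling `require_auth_configured(false, handler)` returns the 401 response without ever running `handler`, while `require_auth_configured(true, handler)` returns whatever `handler` produces.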
* refactor: clarify read-only SQL validation comments and enhance documentation for DDL/DML restrictions
* Refactor authentication error response to use JSON format and add SQL tool descriptions for read-only and writable modes
* Fix collapsible_if clippy lint in read-only validation path
* Reject write-capable extension nodes in read-only validator
Spice's planner can represent DDL/DML as LogicalPlan::Extension nodes
(DdlExtensionNode, DmlExtensionNode, DistributedCayenne{Insert,Update,
Delete,Merge}Node, CayenneMergeNode). The previous read-only validator
only matched Ddl/Dml/Copy/Statement and would have let those plan shapes
through, defeating the read-only guarantee on /v1/tools/sql and /v1/nsql.
- Add Extension arm to validate_sql_query_read_only that denies any node
whose UserDefinedLogicalNodeCore::name matches a curated list of
write-capable extension names.
- Test the deny mechanism with a stub UserDefinedLogicalNode and verify
a non-write extension name is still allowed.
- Add an integration test that exercises Spice's create_logical_plan
wrapper end-to-end (cfg(not(windows))).
- Reflect the PREPARE/EXECUTE/DEALLOCATE rejection in the SqlTool
read-only description so LLM/tool-selection logic knows the posture.
- Replace the PR-contextual 'Unverified in this review' phrasing in the
threat model with the durable 'Unverified mitigation'.
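To illustrate the deny mechanism, here is a rough std-only sketch over a simplified plan tree. The `Plan` enum and `validate_read_only_sketch` are hypothetical; the real validator walks DataFusion's LogicalPlan, and the strings in the list below are placeholders taken from the node type names in this message, which may differ from what UserDefinedLogicalNodeCore::name actually returns.

```rust
/// Hypothetical, simplified plan tree standing in for LogicalPlan.
#[derive(Debug)]
enum Plan {
    Scan,
    Filter { input: Box<Plan> },
    Ddl,
    Dml { input: Box<Plan> },
    Copy { input: Box<Plan> },
    Statement,
    Extension { name: String, inputs: Vec<Plan> },
}

/// Placeholder names; the curated list in the codebase is authoritative.
const WRITE_CAPABLE_EXTENSION_NAMES: &[&str] = &[
    "DdlExtensionNode",
    "DmlExtensionNode",
    "DistributedCayenneInsertNode",
    "DistributedCayenneUpdateNode",
    "DistributedCayenneDeleteNode",
    "DistributedCayenneMergeNode",
    "CayenneMergeNode",
];

/// Rejects any write-capable node anywhere in the tree.
fn validate_read_only_sketch(plan: &Plan) -> Result<(), String> {
    match plan {
        Plan::Scan => Ok(()),
        Plan::Filter { input } => validate_read_only_sketch(input),
        // Built-in write or statement nodes are rejected outright.
        Plan::Ddl => Err("DDL is not allowed in read-only mode".to_string()),
        Plan::Dml { .. } => Err("DML is not allowed in read-only mode".to_string()),
        Plan::Copy { .. } => Err("COPY is not allowed in read-only mode".to_string()),
        Plan::Statement => Err("statements are not allowed in read-only mode".to_string()),
        // Extension nodes are denied by name, otherwise their inputs are checked.
        Plan::Extension { name, inputs } => {
            if WRITE_CAPABLE_EXTENSION_NAMES.contains(&name.as_str()) {
                return Err(format!("extension node `{name}` can write"));
            }
            inputs.iter().try_for_each(validate_read_only_sketch)
        }
    }
}
```

In this sketch, an extension whose name is not on the list is still allowed as long as none of its inputs is write-capable, which mirrors the two test cases described above.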
* Bypass SQL results cache for read-only query paths
When ctx.read_only is set (e.g. the /v1/tools/sql read-only tool and the
/v1/nsql LLM SQL path), both the SQL-keyed and plan-keyed results-cache
lookups are now skipped inside get_plan_or_cached, and the returned
RequestCacheManager is forced to CacheDisabled. Previously, a cache hit
from a prior writable execution could short-circuit
validate_sql_query_read_only, letting a cached result produced by a
write-capable plan (e.g. LogicalPlan::Extension nodes like DmlExtension
or DistributedCayenneInsert) be served on a read-only surface.
Also move WRITE_CAPABLE_EXTENSION_NAMES into the cache crate as the
single source of truth, and extend cache_is_enabled_for_plan to reject
write-capable LogicalPlan::Extension nodes. Defense-in-depth: even on
writable paths, results of write-capable extension plans are now never
read from or written to the results cache.
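The cache bypass can be pictured with the following std-only sketch. Everything here is a hypothetical model: `Ctx`, `CacheMode`, `Outcome`, and `get_plan_or_cached_sketch` only mirror the names in this message, and a single `HashMap` stands in for both the SQL-keyed and plan-keyed results caches.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for the returned RequestCacheManager state.
#[derive(Debug, PartialEq)]
enum CacheMode {
    Enabled,
    CacheDisabled,
}

/// Either a cached result or a plan that still has to be validated and run.
#[derive(Debug, PartialEq)]
enum Outcome {
    Cached(String),
    Plan(String),
}

/// Hypothetical request context carrying the read-only flag.
struct Ctx {
    read_only: bool,
}

fn get_plan_or_cached_sketch(
    ctx: &Ctx,
    sql: &str,
    cache: &HashMap<String, String>,
) -> (Outcome, CacheMode) {
    if ctx.read_only {
        // Skip every cache lookup so the plan always reaches the read-only
        // validator, and disable cache population for this request.
        return (Outcome::Plan(sql.to_string()), CacheMode::CacheDisabled);
    }
    match cache.get(sql) {
        Some(hit) => (Outcome::Cached(hit.clone()), CacheMode::Enabled),
        None => (Outcome::Plan(sql.to_string()), CacheMode::Enabled),
    }
}
```

With a cache entry already present for some SQL text, a writable context gets the cached result back, while a read-only context gets a fresh plan and `CacheDisabled`, so a cache hit can no longer short-circuit validation.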
* fix: flatten write-capable extension check to match guard in cache eligibility
Removes one level of nesting as requested in review.
---------
Co-authored-by: Viktor Yershov <viktor@spice.ai>