Commit 4cf789d

feat: per-principal cache namespacing for SQL/search/caching-accelerator (spiceai#10680) (spiceai#10702)
* feat(runtime-request-context): add CacheNamespace + AuthPrincipal::stable_id (milestone 1 of spiceai#10680)

Introduce the CacheNamespace enum (Public / Principal(id) / System) on RequestContext as the foundation for per-user scoped caching under U2M auth.

- New CacheNamespace type with kind() (telemetry, low-cardinality) and as_header_value() (HTTP-vocabulary 'shared'/'user'/'system').
- AuthPrincipal gains stable_id() -> Option<Cow<str>> with default None. Real principals override; the default returns None, which maps to the Public namespace (correct behavior for the existing Anonymous principal).
- ApiKey::stable_id() returns 'apikey:<sha256(key)[..16]>', i.e. the first 16 bytes of the digest rendered as 32 hex characters. The literal key value is never exposed; rotated keys produce a new id, which correctly treats them as a different principal.
- RequestContext::cache_namespace() computes the namespace lazily from protocol + auth_principal, with an optional builder override (with_cache_namespace) for SWR background revalidation and similar flows that must inherit the originating user's namespace.
- to_dimensions() now emits cache_namespace_kind (low-cardinality only; never includes the principal id).

No call sites consume the namespace yet; this is purely additive infrastructure. Subsequent milestones wire it into the SQL results cache, search cache, response headers, caching accelerator, and cluster RPC propagation.

Refs: spiceai#10680

* feat: scope SQL results + search caches by CacheNamespace; add Results-Cache-Scope header (spiceai#10680)

Mix the request's CacheNamespace into every cache key for the SQL results cache, the plan cache (via the SQL-keyed plans cache key), and the search cache. Two requests in different namespaces now hash to distinct cache entries for otherwise-identical inputs, so cached results cannot leak between authenticated principals.

cache::key::CacheKey
- New as_raw_key_in_namespace(hasher, tag, id_bytes) folds a [tag][id_len LE u64][id_bytes] prefix into the hasher before the existing payload bytes, so (tag=1, id='abc') with a payload starting 'def' cannot collide with (tag=1, id='abcdef') followed by the remainder of that payload.
- Existing as_raw_key() is unchanged and is used by the embeddings cache, which is intentionally globally shared because (model, input) → embedding is permission-independent.
- hash_payload() is extracted so both methods walk identical bytes for the same payload.

runtime-request-context::CacheNamespace
- hash_inputs() returns (u8 tag, &[u8] id) for use with as_raw_key_in_namespace. Tags are stable across releases (0=Public, 1=Principal, 2=System) so cached entries survive upgrades.

runtime/src/datafusion/query/cache.rs
- get_plan_or_cached and try_get_cached_result derive the namespace from the request context and pass its hash inputs through to every key materialization (SQL key, client-supplied key, plan key). The plans cache (`get_or_create_logical_plan`) is fed namespace-mixed keys via sql_raw_cache_key, so plan caching is also per-namespace.

runtime/src/search/search_engine.rs
- search_with_cache mixes the namespace into the search cache key.

HTTP response headers (runtime/src/http/v1/{mod,search}.rs)
- New Results-Cache-Scope and Search-Results-Cache-Scope sibling headers next to the existing *-Cache-Status, with values 'shared' / 'user' / 'system'. No principal id ever appears in the header, so it is safe to log and dashboard.
- When scope == 'user', append 'Authorization' to Vary so any HTTP cache between Spice and the client never collapses entries belonging to different principals.
- Added a small append_vary helper that combines fields rather than overwriting (RFC 7231 §7.1.4), and switched the existing Spice-Cache-Key Vary to use it.

Tests
- runtime-request-context: hash_inputs_distinct_per_variant asserts tag mapping and that empty-id Public/System never collide.
- runtime: test_results_cache_isolated_per_principal drives two distinct Principal namespaces through the full Query pipeline and asserts MISS on the second principal's first request, HIT on each principal's repeat, and MISS for an unauthenticated Public caller. This is the explicit cross-principal isolation guarantee.

Refs: spiceai#10680

* feat(runtime): SWR background revalidation inherits originating cache namespace (spiceai#10680)

trigger_background_query_revalidation now plumbs the originating request's CacheNamespace into create_background_context, which sets it as an explicit override on the new internal RequestContext. The background query therefore re-executes under the user's namespace, not under System.

Why it matters: any cache lookup the background query performs while re-running (planner cache, search cache, the upcoming caching accelerator) must use the same namespace as the originating request. Without this propagation, sub-cache reads/writes during SWR refresh would land in a different scope and either leak cross-user (caching accelerator) or never serve the user who triggered the refresh. The result write itself is unaffected because cache_revalidation_result already writes under the original (already-namespace-mixed) cache key.

New test test_swr_revalidation_inherits_originating_namespace exercises the full SWR lifecycle for two distinct principals: Alice populates, waits past TTL, sees STALE, the background refresh runs, Alice then sees a fresh HIT, and Bob, issuing the same SQL, still sees a MISS, proving the refresh did not bleed into another scope.

Refs: spiceai#10680

* feat(runtime): reserve __spice_cache_namespace column for caching accelerator (m4 foundation, spiceai#10680)

Introduces the foundation for per-namespace isolation inside the caching accelerator (refresh_mode: caching).
No behavior change: nothing yet writes, reads, or stamps the column; this commit only defines the names, validation, and helpers later commits will use.

- CACHE_NAMESPACE_COLUMN = "__spice_cache_namespace". The leading double-underscore matches the existing internal-name convention used elsewhere in Spice and makes accidental collision with real user columns very unlikely.
- is_reserved_caching_column() for use by dataset config validation (case-insensitive so SQL-style identifiers also collide).
- extend_schema_with_cache_namespace() returns the storage schema used by the caching accelerator: the source schema with one appended Utf8 NOT NULL column. NOT NULL is deliberate: a NULL on read indicates a corrupt or pre-isolation table and we want that to surface as an error rather than silently behaving as 'public'. Returns DataFusionError::Plan with both the dataset name and the reserved column name on collision, so users hitting the breaking change get a self-explanatory message instead of a schema mismatch deep in the accelerator.

Tests cover case-insensitive reservation, append-and-shape of the extended schema, and the collision error message contents.

Refs: spiceai#10680

* feat(runtime): extend caching accelerator storage schema with namespace column; hide from user-facing schema (m4b, spiceai#10680)

Builds on the m4 foundation. In caching mode (refresh_mode: caching), the underlying accelerator table is now created with one extra column appended to the source schema: __spice_cache_namespace (Utf8 NOT NULL). The user-facing AcceleratedTable schema continues to expose only the original columns, so query planning, federation, and user projections are unaffected.

This is the wiring step: nothing yet stamps or filters by the column; later commits add per-namespace read filtering and write stamping.
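
The reservation and schema-extension rules described for m4/m4b can be sketched without Arrow. This is a minimal illustration, not the commit's implementation: `extend_columns` is a hypothetical stand-in for `extend_schema_with_cache_namespace` that works on plain column names, and the error text is invented for the example.

```rust
/// Reserved storage column name, as given in the commit message.
const CACHE_NAMESPACE_COLUMN: &str = "__spice_cache_namespace";

/// Case-insensitive check, so `__SPICE_CACHE_NAMESPACE` is rejected too.
fn is_reserved_caching_column(name: &str) -> bool {
    name.eq_ignore_ascii_case(CACHE_NAMESPACE_COLUMN)
}

/// Hypothetical simplification: takes the source column names and returns the
/// storage column names (source columns plus the appended namespace column).
/// Fails if the source already declares the reserved name.
fn extend_columns(dataset: &str, columns: &[&str]) -> Result<Vec<String>, String> {
    if let Some(clash) = columns.iter().find(|c| is_reserved_caching_column(c)) {
        // Name both the dataset and the reserved column so the failure is
        // self-explanatory (wording here is illustrative only).
        return Err(format!(
            "dataset '{}' declares column '{}', which collides with reserved column '{}'",
            dataset, clash, CACHE_NAMESPACE_COLUMN
        ));
    }
    let mut storage: Vec<String> = columns.iter().map(|c| (*c).to_string()).collect();
    // Appending keeps every user-facing column index valid in storage.
    storage.push(CACHE_NAMESPACE_COLUMN.to_string());
    Ok(storage)
}
```

Because the reserved column is appended rather than inserted, a projection expressed as indices into the user-facing columns can be forwarded to storage unchanged, which is the property the commit relies on.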

datafusion/mod.rs::make_accelerated_table:
- When refresh_mode == Caching, derive storage_schema from refresh_schema via extend_schema_with_cache_namespace and pass it to create_accelerator_table. The schema-extension helper returns a self-explanatory error if the source schema already declares the reserved column name.
- Tell AcceleratedTableBuilder to expose refresh_schema (the original) as the user-facing schema via the new user_facing_schema setter.

accelerated_table/mod.rs:
- AcceleratedTable gains user_facing_schema: Option<SchemaRef>. schema() returns it when set, otherwise the storage schema (no behavior change for non-caching modes).
- Builder::user_facing_schema(schema) setter; defaulted to None.

Schema flow correctness:
- Storage = user_facing + 1 appended column. All user-facing column indices remain valid in storage, so projections forward unchanged.
- extend_projection_for_caching looks up CACHE_REFRESHED_AT_COLUMN by name, so its returned indices are valid in either schema.
- SchemaCastScanExec at the top of the scan plan uses the user-facing target schema and projects by column name via try_cast_to, so the namespace column is naturally stripped from results.

Breaking change: pre-existing caching accelerator storage from earlier Spice versions does not have __spice_cache_namespace and will fail to open with a schema-mismatch error from the underlying engine. Users must drop the on-disk store (e.g. delete duckdb_file path or DROP TABLE on SQLite/Postgres/Cayenne) and restart so Spice recreates the table.

All 183 accelerated_table unit tests pass; clippy clean.

Refs: spiceai#10680

* feat(runtime): namespace-scope caching accelerator + drop redundant read-only cache bypass (m4cd, spiceai#10680)

This commit lands two changes that turn out to be inseparable in practice.
M4cd makes the caching accelerator per-principal isolated; once that holds, the read-only short-circuit in the SQL results cache becomes both unnecessary and actively harmful (it forced read-only API keys to skip the cache they could safely reuse).

------------------------------------------------------------
Part 1: caching-accelerator namespace scoping (m4cd)
------------------------------------------------------------

Builds on m4b (storage column) by actually using the column for isolation. Combined with m4b, this is the safety-meaningful change: two principals running the same caching-accelerated query against the same dataset are now backed by disjoint cached rows, both at read time and write time, including SWR background refresh.

CacheNamespace::storage_id (runtime-request-context):
- New stable-string accessor that maps each variant to its on-disk tag: "public", "system", or the principal's opaque id verbatim. These values are persisted in __spice_cache_namespace and must survive upgrades.

caching.rs (runtime/accelerated_table):
- stamp_namespace_column(batch, ns_id): appends a constant-string Utf8 NOT NULL column populated with ns_id for every row. Idempotent if the column is already present.
- namespace_filter_expr(ns_id): builds the equality predicate pushed onto every accelerator scan in caching mode.
- CacheWriteRequest gains namespace_id: Arc<str>. Send-sites (handle_cache_miss, handle_cache_hit's SWR refresh, refresh_entry) record the originating namespace; the flush task is the one that stamps the column and augments upsert filters.
- flush_cache_writes inspects the accelerator's actual schema. If __spice_cache_namespace is present (real deployments per m4b), every batch is stamped and every upsert filter set gets a namespace equality predicate appended. If not (test mocks), behavior is unchanged so caching unit tests can keep their minimal schemas.
- handle_cache_miss / handle_cache_hit / refresh_entry now take a CacheNamespace parameter, plumbed from a single capture at the top of CachingAccelerationScanExec::execute.

AcceleratedTable::scan (caching mode):
- Augments the filters passed to accelerator.scan with __spice_cache_namespace = $current_ns. The federated source still receives only the user's original filters (it does not have the column).
- Strict re-application via FilterExec on top of the accelerator scan output. Filters passed to TableProvider::scan are an *optimization hint* in DataFusion; an accelerator that returns Inexact / Unsupported (or even one that returns Exact and silently does not push down) would let rows from other namespaces leak through. End-to-end testing with DuckDB confirmed this happens in practice without strict re-application.
- extend_projection_for_caching now extends the storage projection to include __spice_cache_namespace whenever the accelerator schema has the column, so the FilterExec on top can resolve it even when the user's SELECT does not name it. SchemaCastScanExec strips both fetched_at and __spice_cache_namespace from the user-facing output by name.

Request-context propagation (the subtle bit):
- DataFusion does not propagate Tokio task-locals across the TableProvider::scan await point or into ExecutionPlan::execute on worker threads. Reading the cache namespace via RequestContext::current() in those contexts silently falls back to the global INTERNAL_REQUEST_CONTEXT (Protocol::Internal, no principal), collapsing every caller to CacheNamespace::System and defeating isolation. End-to-end testing caught this.
- Both AcceleratedTable::scan and CachingAccelerationScanExec::execute now read the request context from the SessionConfig / TaskContext extension where Query::run_internal attaches it, matching the pattern used by BytesProcessedExec.
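
The write-side stamping and read-side strict re-application described above can be illustrated with plain structs. This is a sketch under stated assumptions: `StoredRow`, `stamp_namespace`, `scan_ignoring_pushdown`, and `scan_for_namespace` are hypothetical names, and the "accelerator" is a stand-in that deliberately ignores its filter to model the hint-only pushdown semantics; none of it uses DataFusion's actual API.

```rust
/// One stored row: the hidden namespace column plus a user-visible body.
#[derive(Debug, Clone, PartialEq)]
struct StoredRow {
    namespace: String,
    body: String,
}

/// Write path: stamp every row of a batch with the writer's namespace id.
fn stamp_namespace(bodies: &[&str], ns_id: &str) -> Vec<StoredRow> {
    bodies
        .iter()
        .map(|b| StoredRow {
            namespace: ns_id.to_string(),
            body: (*b).to_string(),
        })
        .collect()
}

/// Stand-in for an accelerator that treats the pushed-down filter as a hint
/// and returns every row regardless of it.
fn scan_ignoring_pushdown(storage: &[StoredRow], _ns_filter: &str) -> Vec<StoredRow> {
    storage.to_vec()
}

/// Read path: pass the filter down, then re-apply it strictly on the scan
/// output and drop the hidden column before returning rows to the caller.
fn scan_for_namespace(storage: &[StoredRow], ns_id: &str) -> Vec<String> {
    scan_ignoring_pushdown(storage, ns_id)
        .into_iter()
        .filter(|row| row.namespace == ns_id)
        .map(|row| row.body)
        .collect()
}
```

With two principals' rows in the same storage, the raw scan returns both rows while `scan_for_namespace` returns only the caller's, which is the leak the FilterExec re-application is meant to close.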

Verified end-to-end with two API keys against an HTTP source dataset (refresh_mode: caching, engine: duckdb file): each principal triggers exactly one upstream fetch for a cold scan, repeats serve from their own cached row, and the other principal's repeat does not inherit. Inspecting the underlying DuckDB shows two rows stamped with two distinct apikey:<sha> namespaces.

------------------------------------------------------------
Part 2: drop the read-only short-circuit in SQL results cache
------------------------------------------------------------

Previously get_plan_or_cached treated read_only=true as a hard "skip cache lookup AND skip cache populate" signal. That was a workaround for two hazards both now handled at the right layer:

1. Cross-principal leakage. A write-capable caller's cached output could in theory be served back to a read-only caller on a later identical query. With cache keys namespaced by CacheNamespace (this PR), cross-principal hits are structurally impossible: a read-only caller can only ever see entries it (or another caller in the same namespace) populated.
2. Cache-served write-capable plans bypassing read-only validation. QueryResultsCacheProvider::cache_is_enabled_for_plan refuses to cache DDL/DML/Copy/Statement and every LogicalPlan::Extension whose name appears in WRITE_CAPABLE_EXTENSION_NAMES (currently DdlExtension and DmlExtension). The cache itself therefore never stores a write-capable plan; nothing for read-only validation to bypass on the read side. The old doc comment cited DistributedCayenneInsert as a gap, but that name is a *physical* exec node; its logical form is DmlExtension, already covered.

Removing the short-circuit means read-only API keys (the default when no :rw suffix is given) now get cache hits for repeated queries, the same as RW keys, while remaining isolated from other principals.

Verified manually with two bare API keys against /v1/sql:

    alice repeat    -> HIT
    bob (same SQL)  -> MISS (not alice's entry)
    bob repeat      -> HIT
    alice again     -> HIT (still her entry, untouched by bob)

Changes:
- get_plan_or_cached drops the read_only parameter and the three if read_only branches (two lookup skips + one populate skip).
- Doc comment rewritten to state the two invariants the cache now relies on.
- Two callers in datafusion/query.rs updated; the read_only flag is still threaded into QueryBuilder for validate_sql_query_read_only, which is unaffected.

Refs: spiceai#10680

* test(runtime): integration coverage for per-principal caching-accelerator isolation (spiceai#10680)

End-to-end regression gate for the namespace-scoping work in m4cd. Two tokio integration tests sit in tests/acceleration/, gated on the `duckdb` feature, exercising the full pipeline:

    HTTP source (axum mock with hit counter)
      -> caching-mode acceleration (refresh_mode: caching)
      -> DuckDB file accelerator (tempdir)
      -> Query::run_internal (via query_builder)
      -> AcceleratedTable::scan
      -> CachingAccelerationScanExec::execute (worker thread)

The caching accelerator unit tests use minimal mock storage and never go through the planner / TaskContext, so they cannot reach the two silent-failure modes that were caught only by manual end-to-end testing during m4cd:

1. DataFusion treats filters passed to TableProvider::scan as optimization hints; an accelerator that does not push down would leak rows from another principal's namespace.
2. DataFusion does not propagate Tokio task-locals through TableProvider::scan or ExecutionPlan::execute, so reading the namespace via RequestContext::current() collapses every caller to CacheNamespace::System and silently breaks isolation.

Either regression would silently turn the cross-principal scenario into a data leak rather than a test failure.

Verified by injecting each regression in turn before landing this test:
- Removing the FilterExec re-application: passes (DuckDB pushes the predicate down hard), but documents the behavior; the protection remains for accelerators with weaker pushdown.
- Restoring RequestContext::current() in CachingAccelerationScanExec::execute: FAILS as expected ("alice repeat must NOT fetch upstream (saw 2 fetches; expected 1)"), proving the test gates the more dangerous of the two regressions.

Tests added:
- caching_accelerator_isolates_per_principal_e2e
  Walks alice -> alice (cache hit) -> bob (cross-principal cold fetch, NOT alice's row) -> bob (cache hit) -> alice (still her own row, untouched by bob). Asserts upstream fetch counts at every step plus body inequality across principals.
- caching_accelerator_hides_namespace_column_from_user_schema
  Confirms the user-facing schema does not expose __spice_cache_namespace, and that referencing it in SQL fails with a normal "no field" error rather than silently resolving against the hidden storage column.

Implementation notes:
- Uses ApiKey::ReadOnly principals plumbed through RequestContext::set_auth_principal + ctx.scope() so the same query string under different principals produces different CacheNamespace values (apikey:<sha256[..16]>).
- Goes through query_builder().build().run() rather than the raw DataFrame API so Query::run_internal attaches the per-call RequestContext to the SessionConfig. The DataFrame path bypasses this and the scan would silently fall back to System ns, collapsing both principals into one cache scope.
- Sleeps 1s between scans (>= 2x CACHE_WRITE_FLUSH_INTERVAL_MS) so the async batched cache flush is guaranteed to land before the next scan looks for the row.

Refs: spiceai#10680
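
The alice/bob walk that the integration test asserts can be modeled with an in-memory map. This is an illustrative sketch only: `ResultsCache` and its `query` method are hypothetical, and entries are keyed by a `(tag, id, sql)` tuple for readability rather than by the hashed `RawCacheKey` the runtime uses.

```rust
use std::collections::HashMap;

/// Hypothetical results cache keyed by (namespace tag, namespace id, SQL text).
#[derive(Default)]
struct ResultsCache {
    entries: HashMap<(u8, String, String), String>,
}

impl ResultsCache {
    /// Returns "HIT" if this namespace already cached the query; otherwise
    /// stores a placeholder result under the caller's namespace and returns "MISS".
    fn query(&mut self, ns_tag: u8, ns_id: &str, sql: &str) -> &'static str {
        let key = (ns_tag, ns_id.to_string(), sql.to_string());
        if self.entries.contains_key(&key) {
            "HIT"
        } else {
            self.entries.insert(key, format!("result of {sql}"));
            "MISS"
        }
    }
}
```

Driving it in the order alice, alice, bob, bob, alice with the same SQL yields MISS, HIT, MISS, HIT, HIT, and a Public caller (tag 0, empty id) still misses on its first request because no entry exists under its namespace.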
1 parent b6c0662 commit 4cf789d

17 files changed

Lines changed: 1750 additions & 136 deletions

File tree

Cargo.lock

Lines changed: 1 addition & 0 deletions

crates/cache/src/key.rs

Lines changed: 51 additions & 12 deletions
@@ -86,23 +86,25 @@ impl<'a> From<(&'a str, &'a EmbeddingInput)> for CacheKey<'a> {
 }
 
 impl CacheKey<'_> {
-    #[must_use]
-    pub fn as_raw_key<T: Hasher>(&self, mut hasher: T) -> RawCacheKey {
+    /// Hash this key's payload into `hasher`. Used by both [`Self::as_raw_key`]
+    /// and [`Self::as_raw_key_in_namespace`] so the payload byte stream stays
+    /// identical regardless of whether a namespace prefix was mixed in.
+    fn hash_payload<T: Hasher>(&self, hasher: &mut T) {
         match self {
-            Self::LogicalPlan(logical_plan) => logical_plan.hash(&mut hasher),
-            Self::Search(search_key) => search_key.hash(&mut hasher),
-            Self::EmbeddingRequest(embedding_request) => embedding_request.hash(&mut hasher),
+            Self::LogicalPlan(logical_plan) => logical_plan.hash(hasher),
+            Self::Search(search_key) => search_key.hash(hasher),
+            Self::EmbeddingRequest(embedding_request) => embedding_request.hash(hasher),
             Self::EmbeddingInput(model_name, embedding_input) => {
-                model_name.hash(&mut hasher);
-                embedding_input.hash(&mut hasher);
+                model_name.hash(hasher);
+                embedding_input.hash(hasher);
             }
             Self::Query(sql, param_values) => {
-                sql.hash(&mut hasher);
+                sql.hash(hasher);
                 if let Some(params) = param_values {
                     match params {
                         ParamValues::List(vec) => {
                             for item in vec {
-                                item.value().hash(&mut hasher);
+                                item.value().hash(hasher);
                             }
                         }
                         ParamValues::Map(hash_map) => {
@@ -111,15 +113,52 @@ impl CacheKey<'_> {
                             pairs.sort_by(|a, b| a.0.cmp(b.0)); // Sort by keys
 
                             for (key, value) in pairs {
-                                key.hash(&mut hasher);
-                                value.value().hash(&mut hasher);
+                                key.hash(hasher);
+                                value.value().hash(hasher);
                             }
                         }
                     }
                 }
             }
-            Self::ClientSupplied(user_key) => user_key.hash(&mut hasher),
+            Self::ClientSupplied(user_key) => user_key.hash(hasher),
         }
+    }
+
+    /// Compute the raw cache key with no namespace mixed in. Use this for
+    /// surfaces whose result is a pure function of the inputs and is safe
+    /// to share across all callers — most importantly the embeddings cache,
+    /// where `(model, input) -> embedding` is permission-independent.
+    #[must_use]
+    pub fn as_raw_key<T: Hasher>(&self, mut hasher: T) -> RawCacheKey {
+        self.hash_payload(&mut hasher);
+        RawCacheKey(hasher.finish())
+    }
+
+    /// Compute the raw cache key with a namespace prefix folded into the
+    /// hash. Two requests whose `(namespace_tag, namespace_id)` differ
+    /// hash to distinct keys for otherwise-identical payloads, which is
+    /// what makes cache hits safe under per-user authentication.
+    ///
+    /// `namespace_tag` is the discriminant of the namespace kind
+    /// (`0` = public/shared, `1` = principal, `2` = system) and
+    /// `namespace_id` is the principal's stable opaque id (empty for
+    /// public/system).
+    ///
+    /// The byte stream hashed is:
+    /// `[namespace_tag][namespace_id.len() as u64 LE][namespace_id...][payload...]`
+    /// so that `(tag=1, id="abc")` and `(tag=1, id="a")` followed by a
+    /// payload starting with `"bc"` cannot collide.
+    #[must_use]
+    pub fn as_raw_key_in_namespace<T: Hasher>(
+        &self,
+        mut hasher: T,
+        namespace_tag: u8,
+        namespace_id: &[u8],
+    ) -> RawCacheKey {
+        hasher.write_u8(namespace_tag);
+        hasher.write_u64(namespace_id.len() as u64);
+        hasher.write(namespace_id);
+        self.hash_payload(&mut hasher);
         RawCacheKey(hasher.finish())
     }
 }
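
The length-prefixed framing documented on `as_raw_key_in_namespace` can be demonstrated in isolation. The following is a minimal sketch, not the crate's code: `framed_bytes`, `unframed_bytes`, and `namespaced_key` are hypothetical helpers, the payload is a raw byte slice instead of a `CacheKey`, and the length is written with `to_le_bytes` so the little-endian layout is explicit (the standard `Hasher::write_u64` default feeds native-endian bytes).

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::Hasher;

/// Byte stream with the namespace prefix: [tag][id_len as u64 LE][id][payload].
fn framed_bytes(tag: u8, id: &[u8], payload: &[u8]) -> Vec<u8> {
    let mut out = Vec::with_capacity(1 + 8 + id.len() + payload.len());
    out.push(tag);
    out.extend_from_slice(&(id.len() as u64).to_le_bytes());
    out.extend_from_slice(id);
    out.extend_from_slice(payload);
    out
}

/// Same stream without the length prefix, to show the ambiguity it removes:
/// the boundary between id and payload is no longer recoverable.
fn unframed_bytes(tag: u8, id: &[u8], payload: &[u8]) -> Vec<u8> {
    let mut out = Vec::with_capacity(1 + id.len() + payload.len());
    out.push(tag);
    out.extend_from_slice(id);
    out.extend_from_slice(payload);
    out
}

/// Hypothetical stand-in for a namespaced raw key: hash of the framed stream.
fn namespaced_key(tag: u8, id: &[u8], payload: &[u8]) -> u64 {
    let mut hasher = DefaultHasher::new();
    hasher.write(&framed_bytes(tag, id, payload));
    hasher.finish()
}
```

Without the prefix, `(1, "abc", "def")` and `(1, "abcdef", "")` produce the same byte stream; with it, the streams differ at the length field, so the two inputs are hashed from distinct bytes.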

crates/runtime-auth/Cargo.toml

Lines changed: 2 additions & 1 deletion
@@ -17,4 +17,5 @@ tracing.workspace = true
 futures.workspace = true
 base64.workspace = true
 pin-project.workspace = true
-http.workspace = true
+http.workspace = true
+sha2.workspace = true

crates/runtime-auth/src/traits.rs

Lines changed: 41 additions & 0 deletions
@@ -14,17 +14,36 @@ See the License for the specific language governing permissions and
 limitations under the License.
 */
 
+use std::borrow::Cow;
 use std::sync::Arc;
 
 use crate::error::Error;
 use app::spicepod::component::runtime::ApiKey;
 use axum::http;
+use sha2::{Digest, Sha256};
 
 pub type AuthPrincipalRef = Arc<dyn AuthPrincipal + Sync + Send>;
 
 pub trait AuthPrincipal {
     fn username(&self) -> &str; // The username as presented during auth
     fn groups(&self) -> &[&str]; // Group memberships
+
+    /// A stable, opaque identifier for this principal, suitable for use as
+    /// a cache namespace key. Returning `None` means the principal is
+    /// effectively anonymous and should share the public cache scope.
+    ///
+    /// The id MUST be:
+    /// - **stable** across credential rotation for the same identity (so
+    ///   rotating an OIDC token does not invalidate the principal's cache);
+    /// - **opaque** — not the bearer token or API key value itself; and
+    /// - prefixed with the auth scheme (e.g. `apikey:`, `u2m:`) so two
+    ///   schemes that happen to mint the same id never collide.
+    ///
+    /// The default implementation returns `None`, which is the correct
+    /// behavior for anonymous principals. Real principals must override.
+    fn stable_id(&self) -> Option<Cow<'_, str>> {
+        None
+    }
 }
 pub trait AuthRequestContext {
     /// Sets the current authentication principal for the request context.
@@ -59,8 +78,30 @@ impl AuthPrincipal for ApiKey {
             ApiKey::ReadWrite { .. } => &["read_write"],
         }
     }
+
+    /// Stable opaque id derived from the SHA-256 of the API key material,
+    /// truncated to 16 bytes (32 hex chars). The key value itself is
+    /// never exposed; rotating the key produces a new id (which is the
+    /// desired behavior — rotated keys are a different principal).
+    ///
+    /// Format: `apikey:<32-hex-chars>`.
+    fn stable_id(&self) -> Option<Cow<'_, str>> {
+        let key_bytes = match self {
+            ApiKey::ReadOnly { key } | ApiKey::ReadWrite { key } => key.as_bytes(),
+        };
+        let digest = Sha256::digest(key_bytes);
+        let mut id = String::with_capacity("apikey:".len() + 32);
+        id.push_str("apikey:");
+        for byte in &digest[..16] {
+            id.push(HEX[usize::from(byte >> 4)] as char);
+            id.push(HEX[usize::from(byte & 0x0f)] as char);
+        }
+        Some(Cow::Owned(id))
+    }
 }
 
+const HEX: &[u8; 16] = b"0123456789abcdef";
+
 pub trait HttpAuth {
     /// Receive the entire HTTP request object and return a verdict on whether to allow/deny it
     ///
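
The hex-encoding step in `ApiKey::stable_id` can be exercised without the `sha2` crate by feeding it a fixed digest. This is a sketch for illustration: `apikey_id_from_digest` is a hypothetical helper that takes an already-computed 32-byte digest, so it shows only the truncate-and-encode half of the method, not the hashing.

```rust
const HEX: &[u8; 16] = b"0123456789abcdef";

/// Hypothetical helper: formats the first 16 bytes of a 32-byte digest as
/// `apikey:<32 lowercase hex chars>`, mirroring the loop in the diff above.
fn apikey_id_from_digest(digest: &[u8; 32]) -> String {
    let mut id = String::with_capacity("apikey:".len() + 32);
    id.push_str("apikey:");
    for byte in &digest[..16] {
        // High nibble first, then low nibble: two hex chars per byte.
        id.push(HEX[usize::from(*byte >> 4)] as char);
        id.push(HEX[usize::from(*byte & 0x0f)] as char);
    }
    id
}
```

Sixteen bytes at two hex characters each gives the 32-character suffix stated in the doc comment, so the full id is always 39 characters long including the `apikey:` prefix.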
Lines changed: 197 additions & 0 deletions
@@ -0,0 +1,197 @@
+/*
+Copyright 2024-2026 The Spice.ai OSS Authors
+
+Licensed under the Apache License, Version 2.0 (the "License");
+you may not use this file except in compliance with the License.
+You may obtain a copy of the License at
+
+     https://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License.
+*/
+
+//! Cache namespace for per-user (and per-system) cache scoping.
+//!
+//! Every cache surface in the runtime — SQL results, search, plan cache,
+//! HTTP-connector caching accelerator — mixes the request's [`CacheNamespace`]
+//! into its hash input. Two requests with different namespaces hash to
+//! different cache keys for otherwise-identical inputs, which is what makes
+//! cached results safe to serve under user-to-machine authentication.
+//!
+//! Scope is automatic and not user-configurable:
+//! - No authenticated principal (or an anonymous principal) → [`CacheNamespace::Public`].
+//! - Internal/system traffic (refresh tasks, cluster RPCs, SWR revalidation,
+//!   health checks) → [`CacheNamespace::System`].
+//! - Authenticated principal → [`CacheNamespace::Principal`] keyed on the
+//!   principal's stable opaque id.
+//!
+//! See `plans/per-user-cache-option-1-namespace.md` and issue #10680 for the
+//! full design.
+
+use std::hash::Hash;
+use std::sync::Arc;
+
+/// The cache scope a request is executing under.
+#[derive(Debug, Clone, PartialEq, Eq, Hash)]
+pub enum CacheNamespace {
+    /// No authenticated principal; cached entries may be shared across all
+    /// unauthenticated callers. This is the default for unauthenticated
+    /// deployments and matches today's pre-isolation behavior.
+    Public,
+    /// Cached entries are scoped to a single authenticated principal.
+    /// The inner string is a stable opaque identifier produced by the
+    /// principal's `stable_id()`. It is **not** the bearer token or API
+    /// key value; rotating the credential without changing the underlying
+    /// identity must not invalidate the cache.
+    Principal(Arc<str>),
+    /// Internal / system traffic: refresh tasks, cluster RPCs, SWR
+    /// background revalidation, health probes. Entries written under
+    /// `System` are not visible to user requests.
+    System,
+}
+
+impl CacheNamespace {
+    /// A short, low-cardinality string suitable for telemetry dimensions.
+    /// **Must never** include the principal id, which is high-cardinality
+    /// and (depending on auth method) sensitive.
+    #[must_use]
+    pub fn kind(&self) -> &'static str {
+        match self {
+            CacheNamespace::Public => "public",
+            CacheNamespace::Principal(_) => "principal",
+            CacheNamespace::System => "system",
+        }
+    }
+
+    /// A short, lower-case label suitable for the `Results-Cache-Scope` /
+    /// `Search-Results-Cache-Scope` HTTP response headers. Uses HTTP/CDN
+    /// vocabulary (`shared` / `user` / `system`) rather than the internal
+    /// enum names so it matches operator expectations from `Cache-Control:
+    /// private` and similar.
+    #[must_use]
+    pub fn as_header_value(&self) -> &'static str {
+        match self {
+            CacheNamespace::Public => "shared",
+            CacheNamespace::Principal(_) => "user",
+            CacheNamespace::System => "system",
+        }
+    }
+
+    /// Inputs to be folded into a cache key hash so two distinct namespaces
+    /// produce distinct cache keys for otherwise-identical payloads.
+    ///
+    /// Returns `(tag, id_bytes)` where `tag` is the namespace discriminant
+    /// (kept stable across releases so cached entries survive upgrades):
+    /// - `0` → [`CacheNamespace::Public`] (no id)
+    /// - `1` → [`CacheNamespace::Principal`] (`id_bytes` is the principal's
+    ///   stable opaque id)
+    /// - `2` → [`CacheNamespace::System`] (no id)
+    ///
+    /// Pair with `cache::key::CacheKey::as_raw_key_in_namespace`.
+    #[must_use]
+    pub fn hash_inputs(&self) -> (u8, &[u8]) {
+        match self {
+            CacheNamespace::Public => (0, &[]),
+            CacheNamespace::Principal(id) => (1, id.as_bytes()),
+            CacheNamespace::System => (2, &[]),
+        }
+    }
+
+    /// Stable string identifier suitable for storage inside the caching
+    /// accelerator's `__spice_cache_namespace` column. Comparing rows by
+    /// this value is what enforces cross-principal isolation in the
+    /// accelerator.
+    ///
+    /// Format:
+    /// - [`CacheNamespace::Public`] → `"public"`
+    /// - [`CacheNamespace::System`] → `"system"`
+    /// - [`CacheNamespace::Principal`] → the principal's opaque id
+    ///   verbatim (e.g. `"apikey:0123456789abcdef"`)
+    ///
+    /// The values are stable across releases; persisted cache rows must
+    /// continue to compare equal after upgrades.
+    #[must_use]
+    pub fn storage_id(&self) -> &str {
+        match self {
+            CacheNamespace::Public => "public",
+            CacheNamespace::Principal(id) => id.as_ref(),
+            CacheNamespace::System => "system",
+        }
+    }
+}
+
+#[cfg(test)]
+mod tests {
+    use super::*;
+
+    #[test]
+    fn kind_is_low_cardinality_and_excludes_id() {
+        assert_eq!(CacheNamespace::Public.kind(), "public");
+        assert_eq!(CacheNamespace::System.kind(), "system");
+        let with_id = CacheNamespace::Principal(Arc::from("apikey:0123456789abcdef"));
+        assert_eq!(with_id.kind(), "principal");
+        // The id must not appear anywhere in the dimension value.
+        assert!(!with_id.kind().contains("0123456789abcdef"));
+    }
+
+    #[test]
+    fn header_value_uses_http_vocabulary() {
+        assert_eq!(CacheNamespace::Public.as_header_value(), "shared");
+        assert_eq!(
+            CacheNamespace::Principal(Arc::from("apikey:abc")).as_header_value(),
+            "user"
+        );
+        assert_eq!(CacheNamespace::System.as_header_value(), "system");
+    }
+
+    #[test]
+    fn equal_principal_ids_are_equal_namespaces() {
+        let a = CacheNamespace::Principal(Arc::from("u2m:sub:42"));
+        let b = CacheNamespace::Principal(Arc::from("u2m:sub:42"));
+        assert_eq!(a, b);
+    }
+
+    #[test]
+    fn different_principal_ids_are_distinct() {
+        let a = CacheNamespace::Principal(Arc::from("u2m:sub:42"));
+        let b = CacheNamespace::Principal(Arc::from("u2m:sub:43"));
+        assert_ne!(a, b);
+    }
+
+    #[test]
+    fn hash_inputs_distinct_per_variant() {
+        let pub_ns = CacheNamespace::Public;
+        let sys_ns = CacheNamespace::System;
+        let principal_ns = CacheNamespace::Principal(Arc::from("apikey:abc"));
+        let pub_inputs = pub_ns.hash_inputs();
+        let sys_inputs = sys_ns.hash_inputs();
+        let p_inputs = principal_ns.hash_inputs();
+
+        assert_eq!(pub_inputs, (0, &[][..]));
+        assert_eq!(sys_inputs, (2, &[][..]));
+        assert_eq!(p_inputs.0, 1);
+        assert_eq!(p_inputs.1, b"apikey:abc");
+        // Tags must be distinct so empty-id Public and System never collide.
+        assert_ne!(pub_inputs.0, sys_inputs.0);
+    }
+
+    #[test]
+    fn storage_id_is_stable_per_variant() {
+        assert_eq!(CacheNamespace::Public.storage_id(), "public");
+        assert_eq!(CacheNamespace::System.storage_id(), "system");
+        assert_eq!(
+            CacheNamespace::Principal(Arc::from("apikey:abc")).storage_id(),
+            "apikey:abc",
+        );
+        // public and system are distinct so an unauthenticated user and a
+        // background system task cannot collide on cached rows.
+        assert_ne!(
+            CacheNamespace::Public.storage_id(),
+            CacheNamespace::System.storage_id(),
+        );
+    }
+}
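
The module docs state the scoping rules but this diff does not show how `RequestContext::cache_namespace()` applies them. The sketch below is one possible resolution order and is an assumption throughout: `resolve_namespace`, its parameters, and the precedence (explicit override, then internal traffic, then the principal's stable id) are hypothetical, and the enum is redeclared locally so the block compiles on its own.

```rust
use std::sync::Arc;

/// Local copy of the enum from the diff above, so the sketch is self-contained.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
enum CacheNamespace {
    Public,
    Principal(Arc<str>),
    System,
}

/// Hypothetical resolution order (assumed, not taken from the commit):
/// 1. an explicit override, e.g. one inherited by SWR background revalidation;
/// 2. internal traffic maps to `System`;
/// 3. a principal with a stable id maps to `Principal(id)`;
/// 4. everything else shares `Public`.
fn resolve_namespace(
    override_ns: Option<CacheNamespace>,
    is_internal: bool,
    stable_id: Option<&str>,
) -> CacheNamespace {
    if let Some(ns) = override_ns {
        return ns;
    }
    if is_internal {
        return CacheNamespace::System;
    }
    match stable_id {
        Some(id) => CacheNamespace::Principal(Arc::from(id)),
        None => CacheNamespace::Public,
    }
}
```

Placing the override first is what would let a background refresh run on an internal context while still reading and writing under the originating user's scope, as the SWR milestone in the commit message describes.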
