docs/website/docs/features/caching/index.md at 08dd8e03def125b2bf6358280aa94ce0af758d81 · spiceai/docs

title

Caching

sidebar_label

Caching

description

Learn how to use Spice in-memory caching

sidebar_position

pagination_prev

pagination_next

`caching` Parameters

Parameter name	Optional	Description
`sql_results`	Yes	Enabled by default. Configures the Runtime cache for results from SQL queries. See the SQL Results Parameters for cache parameter details.
`search_results`	Yes	Enabled by default. Configures the Runtime cache for results from searches. See the Common Caching Parameters for cache parameter details.
`embeddings`	Yes	Enabled by default. Configures the Runtime cache for embeddings requests. See the Common Caching Parameters for cache parameter details.

Common Caching Parameters

Every cache type (sql_results, search_results, embeddings) supports the following parameters:

Parameter name	Optional	Default	Description
`enabled`	Yes	`true`	Defaults to `true`.
`max_size`	Yes	`128MiB`	Maximum cache size. Defaults to `128MiB`.
`eviction_policy`	Yes	`lru`	Cache replacement policy when the cache reaches `max_size`. Defaults to `lru`. Supports `lru` (Least Recently Used) and `tiny_lfu` (Tiny Least Frequently Used, higher hit rate for skewed access patterns).
`item_ttl`	Yes	`1s`	Cache entry expiration duration (Time to Live). Defaults to 1 second.
`hashing_algorithm`	Yes	`xxh3`	Selects which hashing algorithm is used to hash the cache keys when storing the results. Defaults to `xxh3`. Supports `xxh3`, `ahash`, `siphash`, `blake3`, `xxh32`, `xxh64`, or `xxh128`.

`caching.sql_results` Parameters

In addition to the common caching parameters, sql_results also supports additional parameters:

Parameter name	Optional	Default	Description
`cache_key_type`	Yes	`plan`	Determines how cache keys are generated. Defaults to `plan`. `plan` uses the query's logical plan, while `sql` uses the raw SQL query string.
`encoding`	Yes	`none`	Compression algorithm for cached results. Defaults to `none`. Supports `none` or `zstd`.
`stale_while_revalidate_ttl`	Yes	`0s`	Duration to serve stale cache entries while revalidating in the background. When set to a non-zero value, expired cache entries continue to be served while a background refresh occurs. Defaults to `0s` (disabled).

Choosing a `cache_key_type`

plan (Default): Uses the query's logical plan as the cache key. This approach matches semantically equivalent queries, even if their SQL syntax differs. However, it requires query parsing, which introduces some overhead.
sql: Uses the raw SQL string as the cache key. This method provides faster lookups but requires exact string matches. Queries with dynamic functions, such as NOW(), may produce unexpected results because the cache key changes with each execution. Use sql only when query results are predictable and consistent.

Use sql for the lowest latency with identical queries that do not include dynamic functions. Use plan for greater flexibility and semantic matching of queries.

Choosing a `hashing_algorithm`

The hashing algorithm determines how cache keys are hashed before being stored, impacting both lookup speed and protection against potential DOS attacks.

xxh3 (Default): Uses the XXH3 algorithm for hashing the cache keys. XXH3 is a fast, non-cryptographic hash algorithm that provides high performance and good distribution. It is suitable for scenarios where speed is critical and cryptographic security is not required.
siphash: Uses the SipHash1-3 algorithm for hashing the cache keys, the default hashing algorithm of Rust. This hashing algorithm is a secure algorithm that implements verified protections against "hash flooding" denial of service (DoS) attacks. Reasonably performant, and provides a high level of security.
ahash: Uses the AHash algorithm for hashing the cache keys. The AHash algorithm is a high quality hashing algorithm, and has claimed resistance against hashing DoS attacks. AHash has higher performance than SipHash1-3, especially when used with cache_key_type: plan.
blake3: Uses the BLAKE3 cryptographic hash function. BLAKE3 is a fast, parallelizable hash function that provides cryptographic security while maintaining high performance. It is suitable for scenarios requiring both speed and cryptographic guarantees.
xxh32, xxh64, xxh128: Variants of the XXH hashing algorithm with different output sizes. These algorithms offer a balance between speed and collision resistance, with larger hash sizes providing better collision resistance at the cost of performance.

Use xxh3 (the default) for its superior speed in most scenarios. Use ahash, xxh64 or xxh128 for reduced collision probability when caching a large number of queries. Use blake3 when cryptographic security is required. Use siphash when protection against hash flooding attacks is a priority.

Choosing an `encoding`

The encoding algorithm determines how cached results are compressed in memory, trading CPU for memory efficiency. Currently supported for SQL results only.

none (Default): Stores query results uncompressed. Uses more memory but has zero compression overhead. Best for small result sets or when memory is not a constraint.
zstd: Uses the Zstandard compression algorithm to compress cached query results. Provides high compression ratios (often 50-90% reduction) with fast decompression speeds. Recommended when caching large result sets to maximize cache capacity.

Use zstd when maximizing cache efficiency is important, especially for large queries that would otherwise quickly fill the cache. Use none for the lowest latency when memory is not constrained.

Per-Principal Cache Isolation

When authentication is enabled, all cache layers (SQL results, search results, and caching-mode acceleration storage) are automatically scoped per principal. Each authenticated caller has an isolated cache namespace — one caller's cached output is never served to a different caller.

Scenario	Cache namespace
Auth disabled or anonymous request	`public` — all callers share the cache
Authenticated principal (e.g., API key)	Per-principal — isolated by a hash of the principal's identity
Background / system tasks (e.g., SWR revalidation)	Inherits the originating principal's namespace

There are no user-facing knobs to disable isolation. Scope follows authentication presence by construction.

:::warning[Breaking change for caching accelerator] The caching accelerator storage schema gains an internal __spice_cache_namespace column. Existing accelerator storage from earlier versions must be deleted (e.g., remove the duckdb_file, drop the backing table) before upgrading. The runtime errors with a clear message at startup if it encounters a non-extended schema.

The SQL results cache and search cache are in-memory and require no migration. :::

Cached Responses

Responses from HTTP APIs include headers that indicate the cache status and scope:

Cache	Status Header	Scope Header
`sql_results`	`Results-Cache-Status`	`Results-Cache-Scope`
`search_results`	`Search-Results-Cache-Status`	`Search-Results-Cache-Scope`

The status header indicates the cache status:

Header value	Description
`HIT`	The query result was served from the cache.
`MISS`	The cache was checked, but the result was not found.
`BYPASS`	The cache was bypassed for this query (e.g., when `cache-control: no-cache` is specified).
`STALE`	A stale cache entry was served while the cache is being revalidated in the background (when `stale_while_revalidate_ttl` is configured).
header not present	The cache did not apply to this query (e.g., when caching is disabled or querying a system table).

The scope header indicates the cache namespace:

Scope value	Description
`shared`	The cache entry is in the public namespace (auth disabled or anonymous).
`user`	The cache entry is scoped to the authenticated principal. When the scope is `user`, the response also includes `Vary: Authorization`.
`system`	The cache entry is in the system/background namespace.

Examples

Cached Response

$ curl -XPOST -i http://localhost:8090/v1/sql -d 'select * from taxi_trips limit 1;'
HTTP/1.1 200 OK
content-type: text/plain; charset=utf-8
results-cache-status: HIT
results-cache-scope: shared
vary: origin, access-control-request-method, access-control-request-headers
content-length: 416
date: Thu, 13 Feb 2025 03:05:39 GMT

Uncached Response

$ curl -XPOST -i http://localhost:8090/v1/sql -d 'select * from taxi_trips limit 1;'
HTTP/1.1 200 OK
content-type: text/plain; charset=utf-8
results-cache-status: MISS
results-cache-scope: shared
vary: origin, access-control-request-method, access-control-request-headers
content-length: 416
date: Thu, 13 Feb 2025 03:13:19 GMT

Bypassed Cache with `cache-control: no-cache`

$ curl -H "cache-control: no-cache" -XPOST -i http://localhost:8090/v1/sql -d 'select * from taxi_trips limit 1;'
HTTP/1.1 200 OK
content-type: text/plain; charset=utf-8
results-cache-status: BYPASS
vary: origin, access-control-request-method, access-control-request-headers
content-length: 416
date: Thu, 13 Feb 2025 03:14:00 GMT

Stale Cache Response (Stale-While-Revalidate)

$ curl -XPOST -i http://localhost:8090/v1/sql -d 'select * from taxi_trips limit 1;'
HTTP/1.1 200 OK
content-type: text/plain; charset=utf-8
results-cache-status: STALE
vary: origin, access-control-request-method, access-control-request-headers
content-length: 416
date: Thu, 13 Feb 2025 03:15:30 GMT

Cache Control

You can control caching behavior for specific requests using HTTP headers. The Cache-Control header helps skip the cache for a request while caching the results for subsequent requests.

Stale-While-Revalidate

The stale_while_revalidate_ttl parameter configures a grace period during which stale cache entries continue to be served while a background refresh occurs. This technique reduces latency for end users by serving cached data immediately, even after item_ttl expires, while the system fetches fresh data asynchronously.

When stale_while_revalidate_ttl is set to a non-zero value:

Cache entries are served normally until item_ttl expires.
After item_ttl expires but before item_ttl + stale_while_revalidate_ttl expires, the stale entry is served immediately with a STALE cache status.
Simultaneously, a background task refreshes the cache entry.
Once the background refresh completes, subsequent requests receive the fresh data with a HIT cache status.
After item_ttl + stale_while_revalidate_ttl expires, the entry is evicted and the next request results in a MISS.

Example Configuration

runtime:
  caching:
    sql_results:
      enabled: true
      item_ttl: 10s
      stale_while_revalidate_ttl: 10s

With this configuration:

Fresh cache entries are served for 10 seconds after creation.
Between 10-20 seconds after creation, stale entries are served while being refreshed in the background.
After 20 seconds, the entry is evicted if not refreshed.

This approach is particularly useful for queries that take significant time to execute, providing a better user experience by reducing perceived latency while keeping data reasonably fresh.

:::warning[Conflict with Caching Accelerator SWR] When using a dataset with refresh_mode: caching, you cannot configure both the results cache's stale_while_revalidate_ttl and the caching accelerator's caching_stale_while_revalidate_ttl for the same dataset. These parameters control similar behavior at different layers.

Choose one approach:

Results cache SWR: Configure runtime.caching.sql_results.stale_while_revalidate_ttl for SQL query results caching
Caching accelerator SWR: Configure acceleration.params.caching_stale_while_revalidate_ttl for HTTP-based dataset caching

:::

HTTP/Flight API

The following endpoints support the standard HTTP Cache-Control header:

SQL query (HTTP and Arrow Flight)
Search (HTTP)

The following Cache-Control directives are supported:

Directive	Description
`no-cache`	Skips the cache for the current request but caches the results for future requests.
`min-fresh`	Specifies the minimum time (in seconds) that a cached response must remain fresh. For example, `min-fresh=60` requires the cached entry to be fresh for at least 60 more seconds.
`max-stale`	Indicates the client will accept a stale response. An optional value in seconds specifies the maximum staleness allowed. For example, `max-stale=30` accepts responses stale for up to 30 seconds.
`only-if-cached`	Returns only cached responses. If no cached response is available, returns an error instead of fetching fresh data.
`stale-if-error`	Serves stale cached responses if an error occurs while fetching fresh data. An optional value in seconds specifies how stale the response can be. For example, `stale-if-error=600` serves responses stale for up to 10 minutes if fetching fails.

HTTP Example

# Default behavior (uses cache)
curl -XPOST http://localhost:8090/v1/sql -d 'SELECT 1'

# Skip cache for this query, but cache the results for future queries
curl -H "cache-control: no-cache" -XPOST http://localhost:8090/v1/sql -d 'SELECT 1'

# Only use cached response if it will be fresh for at least 30 more seconds
curl -H "cache-control: min-fresh=30" -XPOST http://localhost:8090/v1/sql -d 'SELECT 1'

# Accept cached responses that are stale for up to 60 seconds
curl -H "cache-control: max-stale=60" -XPOST http://localhost:8090/v1/sql -d 'SELECT 1'

# Only return cached responses, fail if cache miss
curl -H "cache-control: only-if-cached" -XPOST http://localhost:8090/v1/sql -d 'SELECT 1'

# Serve stale cache (up to 300 seconds old) if fetching fresh data fails
curl -H "cache-control: stale-if-error=300" -XPOST http://localhost:8090/v1/sql -d 'SELECT 1'

Arrow FlightSQL Example

The following example skips the cache for a specific query using FlightSQL in Rust:

let sql_command = arrow_flight::sql::CommandStatementQuery {
    query: "SELECT 1".to_string(),
    transaction_id: None,
};
let sql_command_bytes = sql_command.as_any().encode_to_vec();

let mut request = FlightDescriptor::new_cmd(sql_command_bytes).into_request();

request
  .metadata_mut()
  .insert("cache-control", "no-cache");

// Send the request

The cache can be controlled using JDBC properties. For example,

Properties props = new Properties();
props.setProperty("cache-control", "no-cache");
Connection conn = DriverManager.getConnection("jdbc:arrow-flight-sql://localhost:50051", props);

`spice` CLI

The spice sql and spice search commands accept a --cache-control flag that supports all cache-control directives:

# Default behavior (use cache if available)
spice sql
# Same as above
spice sql --cache-control cache
# Skip cache for this query, but cache the results for future queries
spice sql --cache-control no-cache
# Only use cached response if fresh for at least 30 more seconds
spice sql --cache-control min-fresh=30
# Accept cached responses stale for up to 60 seconds
spice sql --cache-control max-stale=60
# Only return cached responses, fail if cache miss
spice sql --cache-control only-if-cached
# Serve stale cache (up to 300 seconds) if fetching fails
spice sql --cache-control stale-if-error=300

# Default behavior (use cache if available)
spice search
# Same as above
spice search --cache-control cache
# Skip cache for this search, but cache the results for future searches
spice search --cache-control no-cache
# Accept stale search results up to 60 seconds old
spice search --cache-control max-stale=60

Custom Cache Keys

Set the Spice-Cache-Key header to supply a custom cache key. When set, a supplied cache key takes precedence over caching.sql_results.cache_key_type.

:::info[Info] A valid cache key consists of up to 128 alphanumeric characters (and the characters - and _). :::

HTTP Example

Consider the case of two semantically equivalent queries:

Time: 0.0251325 seconds. 2 rows.
sql> select * from users where org_id = 1;
+----+--------+-------+----------------+
| id | org_id | name  | email          |
+----+--------+-------+----------------+
| 1  | 1      | Jane  | jane@spice.ai  |
| 2  | 1      | Sarah | sarah@spice.ai |
+----+--------+-------+----------------+

Time: 0.008993042 seconds. 2 rows.
sql> select * from users where split_part(email, '@', 2) = 'spice.ai';
+----+--------+-------+----------------+
| id | org_id | name  | email          |
+----+--------+-------+----------------+
| 1  | 1      | Jane  | jane@spice.ai  |
| 2  | 1      | Sarah | sarah@spice.ai |
+----+--------+-------+----------------+

To share a cache key for these queries, set Spice-Cache-Key. The first request is a cache miss:

$ curl -i -XPOST http://localhost:8090/v1/sql -H"spice-cache-key: users_spiceai" -d "select * from users where org_id = 1;"
HTTP/1.1 200 OK
content-type: application/json
x-cache: Miss from spiceai
results-cache-status: MISS
vary: Spice-Cache-Key
vary: origin, access-control-request-method, access-control-request-headers
content-length: 119
date: Thu, 24 Jul 2025 14:15:53 GMT

[{"id":1,"org_id":1,"name":"Jane","email":"jane@spice.ai"},{"id":2,"org_id":1,"name":"Sarah","email":"sarah@spice.ai"}]

The subsequent request with the different (but semantically equivalent) query is a cache hit:

$ curl -i -XPOST http://localhost:8090/v1/sql -H"spice-cache-key: users_spiceai" -d "select * from users where split_part(email, '@', 2) = 'spice.ai';"
HTTP/1.1 200 OK
content-type: application/json
x-cache: Hit from spiceai
results-cache-status: HIT
vary: Spice-Cache-Key
vary: origin, access-control-request-method, access-control-request-headers
content-length: 119
date: Thu, 24 Jul 2025 14:18:00 GMT

:::warning[Note] When supplying a custom cache key, ensure the semantic equivalence of queries. For example, this is expected behavior:

$ curl -i -XPOST http://localhost:8090/v1/sql -H"spice-cache-key: users_spiceai" -d "select 1"
HTTP/1.1 200 OK
content-type: application/json
x-cache: Hit from spiceai
results-cache-status: HIT
vary: Spice-Cache-Key
vary: origin, access-control-request-method, access-control-request-headers
content-length: 119
date: Thu, 24 Jul 2025 14:21:32 GMT

[{"id":1,"org_id":1,"name":"Jane","email":"jane@spice.ai"},{"id":2,"org_id":1,"name":"Sarah","email":"sarah@spice.ai"}]

:::

Metrics

Cache metrics can be monitored using the Prometheus-compatible Metrics Endpoint. The following metrics are available for each cache type:

Metric	Type	Description
`*_cache_max_size_bytes`	Gauge	Maximum configured cache size in bytes.
`*_cache_requests`	Counter	Total number of cache lookup requests.
`*_cache_hits`	Counter	Total number of cache hits.
`*_cache_items_count`	Gauge	Current number of items in the cache.
`*_cache_size_bytes`	Gauge	Current cache size in bytes.
`*_cache_evictions`	Counter	Total number of cache evictions due to size or TTL limits.
`*_cache_hit_ratio`	Gauge	Current cache hit ratio (hits / total requests).

The * prefix corresponds to the cache type:

results_* - SQL query results cache metrics
search_results_* - Search results cache metrics
embeddings_* - Embeddings cache metrics

Example metrics output:

# HELP results_cache_evictions Number of cache evictions.
# TYPE results_cache_evictions counter
results_cache_evictions 2
# HELP results_cache_hit_ratio Cache hit ratio (hits / total requests).
# TYPE results_cache_hit_ratio gauge
results_cache_hit_ratio 0.625
# HELP results_cache_hits Cache hit count.
# TYPE results_cache_hits counter
results_cache_hits 14
# HELP results_cache_items_count Number of items currently in the cache.
# TYPE results_cache_items_count gauge
results_cache_items_count 1
# HELP results_cache_max_size_bytes Maximum allowed size of the cache in bytes.
# TYPE results_cache_max_size_bytes gauge
results_cache_max_size_bytes 134217728
# HELP results_cache_misses Cache miss count.
# TYPE results_cache_misses counter
results_cache_misses 4
# HELP results_cache_requests Number of requests to get a key from the cache.
# TYPE results_cache_requests counter
results_cache_requests 18
# HELP results_cache_size_bytes Size of the cache in bytes.
# TYPE results_cache_size_bytes gauge
results_cache_size_bytes 7776

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

`caching` Parameters

Common Caching Parameters

`caching.sql_results` Parameters

Choosing a `cache_key_type`

Choosing a `hashing_algorithm`

Choosing an `encoding`

Per-Principal Cache Isolation

Cached Responses

Examples

Cached Response

Uncached Response

Bypassed Cache with `cache-control: no-cache`

Stale Cache Response (Stale-While-Revalidate)

Cache Control

Stale-While-Revalidate

Example Configuration

HTTP/Flight API

HTTP Example

Arrow FlightSQL Example

`spice` CLI

Custom Cache Keys

HTTP Example

Metrics

FilesExpand file tree

index.md

Latest commit

History

index.md

File metadata and controls

caching Parameters

Common Caching Parameters

caching.sql_results Parameters

Choosing a cache_key_type

Choosing a hashing_algorithm

Choosing an encoding

Per-Principal Cache Isolation

Cached Responses

Examples

Cached Response

Uncached Response

Bypassed Cache with cache-control: no-cache

Stale Cache Response (Stale-While-Revalidate)

Cache Control

Stale-While-Revalidate

Example Configuration

HTTP/Flight API

HTTP Example

Arrow FlightSQL Example

spice CLI

Custom Cache Keys

HTTP Example

Metrics

`caching` Parameters

`caching.sql_results` Parameters

Choosing a `cache_key_type`

Choosing a `hashing_algorithm`

Choosing an `encoding`

Bypassed Cache with `cache-control: no-cache`

`spice` CLI