Result Rows Caching #57

Draft: wants to merge 4 commits into main

Conversation

@winsmith (Contributor) commented Apr 25, 2025

This PR experiments with various ways of caching individual result rows of query results.

  • Time series query results have a granularity as defined by the query, and each row has a timestamp
  • Top N query results have a granularity as defined by the query, and each row has a timestamp
  • GroupBy query results have a granularity as defined by the query, and each row has a timestamp

These result types are therefore candidates for a caching approach where we cache individual rows and only calculate the ones that are missing or outdated. (The Druid server could tell more accurately which rows are outdated, but we're ignoring that for the sake of simplicity in this experiment.)
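
To make the per-row idea concrete, here is a minimal sketch of what a cache entry could look like. The names (`RowCacheKey`, `CachedRow`) and their fields are assumptions for illustration only, not existing types:

```swift
import Foundation

/// Hypothetical shape of a per-row cache entry. Neither type exists yet;
/// they only illustrate how a row could be keyed by the interval-independent
/// query hash, the granularity, and the row's own timestamp.
struct RowCacheKey: Hashable, Codable {
    let intervalIndependentHash: String
    let granularity: String   // e.g. "hour", "day"
    let timestamp: Date       // the row's bucket start, ISO 8601 in Druid results
}

struct CachedRow: Codable {
    let key: RowCacheKey
    let payload: Data         // the encoded result row
    let cachedAt: Date        // lets us expire or recompute volatile rows later
}
```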

Process

  1. A query comes in; it contains at least one relative or absolute interval.
  2. We generate an IntervalIndependentHash: a hash of a copy of the query with all intervals removed, since intervals are irrelevant for this type of caching.
  3. We generate a list of all time segments needed to fulfill the query within its intervals.
  4. For each time segment, we query the cache under IntervalIndependentHash + granularity + window + ISO 8601 date for an existing row.
  5. We generate new intervals covering all missing rows.
  6. We run a query with these intervals.
  7. We store all rows that we don't deem volatile in the cache.
  8. We build and return a full query result (see the sketch after this list).
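
A rough sketch of how these steps could fit together. Every type and method name here (`Query`, `RowCache`, `DruidClient`, `timeSegments(for:)`, etc.) is a stand-in invented for this sketch, not the project's real API; the tasks below cover the pieces that would actually need to be implemented:

```swift
import Foundation

// Stand-in types so the sketch is self-contained; none of these are real
// project types, and the placeholder bodies below are not implementations.

struct TimeSegment: Hashable { let start: Date; let end: Date }

struct ResultRow { let timestamp: Date; let values: [String: Double] }

protocol RowCache {
    func row(hash: String, granularity: String, segmentStart: Date) -> ResultRow?
    func store(_ row: ResultRow, hash: String, granularity: String)
}

protocol DruidClient {
    /// Runs the query restricted to the given intervals and returns its rows.
    func run(_ query: Query, intervals: [DateInterval]) async throws -> [ResultRow]
}

struct Query {
    var granularity: String
    var intervals: [DateInterval]
    /// Step 2: hash of the query with all intervals stripped (task below).
    var intervalIndependentHash: String { "" /* to be implemented */ }
    /// Step 3 helper: splits an interval into granularity-sized segments (task below).
    func timeSegments(for interval: DateInterval) -> [TimeSegment] { [] /* to be implemented */ }
}

/// Orchestrates steps 1–8 of the process described above.
func cachedResult(for query: Query, cache: RowCache, druid: DruidClient) async throws -> [ResultRow] {
    let hash = query.intervalIndependentHash

    // Step 3: every time segment needed to cover the query's intervals.
    let needed = query.intervals.flatMap { query.timeSegments(for: $0) }

    // Step 4: split segments into cached rows and segments we still have to compute.
    var cached: [ResultRow] = []
    var missing: [TimeSegment] = []
    for segment in needed {
        if let row = cache.row(hash: hash, granularity: query.granularity, segmentStart: segment.start) {
            cached.append(row)
        } else {
            missing.append(segment)
        }
    }

    // Steps 5 + 6: query Druid only for the intervals covering the missing segments.
    let missingIntervals = missing.map { DateInterval(start: $0.start, end: $0.end) }
    var fresh: [ResultRow] = []
    if !missingIntervals.isEmpty {
        fresh = try await druid.run(query, intervals: missingIntervals)
    }

    // Step 7: cache fresh rows whose bucket has already closed (non-volatile).
    // For simplicity this assumes one row per missing segment, returned in order.
    let now = Date()
    for (segment, row) in zip(missing, fresh) where segment.end <= now {
        cache.store(row, hash: hash, granularity: query.granularity)
    }

    // Step 8: combine cached and fresh rows into one result, ordered by timestamp.
    return (cached + fresh).sorted { $0.timestamp < $1.timestamp }
}
```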

This can be enhanced later with windowed caching, where we cache complete results for fixed, non-overlapping time windows (e.g. per-day or per-week blocks).

Tasks

  • implement Query.intervalIndependentHash
  • implement TimeInterval.timeSegments(with: granularity) (a possible starting point is sketched below)
  • implement a way of generating new time intervals from old TimeIntervals minus time segments (should time segments be their own struct?)
  • implement combining of query results
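
For the timeSegments task, a possible starting point using `Calendar`. The `Granularity` enum and the free-standing function are assumptions for this sketch; the real implementation would presumably live on the project's `TimeInterval` type and use its own granularity representation:

```swift
import Foundation

/// Hypothetical granularity cases; the real query type has its own representation.
enum Granularity {
    case hour, day, week, month

    var calendarComponent: Calendar.Component {
        switch self {
        case .hour: return .hour
        case .day: return .day
        case .week: return .weekOfYear
        case .month: return .month
        }
    }
}

/// Splits `interval` into consecutive segments aligned to granularity buckets,
/// clamped to the interval (step 3 of the process above). Partial buckets at
/// either edge would typically be treated as volatile and not cached.
func timeSegments(in interval: DateInterval,
                  granularity: Granularity,
                  calendar: Calendar = .current) -> [DateInterval] {
    var segments: [DateInterval] = []
    // Align the cursor to the start of the bucket containing the interval's start.
    var cursor = calendar.dateInterval(of: granularity.calendarComponent, for: interval.start)?.start ?? interval.start
    while cursor < interval.end {
        guard let next = calendar.date(byAdding: granularity.calendarComponent, value: 1, to: cursor) else { break }
        segments.append(DateInterval(start: max(cursor, interval.start), end: min(next, interval.end)))
        cursor = next
    }
    return segments
}
```

Whether segments end up as plain `DateInterval`s or their own struct (as the third task asks) is still open; the key point is that each segment maps to exactly one cacheable row.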
