fix: resolve seek_to hang by routing time-based seeks via streaming store #1140

Open
SBALAVIGNESH123 wants to merge 2 commits into timeplus-io:develop from
SBALAVIGNESH123:fix/seek-to-streaming-store-routing

Conversation

@SBALAVIGNESH123
Contributor

Fixes #558
When a time-based seek_to such as '-15m' is used on a stream with a large TTL (say 30 days), the query appears to hang because the backfill path kicks in: it scans all historical MergeTree parts before streaming can begin, and ConcatStep blocks output until that entire scan is done. On a busy stream, the scan takes so long that the query looks stuck.
The root cause is that getQueryMode() always returns StreamingConcat for time-based seeks when backfill is enabled, even when the NativeLog streaming store still has the requested data readily available.
This fix adds a simple check before entering StreamingConcat: we probe NativeLog using the existing sequencesForTimestamps() API to see if the streaming store still holds data for the requested time range. If it does, we convert the time-based seek into a sequence-based seek and use QueryMode::Streaming instead — completely skipping the expensive historical scan. If the data has been compacted away from NativeLog, we fall back to the existing StreamingConcat behavior with no regression.
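The routing decision described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `chooseQueryMode`, `SeekDecision`, and `StoreProbe` are hypothetical names, and the probe callable merely stands in for whatever tryResolveTimeSeekViaStreamingStore() does with sequencesForTimestamps(). It also assumes that with backfill disabled the query stays in Streaming mode with the time-based seek left untouched.

```cpp
#include <cstdint>
#include <functional>
#include <optional>

// Mode names taken from the PR description; the enum itself is a stand-in.
enum class QueryMode { Streaming, StreamingConcat };

// Hypothetical result type: the chosen mode plus the sequence number to
// start from when the time-based seek was converted to a sequence-based one.
struct SeekDecision
{
    QueryMode mode;
    std::optional<int64_t> start_sequence;
};

// Hypothetical probe: returns a sequence number if the streaming store
// still holds data for the requested time, std::nullopt otherwise.
using StoreProbe = std::function<std::optional<int64_t>()>;

inline SeekDecision chooseQueryMode(bool backfill_enabled, const StoreProbe & probe)
{
    // Assumption: without backfill there is no historical scan to avoid.
    if (!backfill_enabled)
        return {QueryMode::Streaming, std::nullopt};

    // Streaming store still covers the range: seek by sequence, skip the scan.
    if (auto sequence = probe())
        return {QueryMode::Streaming, sequence};

    // Data no longer in the streaming store: keep the existing behavior.
    return {QueryMode::StreamingConcat, std::nullopt};
}
```

The probe is passed as a callable so that it only runs when backfill is enabled, which matches the "check before entering StreamingConcat" ordering described above.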
This mirrors the pattern Kafka consumers use: offsetsForTimes() maps timestamps to offsets, and the consumer then seeks to the returned offset.
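To illustrate the timestamp-to-position lookup in isolation, here is a toy model over a sorted in-memory index. It is neither Kafka's nor NativeLog's implementation; `IndexEntry` and `sequenceForTimestamp` are hypothetical names. It returns the earliest entry whose timestamp is at or after the target, and, as an added assumption for this sketch, treats a target older than the oldest retained entry as "not covered" so that a caller could fall back to the historical path.

```cpp
#include <algorithm>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical index entry: one (timestamp, sequence) pair per record,
// sorted by timestamp in ascending order.
struct IndexEntry
{
    int64_t timestamp_ms;
    int64_t sequence;
};

// Returns the sequence of the earliest entry with timestamp >= target_ms.
// Returns std::nullopt if the index is empty, if the target is older than
// the oldest retained entry, or if every entry is older than the target.
inline std::optional<int64_t> sequenceForTimestamp(const std::vector<IndexEntry> & index, int64_t target_ms)
{
    if (index.empty() || target_ms < index.front().timestamp_ms)
        return std::nullopt;

    // Binary search for the first entry that is not older than the target.
    auto it = std::lower_bound(
        index.begin(), index.end(), target_ms,
        [](const IndexEntry & entry, int64_t ts) { return entry.timestamp_ms < ts; });

    if (it == index.end())
        return std::nullopt;
    return it->sequence;
}
```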
Files changed:

  • StorageStream.h / StorageStream.cpp — Added tryResolveTimeSeekViaStreamingStore() to probe NativeLog
  • PruneShards.cpp — Added smart routing in getQueryMode() before entering StreamingConcat

…tore

When seek_to uses a relative or absolute time (e.g. '-15m') with
enable_backfill_from_historical_store enabled, the query enters
StreamingConcat mode which scans all historical MergeTree parts
before streaming begins, causing the query to appear hung.

This fix probes NativeLog first: if the streaming store still has
the requested data, convert the time-based seek to a sequence-based
seek and use QueryMode::Streaming, bypassing the expensive historical
scan entirely. Falls back to existing StreamingConcat when the data
has been compacted.

Fixes timeplus-io#558

Development

Successfully merging this pull request may close these issues.

seek_to not working as expected