fix: Snapshots as starting point for eventsBySlices, take 2 #717
Conversation
* The initial implementation suffered from two problems:
  1. Not all entities have snapshots, so it would miss those events when adjusting the start offset.
  2. When emitting all snapshots first, followed by events, the offset storage can get too far ahead, and events might be missed when restarting from stored offsets.
* Instead, this implementation drives the stream from the ordinary eventsBySlices query:
  - It reads the snapshot sequence number on the first occurrence of a persistence id, and keeps that in memory.
  - It loads the snapshot and emits it when the corresponding event sequence number is seen.
  - It skips events with a sequence number lower than the snapshot sequence number.
* This means that it must still read all events, but the benefit is that it doesn't have to process events before snapshots.
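For illustration, a minimal, self-contained Scala sketch of that per-persistence-id decision logic. All names here (Env, SnapshotState, snapshotSeqNrFor, the Decision values) are hypothetical stand-ins, not the actual akka-persistence-r2dbc types, and edge cases around restarts are simplified.

final case class Env(persistenceId: String, seqNr: Long, payload: Any)
final case class SnapshotState(seqNr: Long, emitted: Boolean)

sealed trait Decision
case object EmitEvent extends Decision
case object EmitSnapshotInsteadOfEvent extends Decision
case object SkipEvent extends Decision

// snapshotSeqNrFor reads only the seqNr of the latest snapshot (0 if none)
final class SnapshotStartingPoint(snapshotSeqNrFor: String => Long) {
  // snapshot seqNr per persistence id, read on first occurrence and kept in memory
  private var state = Map.empty[String, SnapshotState]

  def decide(env: Env): Decision = {
    val s = state.getOrElse(
      env.persistenceId, {
        val initial = SnapshotState(snapshotSeqNrFor(env.persistenceId), emitted = false)
        state = state.updated(env.persistenceId, initial)
        initial
      })
    if (env.seqNr > s.seqNr)
      EmitEvent // past the snapshot, normal event processing
    else if (!s.emitted && env.seqNr == s.seqNr) {
      // load the full snapshot lazily and emit it in place of this event
      state = state.updated(env.persistenceId, s.copy(emitted = true))
      EmitSnapshotInsteadOfEvent
    } else
      SkipEvent // the event is already covered by the snapshot
  }
}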
pvlugter
left a comment
Looks good.
A deeper change, but could we lazily load the event payloads, like for backtracking, so that we can process all the filtered events more efficiently?
That would be good for the catch-up scenario, but probably not for normal processing. Not sure we should complicate it with some adaptive hybrid approach. At least, let's try this first with some realistic data. I made one optimization: load only the seqNr of the snapshot first, and then load the full snapshot when it's emitted: de75603
johanandren
left a comment
Looks good. Is there anything that needs to be thought through or tested around what happens in a running system where new snapshots are taken while the replay is ongoing?
-- `snapshot_slice_idx` is only needed if the slice based queries are used together with snapshot as starting point
CREATE INDEX IF NOT EXISTS snapshot_slice_idx ON snapshot(slice, entity_type, db_timestamp);
Add something in the migration guide about dropping the index, or do we expect nobody was using this yet?
Start from snapshots is most useful for rebuilds, so optimising this further could make sense. If we did find that useful, we could do something that combines the old approach with the new approach. The old approach did all the snapshots first, then regular event queries, which doesn't work. But we could take the idea of first doing an initial phase, where we process all the earlier events in a filtered, lazy-load way together with the initial snapshots, until we pass the max snapshot timestamp, and then switch fully over to regular event processing after that. So rather than processing all snapshots and then starting event processing from the min snapshot timestamp, process catch-up until the max snapshot timestamp to initialise the projection.
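As a rough Scala sketch of that two-phase idea, assuming Akka Streams and hypothetical helpers (catchUpSource and regularSource are illustrative, not existing APIs):

import java.time.Instant
import akka.NotUsed
import akka.stream.scaladsl.Source

// Phase 1 initialises the projection from filtered, lazily loaded events and
// snapshots up to the max snapshot timestamp; phase 2 is regular event processing.
def snapshotInitialisedEvents[Envelope](
    maxSnapshotTimestamp: Instant,
    catchUpSource: Instant => Source[Envelope, NotUsed],
    regularSource: Instant => Source[Envelope, NotUsed]): Source[Envelope, NotUsed] =
  catchUpSource(maxSnapshotTimestamp).concat(regularSource(maxSnapshotTimestamp))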
* because then it does not depend on a db migration of the db_timestamp
// we can't ignore it when snapshot is not emitted, because it might have been emitted in
// previous incarnation, but then the stream was restarted
push(out, env)
} else if (!s.emitted && env.sequenceNr == s.seqNr) {
How does this work with event deletion? Do we only delete up to the sequence number before the snapshot, not the sequence nr of the snapshot itself? (Otherwise this will not work together with deletion.)
We, as in the Akka runtime/SDK, don't delete events until the whole entity is deleted, and then all events and snapshots are deleted, much later.
I'm not sure if snapshotting and retention strategies in EventSourcedBehavior can be set up to delete the event with seqNr == snapshot seqNr. However, deleting events combined with projections requires consideration anyway, and we have that documented somewhere.
Alright, as long as we warn a bit about it somewhere, that might be good enough.
} else {
  // snapshot will be emitted later, ignore event
  updateState(snap.persistenceId, snap.seqNr, emitted = false)
  tryPullOrComplete()
I think this covers my previous (vague) concern: if there was a new snapshot taken after the result of seqNrOfCorrespondingSnapshot, that means this load returns a newer snapshot than the env triggering it and we'll end up here. Seems fine.
new GraphStageLogic(shape) with InHandler with OutHandler { self =>
  private implicit def ec: ExecutionContext = materializer.executionContext

  private var snapshotState = Map.empty[String, SnapshotState]
I still need to implement some kind of eviction of this map.
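One possible shape for that, purely as a hypothetical sketch (the lastSeen field and evictAfter threshold are not in the PR):

import java.time.{ Duration, Instant }

final case class SnapshotState(seqNr: Long, emitted: Boolean, lastSeen: Instant)

// drop entries for persistence ids that haven't been seen for a while
def evict(state: Map[String, SnapshotState], now: Instant, evictAfter: Duration): Map[String, SnapshotState] =
  state.filter { case (_, s) => Duration.between(s.lastSeen, now).compareTo(evictAfter) < 0 }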
updateState(env.persistenceId, seqNr, emitted = false)
push(env)
} else if (filterCount >= heartbeatAfter) {
  pushHeartbeat()
I have added heartbeats for when many events have been filtered out. That will give observability of progress and offset storage progress downstream. I will have to deal with these heartbeats in projections; as is, they would be like filtered events, and then probably handled as duplicates in the offset store. 8847b84
Nice, useful to have the heartbeats.
pvlugter
left a comment
LGTM. Updated approach looks good. Just a question around the state updates for heartbeats, and possible simplification of the logic here.
} else if (filterCount >= heartbeatAfter) {
  pushHeartbeat()
} else {
  // snapshot will be emitted later, ignore event
  updateState(env.persistenceId, seqNr, emitted = false)
  ignore(env)
}
In the places where we're pushing a heartbeat rather than ignoring the envelope, should it also call updateState? Similarly for loadSnapshotCallback, and also the update of the latest timestamp in ignore. It would seem that we should always do the same as when ignoring/filtering, but also push the heartbeat. Could it be simpler to not repeat the logic, and move the heartbeat push into the ignore method?
private def ignore(env: EventEnvelope[Event]): Unit = {
  filterCount += 1
  updateLatestTimestamp(env)
  tryPullOrComplete()
}
Could it be simpler to remove the pushHeartbeat method, only have the ignore method to call from the other places, and move the heartbeat logic into this method?
private def ignore(env: EventEnvelope[Event]): Unit = {
  updateLatestTimestamp(env)
  if (filterCount >= heartbeatAfter) {
    filterCount = 1L
    push(out, createHeartbeat(latestTimestamp))
  } else {
    filterCount += 1
    tryPullOrComplete()
  }
}
It is.
SQL Server tests (or the underlying r2dbc implementation) are flaky. This failed in the setup phase of the test, when persisting events, so it's likely not related to this PR.
* easier than trying to adjust them when storing offsets
private val heartbeatPersistenceIds = new ConcurrentHashMap[(String, Int), String]()
private val heartbeatUuid = UUID.randomUUID().toString
// gaps are allowed for heartbeat sequence numbers, but increasing for each heartbeat pid (uuid makes it unique)
private val heartbeatSeqNr = new AtomicLong
At first I thought I could always use seqNr 1 for the heartbeats and adjust them when storing the offsets, but that is just difficult.
Offset validation for heartbeats is in akka/akka-projection#1413.
With the increasing sequence number, the offset can just be stored like any other.
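To make the mechanism concrete, here is an illustrative sketch of how the fields in the diff above could be used. The HeartbeatIds wrapper and the pid format are assumptions for illustration, not the exact implementation:

import java.util.UUID
import java.util.concurrent.ConcurrentHashMap
import java.util.concurrent.atomic.AtomicLong

final class HeartbeatIds(entityType: String) {
  private val heartbeatPersistenceIds = new ConcurrentHashMap[(String, Int), String]()
  private val heartbeatUuid = UUID.randomUUID().toString
  // gaps are allowed for heartbeat sequence numbers, but they increase for each
  // heartbeat pid (the uuid makes the pid unique per stream instance)
  private val heartbeatSeqNr = new AtomicLong

  // one synthetic persistence id per (entityType, slice); format is hypothetical
  def persistenceId(slice: Int): String =
    heartbeatPersistenceIds.computeIfAbsent(
      (entityType, slice),
      key => s"${key._1}|heartbeat-$heartbeatUuid-${key._2}")

  // strictly increasing, so the offset can be stored like any other
  def nextSeqNr(): Long = heartbeatSeqNr.incrementAndGet()
}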
Looks good. Increasing seq number on this side seems better.
Would be good to confirm the end-to-end behaviour again, with some larger tests.
See #410