Describe the Bug
Area: Session replay / recorder
Summary
A large share of our traffic arrives through in-app browsers (Instagram/Facebook WebViews on Android). For these visitors, session replays are almost always unusable: they show up with high action counts but 0:00 duration, and playback only shows the initial page render, not the rest of the session. Real desktop/mobile-Chrome sessions record fine.
After reading the recorder source, this looks like a consequence of how events are flushed, combined with WebViews not firing page-lifecycle events reliably.
Root cause
In src/recorder/index.js the time-based flush is implemented as a debounce that is reset on every emitted rrweb event:
const FLUSH_INTERVAL = 10000;
const scheduleFlush = () => {
  if (flushTimer) clearTimeout(flushTimer); // reset on every event
  flushTimer = setTimeout(flush, FLUSH_INTERVAL);
};
// inside record({ emit(event) { ... } }):
eventBuffer.push(event);
if (eventBuffer.length >= FLUSH_EVENT_COUNT) { // 100
  flush();
}
scheduleFlush(); // called for every single event
So the buffer is only sent when one of these happens:
- the buffer reaches FLUSH_EVENT_COUNT (100 events), or
- 10s pass with no events at all (the debounce can only settle during idle), or
- visibilitychange → hidden / beforeunload fires (keepalive flush), or
- maxDuration is reached.
For an actively interacting user, rrweb emits events continuously, so the debounce never settles and the only mid-session delivery is the coarse 100-event threshold. The final partial chunk (<100 events) depends entirely on the unload handlers in beginRecording():
document.addEventListener('visibilitychange', () => {
  if (document.visibilityState === 'hidden') flush(true);
});
window.addEventListener('beforeunload', () => flush(true));
On Android in-app WebViews, beforeunload/visibilitychange are frequently not fired when the host app tears down or backgrounds the WebView, so that final flush never happens.
The most damaging case: the first chunk contains rrweb's full snapshot (type 2). If a WebView visitor bounces before any flush lands and no unload event fires, the entire recording — snapshot included — is lost, which is why so many sessions are unplayable / 0:00.
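The debounce starvation described above can be reproduced with a small simulation on a virtual clock. This is only a sketch: `simulate` and its parameters are hypothetical names, it does not use the recorder's real code, and it assumes rrweb emits one event every 50 ms and that the WebView disappears after 4.5 s without any unload event.

```javascript
// Hypothetical simulation; not the recorder's actual code.
// Compares debounced vs. periodic flushing on a virtual millisecond clock.
function simulate({ mode, flushInterval, eventEveryMs, sessionMs, flushEventCount = 100 }) {
  let buffer = 0;    // events waiting to be sent
  let delivered = 0; // events that reached the server
  let nextFlushAt = mode === 'periodic' ? flushInterval : Infinity;

  const flush = () => {
    delivered += buffer;
    buffer = 0;
  };

  for (let now = 0; now < sessionMs; now += 1) {
    // A timer fires: either the debounce settled or a periodic tick is due.
    if (now >= nextFlushAt) {
      flush();
      nextFlushAt = mode === 'periodic' ? nextFlushAt + flushInterval : Infinity;
    }
    // An event is emitted (assumed fixed cadence).
    if (now % eventEveryMs === 0) {
      buffer += 1;
      if (buffer >= flushEventCount) flush();
      if (mode === 'debounce') nextFlushAt = now + flushInterval; // reset on every event
    }
  }

  // The WebView is torn down here and no unload event fires,
  // so whatever is still buffered is lost.
  return { delivered, lost: buffer };
}

const session = { eventEveryMs: 50, sessionMs: 4500 };
const debounced = simulate({ mode: 'debounce', flushInterval: 10000, ...session });
const periodic = simulate({ mode: 'periodic', flushInterval: 2000, ...session });

console.log(debounced); // { delivered: 0, lost: 90 }
console.log(periodic);  // { delivered: 80, lost: 10 }
```

Under these assumptions the debounced variant delivers nothing, full snapshot included, while the periodic variant loses only the events buffered since the last tick.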
Impact
Session replay is effectively non-functional for in-app WebView traffic (a major share of paid-social/mobile audiences).
Even when a recording is stored, it often contains only the load-time mutation burst (clustered timestamps → 0:00 duration), giving a misleading picture in the replay list.
Suggested fix
Replace the per-event debounce with a true periodic interval started once when recording begins, and lower the interval. This sends buffered events while the page is still alive, so capture no longer depends on unload events firing:
const FLUSH_INTERVAL = 2000; // periodic, not debounced
// remove scheduleFlush() and its per-event call in emit()
// in beginRecording(), after record({...}):
flushTimer = setInterval(() => flush(), FLUSH_INTERVAL);
// in stop():
if (flushTimer) clearInterval(flushTimer);
As a complementary improvement, I suggest adding a pagehide listener (more reliably fired than beforeunload on mobile WebKit/Chromium):
window.addEventListener('pagehide', () => flush(true));
Optionally also make the flush interval configurable via a data-flush-interval attribute (default could stay conservative) so high-traffic sites can tune request volume vs. capture fidelity.
With a periodic interval, at most the last ~2s of a session is ever lost, and the initial full snapshot reaches the server within the first interval, making WebView recordings reliably playable.
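The optional data-flush-interval attribute could be parsed along these lines. This is a sketch only: `resolveFlushInterval` and the three constants are hypothetical names, and the bounds are assumptions chosen for illustration, not values from the recorder.

```javascript
// Hypothetical helper: turn a data-flush-interval value (milliseconds) into a safe interval.
const DEFAULT_FLUSH_INTERVAL = 2000; // assumed default, matching the suggested fix
const MIN_FLUSH_INTERVAL = 500;      // assumed lower bound to avoid request storms
const MAX_FLUSH_INTERVAL = 10000;    // assumed upper bound (the current debounce window)

function resolveFlushInterval(rawValue) {
  const parsed = Number.parseInt(rawValue, 10);
  // Missing, non-numeric, or non-positive values fall back to the default.
  if (!Number.isFinite(parsed) || parsed <= 0) return DEFAULT_FLUSH_INTERVAL;
  // Clamp everything else into the allowed range.
  return Math.min(MAX_FLUSH_INTERVAL, Math.max(MIN_FLUSH_INTERVAL, parsed));
}

// In the browser this might be wired up roughly as:
//   const script = document.currentScript;
//   const interval = resolveFlushInterval(script?.getAttribute('data-flush-interval'));
//   flushTimer = setInterval(() => flush(), interval);

console.log(resolveFlushInterval('3000')); // 3000
console.log(resolveFlushInterval(null));   // 2000
```

Clamping keeps a typo in the attribute from either flooding the server or silently restoring the long window that causes the problem.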
Notes / tradeoffs
A shorter interval increases request count (e.g. up to ~150 small POSTs over a 5-minute session). Keeping it at 2–3s and/or gating behind the existing sample rate keeps this reasonable; a configurable attribute would let operators decide.
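For reference, the request-count estimate above is simple arithmetic. `estimateFlushRequests` is a hypothetical helper, and the figure is an upper bound that assumes every tick finds something in the buffer:

```javascript
// Hypothetical helper: upper bound on flush requests for one session.
function estimateFlushRequests(sessionMs, flushIntervalMs) {
  if (flushIntervalMs <= 0) throw new RangeError('flushIntervalMs must be positive');
  return Math.ceil(sessionMs / flushIntervalMs);
}

const fiveMinutes = 5 * 60 * 1000;
console.log(estimateFlushRequests(fiveMinutes, 2000));  // 150
console.log(estimateFlushRequests(fiveMinutes, 3000));  // 100
console.log(estimateFlushRequests(fiveMinutes, 10000)); // 30
```

Idle periods would lower the real number if flush() skips empty buffers, which we have not verified in the current implementation.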
We have not opened a PR because we haven't load-tested the change in our environment, but we're happy to if the team agrees with the direction.
Database
PostgreSQL
Relevant log output
Which Umami version are you using?
3.1.0 (observed in src/recorder/index.js, ref c78ff36)
How are you deploying your application?
self-hosted, PostgreSQL
Which browser are you using?
No response