feat: sampling that can be shared #1700
PR Summary
This PR introduces a unified sampling mechanism across PostHog JS, making session ID sampling consistent between before_send and session recording functionalities.
- Added new `/src/extensions/sampling.ts` with a deterministic hashing algorithm for consistent sampling decisions
- Modified session recording sampling in `/src/extensions/replay/sessionrecording.ts` to use `sampleOnProperty` instead of `Math.random()`
- Refactored sampling utilities from `before-send.ts` into a shared implementation for better code reuse
- Added tests in `/src/__tests__/utils/before-send-utils.test.ts` to verify consistent sampling behavior across different methods
- Ensures events are only captured for sessions that have recordings when both sampling methods are used
5 file(s) reviewed, 2 comment(s)
export function sampleOnProperty(prop: string, percent: number): boolean {
    return simpleHash(prop) % 100 < clampToRange(percent * 100, 0, 100)
logic: No input validation for percent parameter. Should verify percent is between 0 and 1 before using clampToRange.
Suggested change:
- export function sampleOnProperty(prop: string, percent: number): boolean {
-     return simpleHash(prop) % 100 < clampToRange(percent * 100, 0, 100)
+ export function sampleOnProperty(prop: string, percent: number): boolean {
+     if (percent < 0 || percent > 1) {
+         throw new Error('percent must be between 0 and 1');
+     }
+     return simpleHash(prop) % 100 < clampToRange(percent * 100, 0, 100)
that's what clamp to range is doing, silly robot
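For anyone following along, here is a minimal sketch of why the clamp already covers out-of-range input. `simpleHash` and `clampToRange` below are hypothetical stand-ins written for this comment; the real helpers in the PR are not shown in this thread and may differ.

```typescript
// Stand-in clamp (assumed signature: value, min, max).
function clampToRange(value: number, min: number, max: number): number {
    return Math.min(Math.max(value, min), max)
}

// Stand-in hash: a simple 32-bit string hash. The PR's simpleHash may differ.
function simpleHash(str: string): number {
    let hash = 0
    for (let i = 0; i < str.length; i++) {
        hash = (hash << 5) - hash + str.charCodeAt(i)
        hash |= 0 // keep the value in 32-bit integer range
    }
    return Math.abs(hash)
}

function sampleOnProperty(prop: string, percent: number): boolean {
    return simpleHash(prop) % 100 < clampToRange(percent * 100, 0, 100)
}

// percent above 1 is clamped to 100, so every value is sampled in.
const aboveRange = sampleOnProperty('some-session-id', 2)
// percent below 0 is clamped to 0, so nothing is sampled in.
const belowRange = sampleOnProperty('some-session-id', -1)
console.log(aboveRange, belowRange)
```

With the clamp in place, a bad `percent` degrades to "sample everything" or "sample nothing" rather than throwing inside a hot path.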
export function updateThreshold(currentValue: number | undefined, percent: number): number {
    return (isUndefined(currentValue) ? 1 : currentValue) * percent
}
style: updateThreshold could return Infinity if percent is very large and multiplied multiple times. Consider adding upper bound check.
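A small sketch of how the threshold composes across repeated sampling steps, assuming callers pass `percent` values between 0 and 1. `isUndefined` here is a stand-in for the project's helper, not its actual implementation.

```typescript
// Stand-in for the project's isUndefined helper (assumption).
function isUndefined(value: unknown): value is undefined {
    return value === undefined
}

function updateThreshold(currentValue: number | undefined, percent: number): number {
    return (isUndefined(currentValue) ? 1 : currentValue) * percent
}

// First sampling step: no threshold yet, so it starts from 1.
const first = updateThreshold(undefined, 0.5)
// Second step multiplies the running threshold by the new rate.
const second = updateThreshold(first, 0.5)
console.log(first, second) // 0.5 0.25
```

When every `percent` is at most 1 the running product can only shrink, so unbounded growth would need callers to pass values well above 1.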
e9b15bd to 3a17320
Size Change: +1.22 kB (+0.04%) Total Size: 3.28 MB
lgtm
the suggested implementation of session id sampling for `before_send` is different to that used in session recording

the one in `before_send` is nicer as it's deterministic for each session id

by switching to the `before_send` implementation someone can set a sample rate for session recording, and sample events by session id in `before_send` and only get events for sessions that have recordings 🦾

(suggested by @camerondeleone in slack somewhere)
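a rough sketch of why the shared, deterministic approach lines the two features up, assuming both hash the same session id with the same function and the event rate is no higher than the recording rate. `simpleHash` and `clampToRange` are hypothetical stand-ins here, and the session ids are made up for illustration.

```typescript
// Stand-in helpers (assumptions; the PR's real implementations may differ).
function clampToRange(value: number, min: number, max: number): number {
    return Math.min(Math.max(value, min), max)
}

function simpleHash(str: string): number {
    let hash = 0
    for (let i = 0; i < str.length; i++) {
        hash = (hash << 5) - hash + str.charCodeAt(i)
        hash |= 0 // keep the value in 32-bit integer range
    }
    return Math.abs(hash)
}

function sampleOnProperty(prop: string, percent: number): boolean {
    return simpleHash(prop) % 100 < clampToRange(percent * 100, 0, 100)
}

const recordingRate = 0.5 // illustrative session recording sample rate
const eventRate = 0.1 // illustrative event sample rate, keyed on session id

// Made-up session ids for the demonstration.
const sessionIds = Array.from({ length: 1000 }, (_, i) => `session-${i}`)

const recorded = new Set(sessionIds.filter((id) => sampleOnProperty(id, recordingRate)))
const withEvents = sessionIds.filter((id) => sampleOnProperty(id, eventRate))

// Sessions that kept events but have no recording.
const orphaned = withEvents.filter((id) => !recorded.has(id))
console.log(orphaned.length) // 0
```

a session's hash bucket is fixed, so any session under the lower threshold is also under the higher one. with `Math.random()` on the recording side the two decisions would be independent and the sets would not line up.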