Implement initial regression detection #383
Nice! Code changes look relatively compact too 👌
Neat! I don't fully understand how the streaks are computed though: is it some average +/- some delta percentage? Small detail wrt the Slack notification: when showing regressions, it would be nice to use the same capitalization formatting as the stat name, eg
This shows it would have caught the regression last Sep (#366):
and in late November the number goes back up.
So that we don't repeat the pattern of:
- do it wrong
- test it and see it's wrong
- look up the docs to wonder why it's wrong
- fix it
dd429a6 to 73557a1
Get rid of the streak thing.
I will let you decide when this should be merged and how it should be tested 👍
We now have data that is interesting besides the streaks.
This allows running the analysis for a finer grained result set than just "per month" (having to specify the month's subfolder).
With stddev * 0.01 this command

    bin/analysis --before=2025-01-08 --benchmarks=railsbench,lobsters build/raw-benchmark-data.prod/raw_benchmark_data/x86_64

finds

    regression: 99.82 is 0.00% below mean 99.82

which looks silly. With stddev * 0.02 that data set does not trigger. This value does still register some from last September:

    bin/analysis --before=2024-09-17 --benchmarks=railsbench,lobsters build/raw-benchmark-data.prod/raw_benchmark_data/x86_64
    ratio_in_yjit x86_64_yjit_stats
    lobsters regression: 98.91 is 0.21% below mean 99.12
    railsbench regression: 99.48 is 0.28% below mean 99.76
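For readers following along, here is a rough Python sketch of the kind of check the stddev multiplier tunes. It is not the code in this PR. It assumes the rule is "flag the latest value when it falls below the mean of earlier values by more than stddev * multiplier". The function name `check_regression`, the use of population standard deviation, and the sample history values are all hypothetical, picked only so the mean comes out at 99.12.

```python
from statistics import mean, pstdev


def check_regression(history, latest, multiplier=0.02):
    """Return a report line if `latest` drops below the mean of `history`
    by more than stddev * multiplier, otherwise None.

    Assumption: this threshold rule is a guess at the approach discussed
    above, not the project's actual implementation.
    """
    avg = mean(history)
    threshold = pstdev(history) * multiplier
    drop = avg - latest
    if drop > threshold:
        pct = drop / avg * 100
        return f"regression: {latest:.2f} is {pct:.2f}% below mean {avg:.2f}"
    return None


# Hypothetical history values (not real benchmark data).
history = [99.10, 99.12, 99.14]
print(check_regression(history, 98.91))  # regression: 98.91 is 0.21% below mean 99.12
print(check_regression(history, 99.12))  # None
```

Under this assumption, a very small multiplier makes the threshold so tight that a drop which rounds to 0.00% can still be reported, which would explain the "silly" result above. A larger multiplier filters those out.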
This sets up regression detection for the ratio_in_yjit metric using the idea of
I included the data that goes into the analysis to help us fine-tune the algorithm and to make it easier to decide if a notification is worth investigating.
This is more or less what the Slack notification would look like:

refs #366