-
Notifications
You must be signed in to change notification settings - Fork 9
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implement initial regression detection #383
base: main
Are you sure you want to change the base?
Conversation
So that we don't repeat the pattern of: - do it wrong - test it and see it's wrong - look up the docs to wonder why it's wrong - fix it
Nice! Code changes look relatively compact too 👌
Neat! I don't fully understand how the streaks are computed though, is it some average +/- some delta percentage? Small details wrt the slack notification: When showing regressions, it would be nice to use the same capitalization formatting as the stat name eg |
# Iterate looking for contiguous streaks of values that are within the tolerance. | ||
(1...vals.size).each do |i| | ||
prev, curr = vals.values_at(i-1, i) | ||
delta = curr - prev | ||
|
||
# If this iteration is within the defined tolerance | ||
# from the last iteration track it as a streak. | ||
if delta.abs <= TOLERANCE | ||
# Keep track of the highest value that we've seen more than once in a row. | ||
max = [prev, curr].max | ||
high_streak = max if !high_streak || high_streak < max | ||
end | ||
|
||
# If we have seen any streaks (meaning there has been some consistency)... | ||
if high_streak | ||
delta = curr - high_streak | ||
|
||
# If this iteration was lower than the highest streak and | ||
# outside the tolerance range record it as a regression. | ||
regression = if delta < -TOLERANCE | ||
diff_pct = 0 - delta / high_streak * 100 | ||
sprintf "dropped %.*f%% from %.*f to %.*f", ROUND, diff_pct, ROUND, high_streak, ROUND, curr | ||
end | ||
end | ||
end |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the streak detection with tolerance
This shows it would have caught the regression last Sep (#366):
and in late November the number goes back up. |
COUNT = 30 | ||
NAME = :ratio_in_yjit | ||
ROUND = 2 | ||
TOLERANCE = 0.1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you comment on what is the unit of the tolerance?
What is count?
What is round?
|
||
# Check the list of values for one benchmark. | ||
# Returns either nil or string description of regression. | ||
def check_one(vals) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is it called check_one
?
prev, curr = vals.values_at(i-1, i) | ||
delta = curr - prev |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But is this just comparing each value to the previous one?
What if you have a sequence of gradually decreasing values? 98.1, 98.0, 97.9, 97.8?
This sets up regression detection for the
ratio_in_yjit
metric.It looks for streaks to identify stable values.
It notifies for any given benchmark if in the last 30 days it's lower than a previous streak.
I included the data that goes into the analysis to help us fine tune the algorithm and to make it easier to decide if a notification is worth investigating.
This is what the slack notification looks like with recent data:
This is what the latest data from 2025-01-27 would report:
and here's one that doesn't currently register a regression (because the last value goes back up):
refs #366