feat: implement cache rollup task using TA timeseries models #1135
✅ Sentry found no issues in your recent changes ✅
Codecov Report

All modified and coverable lines are covered by tests ✅
✅ All tests successful. No failed tests found.

Coverage Diff

| Metric | main | #1135 | +/- |
|----------|--------|--------|-----|
| Coverage | 97.72% | 97.73% | |
| Files | 449 | 451 | +2 |
| Lines | 36866 | 36947 | +81 |
| Hits | 36028 | 36110 | +82 |
| Misses | 838 | 837 | -1 |
CacheTestRollupsTask().run_impl(
    _db_session=None,
    repoid=1,
    branch="main",
you forgot the test assertions for this case :-)
serialized_table: BytesIO

if branch:
    if branch in {"main", "master", "develop"}:
The `Repository` has a dedicated field for the main branch, we should probably use that instead of hardcoding a set here.
If you still want to hardcode the list, better to define it as a top-level const so it's more discoverable.
> The Repository has a dedicated field for the main branch, we should probably use that instead of hardcoding a set here.

it would make sense to do this, except we want to use the continuous aggregates, and to do that we would need to access `repo.branch` from timescale, which isn't possible right now. I have ideas on how to do this in the future, but for now I think this will be good enough for most of our users.
that does make sense, yes.
maybe a boolean `is_main_branch` or something that you feed in when processing, at which time you have access to the repo metadata.
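A minimal sketch of that suggestion, for illustration only: the flag is computed once at processing time, when the repo metadata is available, and stored with the row so the timeseries side never needs to look up the repository. `TaggedTestrun`, `tag_testrun`, and the `repo_default_branch` parameter are hypothetical names, not part of this PR.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TaggedTestrun:
    """Hypothetical row shape: a testrun annotated at processing time."""

    repo_id: int
    branch: str
    is_main_branch: bool


def tag_testrun(repo_id: int, branch: str, repo_default_branch: str) -> TaggedTestrun:
    # Assumption: the caller has already read the repo's configured main
    # branch from the repository metadata and passes it in here.
    return TaggedTestrun(
        repo_id=repo_id,
        branch=branch,
        is_main_branch=(branch == repo_default_branch),
    )
```

With a flag like this, a branch-scoped aggregate could filter on `is_main_branch` instead of matching a fixed set of branch names.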
we want to make it so the cache rollup task is capable of reading test analytics information from the timeseries db

this also changes the format of the dataframe being cached, so we'll also change the format of the path at which we will store the cached dataframe

the logic for reading from the timeseries db is:
- if no branch is specified -> read from the repo wide continuous aggs
- if a branch is specified
  - if it's one of the more popular main branch names -> read from the branch scoped continuous aggregates
  - else, directly aggregate from the individual testruns
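The branch logic described above can be sketched as a small dispatch function. This is an illustration under stated assumptions, not the code from this PR: `POPULAR_MAIN_BRANCHES`, `RollupSource`, and `choose_rollup_source` are hypothetical names. Only the three branch names and the three read paths come from the discussion.

```python
from enum import Enum
from typing import Optional

# Hypothetical top-level constant holding the branch names hardcoded in the diff.
POPULAR_MAIN_BRANCHES = frozenset({"main", "master", "develop"})


class RollupSource(Enum):
    """Hypothetical labels for the three read paths."""

    REPO_AGGREGATE = "repo_wide_continuous_aggregate"
    BRANCH_AGGREGATE = "branch_scoped_continuous_aggregate"
    RAW_TESTRUNS = "individual_testruns"


def choose_rollup_source(branch: Optional[str]) -> RollupSource:
    # No branch (None or empty string): use the repo-wide continuous aggregates.
    if not branch:
        return RollupSource.REPO_AGGREGATE
    # Popular main branch names: use the branch-scoped continuous aggregates.
    if branch in POPULAR_MAIN_BRANCHES:
        return RollupSource.BRANCH_AGGREGATE
    # Any other branch: aggregate directly from the individual testruns.
    return RollupSource.RAW_TESTRUNS
```

For example, `choose_rollup_source("feature/x")` selects the raw testruns path, while `choose_rollup_source(None)` selects the repo-wide aggregates.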
86cfa1f to 36e33aa