ci: Add codspeed for performance monitoring #2516
CodSpeed Performance Report
Congrats! CodSpeed is installed 🎉
You will start to see performance impacts in the reports once the benchmarks are run from your default branch.
Ok, so:
> Warning: This benchmark contains 32 system calls, totalling 39.1 s of execution time. Since they cannot be consistently instrumented, those calls are not included in the measure. Please switch to the Walltime instrument to accurately measure system calls. Learn more about measurement and system calls.

which to me indicates that the numbers in the report here are not tracking what we would like to see. Additionally, we don't get the split by backend, which is also something I would like to see if we integrate performance tooling.
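To make the "walltime, split by backend" idea concrete, here is a minimal stdlib-only sketch. It is not CodSpeed's Walltime instrument and not code from this PR; the `time_per_backend` helper and the backend names in the demo are hypothetical. It only illustrates that wall-clock timing includes time spent in system calls and that results can be keyed per backend.

```python
import time
from statistics import median
from typing import Callable, Dict, List


def time_per_backend(
    queries: Dict[str, Callable[[], object]], repeats: int = 3
) -> Dict[str, float]:
    """Return the median wall-clock seconds per backend.

    `queries` maps a backend name to a zero-argument callable that runs
    the query on that backend (hypothetical interface for illustration).
    """
    results: Dict[str, float] = {}
    for backend, run_query in queries.items():
        samples: List[float] = []
        for _ in range(repeats):
            start = time.perf_counter()  # wall clock, so system calls are included
            run_query()
            samples.append(time.perf_counter() - start)
        results[backend] = median(samples)
    return results


if __name__ == "__main__":
    # Stand-in workloads; real usage would call the same query once per backend.
    demo = {
        "backend_a": lambda: sum(range(10_000)),
        "backend_b": lambda: sorted(range(10_000, 0, -1)),
    }
    for name, seconds in time_per_backend(demo).items():
        print(f"{name}: {seconds:.6f} s")
```

Taking the median of a few repeats is just one way to damp noise; a real setup would likely need warm-up runs and a quieter machine.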
@@ -8,10 +8,11 @@
import pyarrow.csv as pc
import pyarrow.parquet as pq

if not Path("data").exists():
    Path("data").mkdir()

SCALE_FACTOR = 0.1
In #972 we were using TPCH with a 0.25 ratio and it was taking ~40 mins to run, IIRC. That's a bit much for what I would consider fast iteration; maybe a ratio of 0.1 is more reasonable to start with.
IIRC the docs for the duckdb TPCH tests used `0.01`, so we can go lower.
I found the bit in the docs that used `0.01` (https://duckdb.org/docs/0.10/extensions/tpch#listing-expected-answers):

> To produce the expected results for all queries on scale factors 0.01, 0.1, and 1, run
If we can run these with 10x less data, surely we should, right?
The current run has been going for almost 2 hours (https://github.com/narwhals-dev/narwhals/actions/runs/15098359607/job/42436026213?pr=2516)
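One possible way to try different scale factors without editing the script each time is to read the value from the environment. This is only a sketch: the `TPCH_SCALE_FACTOR` variable and the `read_scale_factor` helper are hypothetical and not part of this PR, and the default of 0.1 simply mirrors the `SCALE_FACTOR = 0.1` line in the diff.

```python
import os
from typing import Mapping, Optional

DEFAULT_SCALE_FACTOR = 0.1  # mirrors SCALE_FACTOR in the diff above
ENV_VAR = "TPCH_SCALE_FACTOR"  # hypothetical variable name, not defined by the PR


def read_scale_factor(env: Optional[Mapping[str, str]] = None) -> float:
    """Return the TPC-H scale factor, falling back to the default."""
    env = os.environ if env is None else env
    raw = env.get(ENV_VAR)
    if raw is None:
        return DEFAULT_SCALE_FACTOR
    value = float(raw)  # raises ValueError on non-numeric input
    if value <= 0:
        raise ValueError(f"{ENV_VAR} must be positive, got {raw!r}")
    return value


SCALE_FACTOR = read_scale_factor()
```

With something like this, CI could run at 0.01 for fast feedback while a local run keeps a larger value, e.g. `TPCH_SCALE_FACTOR=0.01 python <script>`.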
> The current run has been going for almost 2 hours

Yes, I have been monitoring it. It's a bit odd, isn't it? I am not fully sure what's going on 🤔