Skip to content

ci: Add codspeed for performance monitoring #2516

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Draft
wants to merge 13 commits into
base: main
Choose a base branch
from

Conversation

FBruzzesi
Copy link
Member

What type of PR is this? (check all applicable)

  • πŸ’Ύ Refactor
  • ✨ Feature
  • πŸ› Bug Fix
  • πŸ”§ Optimization
  • πŸ“ Documentation
  • βœ… Test
  • 🐳 Other

Related issues

Checklist

  • Code follows style guide (ruff)
  • Tests added
  • Documented the changes

If you have comments or can explain your changes, please do so below

Copy link

codspeed-hq bot commented May 8, 2025

CodSpeed Performance Report

Congrats! CodSpeed is installed πŸŽ‰

πŸ†• 22 new benchmarks were detected.

You will start to see performance impacts in the reports once the benchmarks are run from your default branch.

Detected benchmarks

  • test_benchmark_scripts[query_path0] (323.5 Β΅s)
  • test_benchmark_scripts[query_path10] (322.9 Β΅s)
  • test_benchmark_scripts[query_path11] (325 Β΅s)
  • test_benchmark_scripts[query_path12] (327.3 Β΅s)
  • test_benchmark_scripts[query_path13] (319.7 Β΅s)
  • test_benchmark_scripts[query_path14] (320.7 Β΅s)
  • test_benchmark_scripts[query_path15] (321 Β΅s)
  • test_benchmark_scripts[query_path16] (323 Β΅s)
  • test_benchmark_scripts[query_path17] (319.8 Β΅s)
  • test_benchmark_scripts[query_path18] (320.7 Β΅s)
  • test_benchmark_scripts[query_path19] (319.5 Β΅s)
  • test_benchmark_scripts[query_path1] (320.4 Β΅s)
  • test_benchmark_scripts[query_path20] (320.6 Β΅s)
  • test_benchmark_scripts[query_path21] (375.8 Β΅s)
  • test_benchmark_scripts[query_path2] (319.8 Β΅s)
  • test_benchmark_scripts[query_path3] (322.5 Β΅s)
  • test_benchmark_scripts[query_path4] (321.9 Β΅s)
  • test_benchmark_scripts[query_path5] (320.2 Β΅s)
  • test_benchmark_scripts[query_path6] (319 Β΅s)
  • test_benchmark_scripts[query_path7] (319 Β΅s)
  • test_benchmark_scripts[query_path8] (325.1 Β΅s)
  • test_benchmark_scripts[query_path9] (319.2 Β΅s)

@FBruzzesi
Copy link
Member Author

Ok, so:

  • even with 10% of data, it takes 30 mins to run
  • in the codspeed website I can see that these benchmarks come with a warning:

Warning

This benchmark contains 32 system calls, totalling 39.1 s of execution time. Since they cannot be consistently instrumented, those calls are not included in the measure. Please switch to the Walltime instrument to accurately measure system calls. Learn more about measurement and system calls.

which to me indicates that the numbers in the report here are not tracking what we would like to see.

Additionally, we don't get the split by backend, which is also something I would like to see if we integrate performance tooling.
To do that we would need to have the benchmark on a lower level such as in execute_query

@@ -8,10 +8,11 @@
import pyarrow.csv as pc
import pyarrow.parquet as pq

if not Path("data").exists():
Path("data").mkdir()

SCALE_FACTOR = 0.1
Copy link
Member

@dangotbanned dangotbanned May 18, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@FBruzzesi (#805 (comment))

In #972 we were using TPCH with 0.25 ratio and it was taking ~40mins to run IIRC. That's a bit much for what I would consider fast iteration - maybe a ratio of 0.1 is more reasonable to start with

IIRC the docs for the duckdb TPCH tests used 0.01 - so we can go lower

I found the bit in the docs that used 0.01 (https://duckdb.org/docs/0.10/extensions/tpch#listing-expected-answers)

To produced the expected results for all queries on scale factors 0.01, 0.1, and 1, run

If we can run these with 10x less data, surely we should right?

The current run has been going for almost 2 hours πŸ˜…
(https://github.com/narwhals-dev/narwhals/actions/runs/15098359607/job/42436026213?pr=2516)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The current run has been going for almost 2 hours πŸ˜…

Yes I have been monitoring it - it's a bit odd, isn't it? I am not fully sure what's going on πŸ€”

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants