Skip to content

(WIP) Add benchmarks comparing performance of parse functions #50

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Draft
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

duckinator
Copy link

Additional changes that were needed for benchmarking purposes, but may not be wanted:

  • adding pg_query_raw_parse to the bindings
  • exposing pg_query::bindings

@duckinator
Copy link
Author

duckinator commented Feb 14, 2025

Summary

I couldn't find a way to expose pg_query_raw_parse(), so I focused on the rest of the task.

For this query, pg_query_parse_protobuf() seems to take 5x as long as pg_query_parse().

Details

As I discussed with @seanlinsley on Slack, I couldn't find a way to expose pg_query_raw_parse(), so I set that aside since this work is time-boxed.

You can run cargo bench to get benchmark output. Unfortunately, in order to accommodate per-benchmark flamegraph generation, the cargo bench output is a bit messier.

cargo bench output:

     Running benches/parse.rs (target/release/deps/parse-e7cc8bbbe183b46f)
Starting: Running benchmark(s). Stand by!

•

Method                 Mean        Samples
------------------------------------------
pg_query_parse    922.41 μs    2,463/2,500

     Running benches/parse_protobuf.rs (target/release/deps/parse_protobuf-93fcb930536d0376)
Starting: Running benchmark(s). Stand by!

•

Method                        Mean        Samples    Change
-----------------------------------------------------------
pg_query_parse_protobuf    5.03 ms    1,934/1,960    -0.71%


Flamegraphs

pg_query_parse():

flamegraph-parse

pg_query_parse_protobuf():

flamegraph-parse_protobuf

@duckinator
Copy link
Author

The initial flamegraph for pg_query_parse_protobuf() was wrong. Regenerating it fixed the problem, and I uploaded a new one + updated my summary accordingly.

brunch::benches!(
Bench::new("pg_query_parse_protobuf")
.run_seeded_with(c_seed, |query| {
unsafe { pg_query_parse_protobuf(query.as_ptr() as *const c_char) }
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might be interesting to include the deserialization in Rust as well, so we can get a sense for where to optimize (if we were to optimize the serialization itself).

For the protobuf parse benchmark we can use the same mechanism the crate currently uses. For JSON we could (just for testing) use the mechanism that pg_parse uses (which shares a common history with this crate, but we since diverged to focus on the Protobuf format).

@vrmiguel
Copy link

Probably out of scope of this PR but a cool addition would be comparisons to datafusion-sqlparser-rs

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants