Description
Hi!
I am doing ongoing PGO (Profile-Guided Optimization) research on different applications - all results are available at https://github.com/zamazan4ik/awesome-pgo . I ran some PGO benchmarks on the `ada-url` library and want to share my results here.
Test environment
- Fedora 39
- Linux kernel 6.7.6
- AMD Ryzen 9 5900X
- 48 GiB RAM
- SSD Samsung 980 Pro 2 TiB
- Compilers - Rustc 1.76 and Clang 17.0.6
- Ada-url version: Rust bindings on the `main` branch at commit `be5c26098f3cec1679dbb453d22542c9a7b10902`
- Turbo Boost disabled to improve consistency across runs
Benchmark
The release benchmark is run with `taskset -c 0 cargo bench`, the PGO training phase with `CXX=clang++ CXXFLAGS="-fprofile-generate=pgo_profiles_clang" cargo pgo bench`, and the PGO-optimized benchmark with `export CXX=clang++ && export CXXFLAGS=ada.profdata && taskset -c 0 cargo pgo optimize bench`. `taskset -c 0` is used for better benchmark consistency. I prepared the PGO profiles for Clang with the `llvm-profdata` tool (more details can be found in the Clang documentation).
In all tests, the Clang compiler is used. All PGO-related routines are done with cargo-pgo. All benchmarks are done on the same machine, with the same hardware/software during runs, with the same background "noise" (as much as I can guarantee, of course).
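For anyone who wants to repeat the baseline, training and profile-merge steps, here is a rough sketch of how they could be scripted. It is not the exact script I used: the `run` helper and the `RUN` variable are hypothetical and only there to allow a dry run, and it assumes the raw Clang profiles end up in the `pgo_profiles_clang` directory relative to where the script is started (the real location may differ depending on where the build and benchmark processes run). The output name `ada.profdata` is taken from the commands above.

```shell
#!/bin/sh
# Dry-run sketch of the baseline, training and merge phases.
# Commands are only printed unless RUN=1 is set in the environment.
set -eu

PROFILE_DIR="pgo_profiles_clang"  # directory from the -fprofile-generate flag
PROFDATA="ada.profdata"           # merged profile name used in this report

# Hypothetical helper: print the command, execute it only when RUN=1.
run() {
    echo "+ $*"
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    fi
}

# 1. Baseline: plain release benchmark pinned to core 0.
release_bench() {
    run taskset -c 0 cargo bench
}

# 2. Training: instrument the C++ part via CXXFLAGS,
#    cargo-pgo instruments the Rust part.
pgo_train() {
    run env CXX=clang++ CXXFLAGS="-fprofile-generate=${PROFILE_DIR}" cargo pgo bench
}

# 3. Merge the raw Clang profiles into a single .profdata file
#    (assumes the raw profiles were written into PROFILE_DIR).
merge_profiles() {
    run llvm-profdata merge -o "${PROFDATA}" "${PROFILE_DIR}"
}

release_bench
pgo_train
merge_profiles
```

Running it as is only prints the three commands; `RUN=1 sh script.sh` would actually execute them, which requires `cargo-pgo`, Clang and `llvm-profdata` to be installed.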
Results
Here are the results:
- Release: https://gist.github.com/zamazan4ik/445260656ae67dafce7c2c4e06f83154
- PGO-optimized: https://gist.github.com/zamazan4ik/5a82715d041e849d73dc9dfb1f814544
- (just for reference) PGO instrumented: https://gist.github.com/zamazan4ik/f5959d668063103f5a957af6b4ac206c
For anyone wondering whether the improvement comes from PGO-optimizing the Rust part or the C++ part, I also ran the PGO test for the C++ part only. This is done by passing the `-fprofile-use` flag via `CXXFLAGS` but running the benchmarks with `cargo bench` (so no PGO optimization for the Rust part). The results: https://gist.github.com/zamazan4ik/975036b1cd4ede6e4e6eeab2146934e1 . The benchmark confirms that C++ performance is improved with PGO.
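A minimal sketch of what the C++-only run could look like is below. The profile file name `ada.profdata`, the `taskset -c 0` pinning and the `run` dry-run helper are assumptions for illustration, and the profile path may need to be absolute depending on where the C++ compiler is invoked during the build.

```shell
#!/bin/sh
# Dry-run sketch of the C++-only PGO benchmark.
# The command is only printed unless RUN=1 is set in the environment.
set -eu

PROFDATA="ada.profdata"  # assumed name of the merged Clang profile

# Hypothetical helper: print the command, execute it only when RUN=1.
run() {
    echo "+ $*"
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    fi
}

# The C++ part is compiled with the profile via CXXFLAGS; the Rust part
# is built by plain cargo bench, so it gets no PGO optimization.
cpp_only_pgo_bench() {
    run env CXX=clang++ CXXFLAGS="-fprofile-use=${PROFDATA}" taskset -c 0 cargo bench
}

cpp_only_pgo_bench
```

Since `cargo pgo` is not involved here, any difference from the release numbers should come from the C++ side only.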
At least in the benchmarks provided by the project, there are measurable improvements. I am not sure whether I should also create the PGO performance report in the main `ada-url` repo or not - it's up to the maintainers :)
Please do not treat this issue as a bug - it's just a performance report. If the maintainers agree that building the library with PGO can be valuable for users, mentioning PGO builds somewhere in the README might be worth considering.