@vmx
Contributor

@vmx vmx commented Nov 17, 2025

Copy the bytes into a buffer instead of destructuring them. On my machine that's a performance improvement of about 15%.

A benchmark was added, which can be run via

cargo bench --bench f64_enc --features use_alloc

I've only optimized the f64 code path, as that is the one I noticed when switching a project over to cbor4ii. If other encodings should be changed or benchmarked as well, let me know.
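For readers following along, here is a minimal sketch of the two styles being compared. The function names and the `F64_MARKER` constant are hypothetical and not taken from cbor4ii; the actual diff may differ in detail. Both functions produce the same nine bytes: the CBOR initial byte for a double-precision float (0xfb) followed by the value in big-endian order.

```rust
// Hypothetical names for illustration only; not the cbor4ii source.
const F64_MARKER: u8 = 0xfb; // CBOR initial byte for a double-precision float

/// Destructuring style: bind each byte to a name, then rebuild the array.
fn encode_f64_destructure(value: f64) -> [u8; 9] {
    let [b0, b1, b2, b3, b4, b5, b6, b7] = value.to_be_bytes();
    [F64_MARKER, b0, b1, b2, b3, b4, b5, b6, b7]
}

/// Buffer-copy style: write the marker, then copy the payload in one call.
fn encode_f64_copy(value: f64) -> [u8; 9] {
    let mut buf = [0u8; 9];
    buf[0] = F64_MARKER;
    buf[1..].copy_from_slice(&value.to_be_bytes());
    buf
}

fn main() {
    let value = 1.5_f64;
    // Both styles must agree byte for byte.
    assert_eq!(encode_f64_destructure(value), encode_f64_copy(value));
    println!("{:02x?}", encode_f64_copy(value));
}
```

Whether the second form is actually faster depends on the compiler version and target, so it is worth checking the generated assembly or benchmarking before relying on it.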

@vmx vmx force-pushed the improve-f64-enccoding branch from 7ac5a55 to e4fb0a0 Compare November 17, 2025 01:28
@codecov

codecov bot commented Nov 17, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 86.14%. Comparing base (30fe080) to head (f3ece39).
⚠️ Report is 1 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff             @@
##           master      #51      +/-   ##
==========================================
+ Coverage   85.82%   86.14%   +0.31%     
==========================================
  Files          11       11              
  Lines        1912     1920       +8     
==========================================
+ Hits         1641     1654      +13     
+ Misses        271      266       -5     

@vmx vmx force-pushed the improve-f64-enccoding branch from e4fb0a0 to cacd80b Compare November 17, 2025 01:40
@quininer
Owner

quininer commented Nov 17, 2025

Interesting, I thought LLVM could optimize it. Would you mind improving u64 as well?

https://godbolt.org/z/E5Pdx7jjT

@vmx vmx force-pushed the improve-f64-enccoding branch from cacd80b to 5912391 Compare November 17, 2025 11:25
@vmx
Contributor Author

vmx commented Nov 17, 2025

I went all in and changed it for all numeric types, as even for 16-bit types there are fewer assembly instructions (though the difference is probably not measurable): https://godbolt.org/z/ov97Pq43j

I did it for all as I'm a fan of code symmetry. If I should only do it for the 64-bit types, I'm happy to change it again.

I've also removed the benchmark, as it was for illustration purposes only and I don't think it makes sense to keep it around (less bit rot).

Copy the bytes into a buffer instead of destructuring them. On my machine
that's a performance improvement of about 15% for f64.

For all types, even the 16-bit ones, the resulting assembly has fewer
instructions: https://godbolt.org/z/ov97Pq43j
@vmx vmx force-pushed the improve-f64-enccoding branch from 5912391 to f3ece39 Compare November 17, 2025 11:41
@quininer quininer merged commit e32646c into quininer:master Nov 17, 2025
4 checks passed
@quininer
Owner

Thank you!

@vmx vmx deleted the improve-f64-enccoding branch November 17, 2025 14:26
@vmx
Contributor Author

vmx commented Nov 17, 2025

Thanks for merging.

I'm still looking into another encoding bottleneck. It could well be that it's not a cbor4ii thing, but related to other parts of the library. I just wanted to mention it, as you might want to hold off on a new release until all the changes I'm currently working on are in.

@vmx
Contributor Author

vmx commented Nov 17, 2025

The other bottleneck turned out to be something else, so don't expect any perf-related PRs for now.

For anyone reading this who is interested in what the other performance issue was: I was using the BufWriter from cbor4ii, which is slower than the one from the Rust standard library.
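As a rough illustration of the standard library side of that comparison, the sketch below pushes encoded f64 values through `std::io::BufWriter`. It uses only the standard library; `write_f64` and `encode_all` are hypothetical helpers, and how the standard `BufWriter` gets wired into cbor4ii is not shown in this thread.

```rust
use std::io::{BufWriter, Write};

// Hypothetical helper, not part of cbor4ii: writes one CBOR-encoded f64
// (initial byte 0xfb followed by the big-endian payload).
fn write_f64<W: Write>(writer: &mut W, value: f64) -> std::io::Result<()> {
    let mut buf = [0u8; 9];
    buf[0] = 0xfb;
    buf[1..].copy_from_slice(&value.to_be_bytes());
    writer.write_all(&buf)
}

// Hypothetical helper: encodes a slice of values through a buffered writer.
fn encode_all(values: &[f64]) -> std::io::Result<Vec<u8>> {
    // A Vec<u8> stands in for a file or socket here.
    let mut writer = BufWriter::new(Vec::new());
    for &v in values {
        write_f64(&mut writer, v)?;
    }
    // into_inner flushes the buffer and hands back the underlying writer.
    writer.into_inner().map_err(|e| e.into_error())
}

fn main() -> std::io::Result<()> {
    let bytes = encode_all(&[1.5, -2.0])?;
    assert_eq!(bytes.len(), 18); // two values, nine bytes each
    Ok(())
}
```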
