-
Notifications
You must be signed in to change notification settings - Fork 90
performance(runtime)!: Copy-on-Write value optimization #1349
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Hi @Zettroke, thank you for submitting this. I added it to my list for review. Is this a breaking change for users or for developers only? |
Hi @pront, no it's a breaking change only in terms of crate API. VRL itself is unchanged. |
src/compiler/expression/array.rs
Outdated
@@ -12,12 +12,26 @@ use crate::{ | |||
|
|||
#[derive(Debug, Clone, PartialEq)] | |||
pub struct Array { | |||
resolved: Option<Value>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How hard would it be to separate out the changes to cache the resolved value of these array and object expressions? On first reading, I thought this was part of the Value CoW optimization and then realized it was doing something different.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure. I just wanted to show bigger gains ;)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This makes sense to me, if the numbers look good, except for the issue of merging the two changes together and a couple of naming nits. It's too bad it means having to .into()
everything, but that seems unavoidable.
Summary
This request introduces CoW logic into vrl
Value
. It aims to allow passingValue
between function by reference without the need for copies. It also makes usingresolve_constant
to precompute expressions viable, just makingObject
andArray
expressions use it yielded 15% more throughput.This request requires vectordotdev/vector#22693 to work in vector.
Change Type
Implementation
All of it is based on
Arc::make_mut
API. To reduce the number of code changes I decided to use customDerefMut
which calls toArc::make_mut
. While it can potentially lead to excessive copies it's still strictly better than before. But I will understand if you want a more explicit API for this.I decided to wrap it up with Arc-only
Object
andArray
variants. This avoids indirection for trivially copyable scalar values, andBytes
are already ref-counted. Also, these variants already include an indirection, so overhead is not that noticeable.Benchmarks
I used our set of 120+ VRL transforms used for event normalization.
On average this change yielded almost 2x speedup in pure vrl benchmarks.
I attached the full table with results, sadly I'm not allowed to provide sources of our VRL transforms, so I kindly ask you to benchmark on your rule-sets to see speedups in other use cases
Benchmark results
Google sheets
Is this a breaking change
How did you test this PR?
We used included unit tests + we already used this change in our vector load-testing setup.
Does this PR include user facing changes?
our guidelines.
Checklist
run
dd-rust-license-tool write
and commit the changes. More details here.