-
Notifications
You must be signed in to change notification settings - Fork 292
CP-307933: avoid serializing/deserializing database fields all the time #6462
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
6b903b0
to
d5eef83
Compare
Testing looks good, there was a failure regarding XML backward compat in |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be good to remove the commits from the already-merged PR
So we can call this from more generic code, without worrying whether locks are held. Skip adding to the stringpool when locks aren't held and we can't obtain an atomic lock either. No functional change Signed-off-by: Edwin Török <[email protected]>
This needs to be called around the same places that `ensure_utf8_xml` and `UTF8_XML.is_valid` gets called: those are the places where new strings can enter the system. Now that calling Share.merge is thread-safe also call it when writing new fields, not just when creating the row. For now the DB lock is still held when this function is called, but future refactoring might move code around. For maps and sets we only call this when new keys get added (except in create_row, where we need to process all elements) Signed-off-by: Edwin Török <[email protected]>
* schema access, similar to write_field: we can access it outside of create_row_locked, reducing the time we hold the DB lock. * debug writes * UTF8 validation and StringPool operations We cannot move field reads in read/write pairs because we need to ensure noone else has changed the DB after we've read it. (There are other alternatives if we used lock-free data structures, we could optimistically perform the operation on a snapshot, and check whether the snapshot is still the same when committing, and if not repeat the operation. But for now make just simple changes). Signed-off-by: Edwin Török <[email protected]>
Stores a Value.t and its marshaled form. A phantom type distinguishes between situations where Value.t is always present, from situations where it may be absent. For now it is always present, but may become absent when we start using this for `Db_interface.field`. Document complexity of functions (most are `O(1)`, except set/map at construction time, and unmarshal if the backwards compatible construction from string is used). In particular operations on strings are `O(1)`, because they don't look at the actual string value, just wrap or unwrap it from a `Value.String` variant. set/map requires serialization in S-expression form, but this is only done once at construction time, and we can retrieve it multiple times in `O(1)`. No functional or performance change Signed-off-by: Edwin Török <[email protected]>
Will make it easier to introduce UTF8 validation here. No functional change Signed-off-by: Edwin Török <[email protected]>
field=string for now, but will be changed to Schema.CachedValue.t later. There are separate types for input and output fields, e.g. we may want to perform UTF8 validation on input, which would modify both Value.t and its serialized form, so we can avoid serializing twice. No functional change Signed-off-by: Edwin Török <[email protected]>
This provides direct access to a Value.t when known, which avoids having to serialize and deserialize a Value.t just to read it. Not yet used, no functional change. Signed-off-by: Edwin Török <[email protected]>
No functional change, we still use the Compat module Signed-off-by: Edwin Török <[email protected]>
`ministat` confirms a speedup: ``` Db.Pool.get_all_records : N Min Max Median Avg Stddev x 432 54115.256 493532.78 56048.42 57937.384 24117.68 + 524 22642.778 333257.36 23595.495 24679.258 15206.708 Difference at 95.0% confidence -33258.1 +/- 2513.99 -57.4036% +/- 2.90374% (Student's t, pooled s = 19737.2) >>>> Db.VM.set_NVRAM : N Min Max Median Avg Stddev x 132 36794 2355369.4 1095736.6 1107222.1 298247.5 + 168 49167.417 1485278.1 678231.31 685636.89 161480.92 Difference at 95.0% confidence -421585 +/- 52835.6 -38.0759% +/- 3.54275% (Student's t, pooled s = 231767) ``` Signed-off-by: Edwin Török <[email protected]>
d5eef83
to
21c1e0d
Compare
Rebased, there are now only 10 commits. |
Signed-off-by: Edwin Török <[email protected]>
21c1e0d
to
b9082a8
Compare
@contificate could take a look |
Builds on top of #6461
Benchmark results are on individual commits.
The most significant one:
ministat
confirms a speedup: