Run containers attempt 3 #320
Implements and tests `insert` and `insert_range` methods on runs.
This fixes some failing tests and adds some `#[allow(todo)]` and `#[allow(unused)]` attributes.
I have implemented this based on CRoaring's implementation in eff381a.
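For readers unfamiliar with run containers, here is a minimal sketch of what an `insert` on a run store has to do. It is not the code from this PR or from CRoaring: the `Interval` type and the free function `insert` are hypothetical names chosen for illustration, and the sketch assumes runs are kept sorted, non-overlapping and non-adjacent.

```rust
/// Hypothetical inclusive interval of consecutive u16 values (illustration only).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct Interval {
    start: u16,
    end: u16,
}

/// Inserts `value` into `runs`, which is assumed to be sorted, non-overlapping
/// and non-adjacent. Returns `true` if the value was not present before.
fn insert(runs: &mut Vec<Interval>, value: u16) -> bool {
    // Index of the first run whose start is strictly greater than `value`.
    let idx = runs.partition_point(|r| r.start <= value);

    // The run just before `idx` (if any) starts at or before `value`.
    if idx > 0 && value <= runs[idx - 1].end {
        return false; // already covered by an existing run
    }

    // No overflow below: the previous run ends before `value`,
    // and the next run starts after `value`.
    let extends_prev = idx > 0 && runs[idx - 1].end + 1 == value;
    let extends_next = idx < runs.len() && value + 1 == runs[idx].start;

    match (extends_prev, extends_next) {
        // The value closes the gap between two runs: merge them.
        (true, true) => {
            runs[idx - 1].end = runs[idx].end;
            runs.remove(idx);
        }
        // The value extends the previous run by one.
        (true, false) => runs[idx - 1].end = value,
        // The value extends the next run downwards by one.
        (false, true) => runs[idx].start = value,
        // Isolated value: create a new single-element run.
        (false, false) => runs.insert(idx, Interval { start: value, end: value }),
    }
    true
}
```

Inserting 1, then 3, then 2 into an empty store first creates two single-element runs and then merges them into one run covering 1 through 3. An `insert_range` can follow the same pattern, merging every run the new range touches.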
I think the important factor is that:
There are cases where a container can be represented equally efficiently as either a run, or an {array/bitmap}. Both implementations (correctly, imo) default to keeping the existing container type when converting to/from a run container is not strictly more efficient. Therefore, in these cases, the result depends on the container type we started with. E.g. the Roaring Bitmap containing [0, 1, 2] could be represented in two ways (as an array container or as a run container).
So e.g. both implementations have to match on the result type of container for all operations for all container types, e.g.
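To make the tie described above concrete, here is a small sketch of the size comparison. The byte counts follow the Roaring serialization format (2 bytes per value for an array, 8192 bytes for a bitmap, 2 bytes plus 4 bytes per run for a run container). The names `Kind`, `count_runs` and `optimize_kind` are hypothetical and the tie-breaking rule is the one described in this thread, not a copy of either implementation.

```rust
/// Hypothetical tag for the representation a container currently uses.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Kind {
    Array,
    Bitmap,
    Run,
}

const BITMAP_BYTES: usize = 8192; // 65536 bits
const ARRAY_MAX_CARD: usize = 4096; // above this, a bitmap is used instead

/// Serialized payload size of an array container with `card` values.
fn array_bytes(card: usize) -> usize {
    2 * card
}

/// Serialized payload size of a run container with `n_runs` runs.
fn run_bytes(n_runs: usize) -> usize {
    2 + 4 * n_runs
}

/// Counts maximal runs of consecutive values in a sorted, deduplicated slice.
fn count_runs(sorted: &[u16]) -> usize {
    let mut runs = 0;
    let mut prev: Option<u16> = None;
    for &v in sorted {
        // A new run starts unless `v` directly follows the previous value.
        if prev.map_or(true, |p| p.checked_add(1) != Some(v)) {
            runs += 1;
        }
        prev = Some(v);
    }
    runs
}

/// Picks the representation after an "optimize" pass, keeping the current
/// kind unless switching to/from a run is strictly smaller (assumed rule).
fn optimize_kind(current: Kind, sorted: &[u16]) -> Kind {
    let card = sorted.len();
    let run = run_bytes(count_runs(sorted));
    let (non_run_kind, non_run) = if card <= ARRAY_MAX_CARD {
        (Kind::Array, array_bytes(card))
    } else {
        (Kind::Bitmap, BITMAP_BYTES)
    };
    match current {
        Kind::Run if non_run < run => non_run_kind,
        Kind::Run => Kind::Run,
        _ if run < non_run => Kind::Run,
        other => other,
    }
}
```

For [0, 1, 2] both `array_bytes(3)` and `run_bytes(1)` are 6, so `optimize_kind` returns whatever kind it was given. Two implementations that reached the same set through different operations can therefore legitimately disagree on the container type.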
I see. I don't think it is feasible to ensure we also produce runs in the exact same situations as CRoaring. Relaxing the serialization comparison would be the best option we have.
Couldn't we call
I ran the fuzzer with the following patch applied on croaring-rs and found nothing after letting it run for 45 minutes. Yeey!
diff --git i/croaring-sys/CRoaring/roaring.c w/croaring-sys/CRoaring/roaring.c
index d49cda5..ba61acb 100644
--- i/croaring-sys/CRoaring/roaring.c
+++ w/croaring-sys/CRoaring/roaring.c
@@ -1494,7 +1494,7 @@ bool array_container_validate(const array_container_t *v, const char **reason);
  * Return the serialized size in bytes of a container having cardinality "card".
  */
 static inline int32_t array_container_serialized_size_in_bytes(int32_t card) {
-    return card * 2 + 2;
+    return card * 2;
 }
 /**
Hey @lucascool12 and @Dr-Emann 👋 I hope you're good 😊 I was wondering if the final change we need before merging this PR is to merge RoaringBitmap/CRoaring#702? And if so, what's actually missing for it to be merged? Have a nice day 🥬
I noticed that Also I think we are all in favour of the current semantics of
Did find something in fuzzing:
Fuzz input
Base64:
Looking a bit closer at https://github.com/lucascool12/roaring-rs/blob/c3ebe863e377b58a0732f0ba27da13dc8a1b987f/fuzz/fuzz_targets/arbitrary_ops/mod.rs#L280-L282
x.run_optimize();
y.optimize();
assert_eq!(x.remove_run_compression(), y.remove_run_compression());
I don't think we can do that assert: If we've got a container that can be either a run or an {array/bitmap}, the
Think we could either just not check the return values, or we could use the
So using a statistics call before and after and then checking no run containers exist? I tried adding
@lucascool12 Do you think this change can be part of the final PR or should we implement the statistics-based solution? What I don't like/understand with the remove-run-compression solution is that it doesn't check the run-container optimization. At least, it doesn't compare it to the C implementation. Am I wrong?
Hey @lucascool12 👋 Daniel just merged the PR on the C roaring library. I think we need the croaring Rust wrapper to update its dependency and we will be ready to merge this very PR 👏
Well, it only checks that both implementations agree that the bitmap changed or stayed the same.
Great! I'll push the last remnants such as the remove_compression and interval change. Then this PR will be ready.
Fixes a fuzz failure by ensuring no run containers are present in both implementations before adding run containers and then removing them again to check if both remove operations had the same effect.
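The ordering described in this commit message can be illustrated with a toy model. This is a sketch under stated assumptions, not the fuzz target itself: `Container`, `Kind`, `naive_check` and `robust_check` are hypothetical, each container simply carries its two candidate sizes, and the toy `remove_run_compression` is assumed to return whether anything changed, as discussed above.

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Kind {
    Run,
    NonRun, // array or bitmap
}

/// Toy container: its current kind plus the size of each candidate encoding.
#[derive(Clone, Copy, Debug)]
struct Container {
    kind: Kind,
    run_bytes: usize,
    non_run_bytes: usize,
}

/// Switches representation only when strictly smaller; returns true on change.
fn run_optimize(cs: &mut [Container]) -> bool {
    let mut changed = false;
    for c in cs.iter_mut() {
        let better = match c.kind {
            Kind::NonRun if c.run_bytes < c.non_run_bytes => Some(Kind::Run),
            Kind::Run if c.non_run_bytes < c.run_bytes => Some(Kind::NonRun),
            _ => None,
        };
        if let Some(k) = better {
            c.kind = k;
            changed = true;
        }
    }
    changed
}

/// Converts every run container to a non-run one; returns true on change.
fn remove_run_compression(cs: &mut [Container]) -> bool {
    let mut changed = false;
    for c in cs.iter_mut() {
        if c.kind == Kind::Run {
            c.kind = Kind::NonRun;
            changed = true;
        }
    }
    changed
}

/// Original check: optimize, then compare the "changed" flags directly.
fn naive_check(x: &mut [Container], y: &mut [Container]) -> bool {
    run_optimize(x);
    run_optimize(y);
    remove_run_compression(x) == remove_run_compression(y)
}

/// Fixed check: strip runs from both sides first so they start equal.
fn robust_check(x: &mut [Container], y: &mut [Container]) -> bool {
    remove_run_compression(x);
    remove_run_compression(y);
    run_optimize(x);
    run_optimize(y);
    remove_run_compression(x) == remove_run_compression(y)
}
```

With a tied container (6 bytes either way) stored as a run on one side and as an array on the other, `naive_check` reports a mismatch even though both sides hold the same set, while `robust_check` agrees because both sides enter `run_optimize` from the same representation.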
8b41f22 to 69fe5e6
New version of croaring-sys which picks up the croaring update, should just need a |
Alright, this is my final review of this PR. I'll remove the `dbg!` statements and push the updated fuzz code in 5 seconds.
Presumably we would also want to revert the change to the `Debug` impl for `RoaringBitmap`, but I'd like some confirmation for this.
b16e44f to 5427897
Is there anything left to be done from my end to get this merged?
Hey @lucascool12 👋 Thank you again for the good work. I'll merge it right away; your mission is a success. I plan to release a new version soon enough, either before or after trying it on Meilisearch 🤔
I've been running the fuzzer for a few days, and no findings!
This PR continues on #66. My main goal is to move each part of the original branch to the new project layout, e.g. the `run_store.rs` file or whatever it should be called. Each commit will move such a piece of code and also add tests for it (and then fix any resulting bugs).
Example of such a commit: a57aff1
Closes: #12