Run containers attempt 3 #320

Merged
merged 73 commits into from
Jun 5, 2025

Conversation

Contributor

@lucascool12 lucascool12 commented Apr 10, 2025

This PR continues on #66. My main goal is to move each part of the original branch to the new project layout, e.g. the run_store.rs file or whatever it should be called.

Each commit will move such a piece of code and also add tests for this (and then fix any resulting bugs).

Example of such a commit: a57aff1

Closes: #12

josephglanville and others added 30 commits September 11, 2020 17:34
Implements and tests `insert` and `insert_range` methods on runs.
This fixes some failing tests and adds some `#[allow(todo)]` and
`#[allow(unused)]`.
@lucascool12
Contributor Author

I think inserting with insert_range (especially as a new container) should probably try to make a run container if it's a big enough range. It would be nice if bitmap.insert_range(0, HUGE_NUMBER) was efficient.

I have implemented this based on CRoaring's implementation in eff381a.
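To illustrate why run containers matter for a call like bitmap.insert_range(0, HUGE_NUMBER), here is a minimal sketch of how an inclusive u32 range breaks down into one run per 16-bit container key. The function name `split_into_runs` is hypothetical and is not taken from this PR or from CRoaring; it only models the chunking, not the actual container code.

```rust
use std::ops::RangeInclusive;

/// Hypothetical helper: splits an inclusive `u32` range into per-container
/// pieces. Each piece is (high 16 bits, inclusive run over the low 16 bits).
fn split_into_runs(range: RangeInclusive<u32>) -> Vec<(u16, RangeInclusive<u16>)> {
    let (start, end) = (*range.start(), *range.end());
    let mut out = Vec::new();
    if start > end {
        return out; // empty range, nothing to insert
    }
    let (first_key, last_key) = ((start >> 16) as u16, (end >> 16) as u16);
    for key in first_key..=last_key {
        // Only the first and last containers can be partially covered.
        let lo = if key == first_key { start as u16 } else { 0 };
        let hi = if key == last_key { end as u16 } else { u16::MAX };
        out.push((key, lo..=hi));
    }
    out
}

fn main() {
    // 0..=200_000 touches four containers (keys 0, 1, 2 and 3).
    for (key, run) in split_into_runs(0..=200_000) {
        println!("key {key}: run {}..={}", run.start(), run.end());
    }
}
```

Each piece is a single run, so a run container can store it in 6 bytes (2 + 1 * 4), whereas a fully covered container stored as a bitmap takes 8192 bytes.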

@Dr-Emann
Member

I think the important factor is that:

Unless one calls (run_)optimize then all containers that can be represented more efficiently by run containers will be represented by run containers.

There are cases where a container can be represented equally efficiently as either a range, or an {array/bitmap}. Both implementations (correctly imo) default to leaving the existing container type when converting to/from a run container is not strictly more efficient.

Therefore, in these cases, the result of (run_)optimize/remove_run_compression depends on which of the two equally valid container types was already there.

e.g.

for the Roaring Bitmap containing [0, 1, 2], it could be represented in two ways

runs: [0..=2] # serialized size `2 + (1 * 4) = 6`
array: [0, 1, 2] # serialized size `3 * 2 = 6`

So e.g. both implementations have to match on the result type of container for all operations for all container types, e.g. run[0..=8] ^ array[3,4,5,6,7,8] needs to have the same container type in both implementations if we want to be able to guarantee they always serialize the same, even after doing (run_)optimize
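The tie described above can be sketched with the serialized sizes quoted in the example (array: 2 bytes per value, run: 2 + 4 bytes per run) plus the fixed 8192-byte bitmap container. The names `Kind` and `optimized_kind` are hypothetical, and the function is a simplified model of "only convert when strictly smaller", not the code of either implementation.

```rust
/// Container kinds in the Roaring format.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Kind {
    Array,
    Bitmap,
    Run,
}

const BITMAP_BYTES: usize = 8192; // 65536 bits

fn array_bytes(cardinality: usize) -> usize {
    2 * cardinality
}

fn run_bytes(n_runs: usize) -> usize {
    2 + 4 * n_runs
}

/// Simplified model: the kind a container has after an optimize-style pass.
/// A conversion to or from a run container only happens when it is strictly
/// smaller, so on a tie the existing kind is kept.
fn optimized_kind(current: Kind, cardinality: usize, n_runs: usize) -> Kind {
    let run = run_bytes(n_runs);
    let (non_run_kind, non_run) = if cardinality <= 4096 {
        (Kind::Array, array_bytes(cardinality))
    } else {
        (Kind::Bitmap, BITMAP_BYTES)
    };
    match current {
        Kind::Run if non_run < run => non_run_kind,
        Kind::Run => Kind::Run,
        _ if run < non_run => Kind::Run,
        other => other,
    }
}

fn main() {
    // {0, 1, 2}: one run, three values, 6 bytes either way.
    assert_eq!(run_bytes(1), array_bytes(3));
    // The tie keeps whatever was already there.
    assert_eq!(optimized_kind(Kind::Array, 3, 1), Kind::Array);
    assert_eq!(optimized_kind(Kind::Run, 3, 1), Kind::Run);
    // {0, 1, 2, 3}: the run (6 bytes) now beats the array (8 bytes).
    assert_eq!(optimized_kind(Kind::Array, 4, 1), Kind::Run);
}
```

Because the result on a tie depends on the starting kind, two implementations only serialize identically if they also agree on which kind every earlier operation produced.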

@lucascool12
Contributor Author

> e.g.
>
> for the Roaring Bitmap containing [0, 1, 2], it could be represented in two ways
>
> runs: [0..=2] # serialized size `2 + (1 * 4) = 6`
> array: [0, 1, 2] # serialized size `3 * 2 = 6`
>
> So e.g. both implementations have to match on the result type of container for all operations for all container types, e.g. run[0..=8] ^ array[3,4,5,6,7,8] needs to have the same container type in both implementations if we want to be able to guarantee they always serialize the same, even after doing (run_)optimize

I see. I don't think it is feasible to ensure we also produce runs in exactly the same situations as CRoaring.
CRoaring presumably doesn't make any promises about which operations automatically produce runs, so a minor version bump in CRoaring might make our fuzz CI fail, which is not desirable.

Relaxing the serialization comparison would be the best option we have.

@lucascool12
Contributor Author

Couldn't we call remove_run_compression before (run_)optimize to ensure we always have the same Roaring bitmap?
And also the other way around?

@lucascool12
Contributor Author

> Couldn't we call remove_run_compression before (run_)optimize to ensure we always have the same Roaring bitmap? And also the other way around?

I ran the fuzzer with the following patch applied on croaring-rs and found nothing after letting it run for 45 minutes. Yeey!

diff --git i/croaring-sys/CRoaring/roaring.c w/croaring-sys/CRoaring/roaring.c
index d49cda5..ba61acb 100644
--- i/croaring-sys/CRoaring/roaring.c
+++ w/croaring-sys/CRoaring/roaring.c
@@ -1494,7 +1494,7 @@ bool array_container_validate(const array_container_t *v, const char **reason);
  * Return the serialized size in bytes of a container having cardinality "card".
  */
 static inline int32_t array_container_serialized_size_in_bytes(int32_t card) {
-    return card * 2 + 2;
+    return card * 2;
 }
 
 /**

@Kerollmops
Member

Hey @lucascool12 and @Dr-Emann 👋

I hope you're good 😊 I was wondering if the final change we need before merging this PR is RoaringBitmap/CRoaring#702? And if so, what's actually missing for it to be merged?

Have a nice day 🥬

@lucascool12
Contributor Author

> Hey @lucascool12 and @Dr-Emann 👋
>
> I hope you're good 😊 I was wondering if the final change we need before merging this PR is RoaringBitmap/CRoaring#702? And if so, what's actually missing for it to be merged?
>
> Have a nice day 🥬

I noticed that Interval assumes self.start <= self.end, but this is only weakly enforced right now. I'll change this by making the new function return an Option and adding a new_unchecked constructor. Lastly, I'm going to review my own code one more time and resolve anything I find unsatisfactory. Then this PR will be completely ready from my end.
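A minimal sketch of what such a constructor pair could look like. The u16 field type, the derives, and the `len` helper are assumptions made for illustration; only the start/end fields, the Option-returning new, and new_unchecked come from the comment above.

```rust
/// An inclusive interval. Invariant: `start <= end`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Interval {
    start: u16, // assumed type, chosen for illustration
    end: u16,
}

impl Interval {
    /// Checked constructor: returns `None` when the invariant would be broken.
    fn new(start: u16, end: u16) -> Option<Self> {
        (start <= end).then_some(Self { start, end })
    }

    /// Unchecked constructor for callers that already know `start <= end`.
    /// The invariant is still verified in debug builds.
    fn new_unchecked(start: u16, end: u16) -> Self {
        debug_assert!(start <= end);
        Self { start, end }
    }

    /// Number of values covered. Relies on the invariant so that the
    /// subtraction cannot underflow.
    fn len(&self) -> u32 {
        u32::from(self.end - self.start) + 1
    }
}

fn main() {
    assert_eq!(Interval::new(8, 3), None);
    let full = Interval::new_unchecked(0, u16::MAX);
    println!("{full:?} covers {} values", full.len());
}
```

With the check in `new`, code such as `len` can rely on the invariant instead of re-validating it at every use.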

Also I think we are all in favour of the current semantics of optimize even though it is different from croaring's run_optimize, correct? And as @Dr-Emann said since optimize didn't exist previously adding a breaking change in this PR is a bit odd. Maybe we should remove the breaking label?

@Dr-Emann
Member

Did find something in fuzzing:

Fuzz input
FuzzInput {
    ops: [
        MutateLhs(
            Extend(
                [
                    Num(
                        97619,
                    ),
                    Num(
                        97917,
                    ),
                    Num(
                        97661,
                    ),
                    Num(
                        77184,
                    ),
                    Num(
                        72989,
                    ),
                    Num(
                        70941,
                    ),
                    Num(
                        104237,
                    ),
                ],
            ),
        ),
        SwapSides,
        MutateLhs(
            InsertRange(
                Num(
                    72981,
                )..=Num(
                    72989,
                ),
            ),
        ),
        Binary(
            Xor,
        ),
        Binary(
            Or,
        ),
        MutateLhs(
            RemoveRunCompression,
        ),
    ],
    initial_input: [],
}

Base64: A319fX3Fl4eDfX19U1N9fn19fX19fX0tgB0VHR2NHRUdHcWXLYAdFR0dxZd9fX19fS2AHRUdHY0dFR0BAAAAHR1TyUEABR0VHR3Fl5eXl5dw5VNTyTA=

Looking a bit closer at https://github.com/lucascool12/roaring-rs/blob/c3ebe863e377b58a0732f0ba27da13dc8a1b987f/fuzz/fuzz_targets/arbitrary_ops/mod.rs#L280-L282

x.run_optimize();
y.optimize();
assert_eq!(x.remove_run_compression(), y.remove_run_compression());

I don't think we can do that assert: if we've got a container that can be either a run or an {array/bitmap}, the optimize call won't do anything, e.g. croaring could have a run container, roaring could have an array container, so removing runs will return true for croaring, false for roaring.

I think we could either just not check the return values, or we could use the statistics call to check if the types of the containers have changed, rather than comparing against whether the croaring bitmap changed.

@lucascool12
Contributor Author

> I think we could either just not check the return values, or we could use the statistics call to check if the types of the containers have changed, rather than comparing against whether the croaring bitmap changed.

So using a statistics call before and after and then checking no run containers exist?

I tried adding x.remove_run_compression(); and y.remove_run_compression(); before the optimize calls, and this works for this crash. Unless I'm missing something, this should always give the same result, no?
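The mismatch and the proposed normalisation can be reproduced with a toy model. Everything below is hypothetical: the `Container` enum ignores bitmap containers and is not the real type from either library; it only mimics "convert when strictly smaller" and "return true if something changed".

```rust
#[derive(Debug, Clone, PartialEq, Eq)]
enum Container {
    Array(Vec<u16>),      // sorted, unique values
    Run(Vec<(u16, u16)>), // inclusive (start, end) pairs
}

impl Container {
    fn values(&self) -> Vec<u16> {
        match self {
            Container::Array(v) => v.clone(),
            Container::Run(runs) => runs.iter().flat_map(|&(s, e)| s..=e).collect(),
        }
    }

    /// Groups consecutive values into inclusive runs.
    fn runs(&self) -> Vec<(u16, u16)> {
        let mut runs: Vec<(u16, u16)> = Vec::new();
        for v in self.values() {
            if let Some(last) = runs.last_mut() {
                if last.1.checked_add(1) == Some(v) {
                    last.1 = v;
                    continue;
                }
            }
            runs.push((v, v));
        }
        runs
    }

    /// Converts only when strictly smaller; returns true if the kind changed.
    fn optimize(&mut self) -> bool {
        let (values, runs) = (self.values(), self.runs());
        let (array_bytes, run_bytes) = (2 * values.len(), 2 + 4 * runs.len());
        let is_run = matches!(self, Container::Run(_));
        if !is_run && run_bytes < array_bytes {
            *self = Container::Run(runs);
            true
        } else if is_run && array_bytes < run_bytes {
            *self = Container::Array(values);
            true
        } else {
            false
        }
    }

    /// Returns true if a run container was converted.
    fn remove_run_compression(&mut self) -> bool {
        if !matches!(self, Container::Run(_)) {
            return false;
        }
        let values = self.values();
        *self = Container::Array(values);
        true
    }
}

fn main() {
    // The same set {0, 1, 2} in two equally sized representations.
    let mut x = Container::Run(vec![(0, 2)]);
    let mut y = Container::Array(vec![0, 1, 2]);

    // Original check: optimize is a no-op on both, so the return values differ.
    let (mut x1, mut y1) = (x.clone(), y.clone());
    x1.optimize();
    y1.optimize();
    assert_ne!(x1.remove_run_compression(), y1.remove_run_compression());

    // Normalised check: strip runs first, then optimize, then compare.
    x.remove_run_compression();
    y.remove_run_compression();
    assert_eq!(x.optimize(), y.optimize());
    assert_eq!(x.remove_run_compression(), y.remove_run_compression());
    assert_eq!(x, y);
}
```

In this model the normalised sequence agrees because both sides start from the same non-run representation before optimize runs. It still assumes both implementations use the same size rule.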

@Kerollmops Kerollmops removed the breaking label ("This change will require a bump of the minor or major version.") May 18, 2025
@Kerollmops
Member

Kerollmops commented May 27, 2025

> I tried adding x.remove_run_compression(); and y.remove_run_compression(); before the optimize calls, and this works for this crash. Unless I'm missing something, this should always give the same result, no?

@lucascool12 Do you think this change can be part of the final PR or should we implement the statistic-based solution?

What I don't like/understand with the remove-run-compression solution is that it doesn't check the run-container optimization. At least, doesn't compare it to the C implementation. Am I wrong?

@Kerollmops
Member

Hey @lucascool12 👋

Daniel just merged the PR on the C roaring library. I think we need the croaring Rust wrapper to update its dependency and we will be ready to merge this very PR 👏

@lucascool12
Contributor Author

> I tried adding x.remove_run_compression(); and y.remove_run_compression(); before the optimize calls, and this works for this crash. Unless I'm missing something, this should always give the same result, no?
>
> @lucascool12 Do you think this change can be part of the final PR or should we implement the statistic-based solution?
>
> What I don't like/understand with the remove-run-compression solution is that it doesn't check the run-container optimization. At least, doesn't compare it to the C implementation. Am I wrong?

Well, it only checks that both implementations agree on whether the bitmap changed or stayed the same.
A later operation could then check whether the statistics are the same; this depends on what the fuzzer decides.
We could check the statistics right after the optimization if we really wanted to, but I don't think it makes much of a difference.

> Daniel just merged the PR on the C roaring library. I think we need the croaring Rust wrapper to update its dependency and we will be ready to merge this very PR 👏

Great! I'll push the last remnants such as the remove_compression and interval change. Then this PR will be ready.

Fixes a fuzz failure by ensuring no run containers are present in both
implementations before adding run containers and then removing them
again to check if both remove operations had the same effect.
@Dr-Emann
Member

Dr-Emann commented Jun 1, 2025

There's a new version of croaring-sys which picks up the croaring update; it should just need a cargo update in the fuzzer directory to pick it up.

Contributor Author

@lucascool12 lucascool12 left a comment

Alright this is my final review of this PR. I'll remove the dbg! statements and push the updated fuzz code in 5 seconds.

Presumably we would also want to revert the change to the Debug impl for RoaringBitmap, but I'd like some confirmation for this.

@lucascool12
Contributor Author

Is there anything left to be done from my end to get this merged?
In my humble opinion this PR is ready to merge.
Or does it require review from someone in particular before being merged?

@Kerollmops
Member

Hey @lucascool12 👋

Thank you again for the good work. I'll merge it right away; your mission is a success. I plan to release a new version soon, either before or after trying it on Meilisearch 🤔

@Kerollmops Kerollmops added this pull request to the merge queue Jun 5, 2025
Merged via the queue into RoaringBitmap:main with commit 6535a82 Jun 5, 2025
15 checks passed
@Dr-Emann
Member

Dr-Emann commented Jun 7, 2025

I've been running the fuzzer for a few days, and no findings!

Development

Successfully merging this pull request may close these issues.

Add support for run containers
4 participants