Isolate MiriMachine memory from Miri's #4343

Open
nia-e wants to merge 9 commits into master from discrete-allocator

Conversation

nia-e
Contributor

@nia-e nia-e commented May 22, 2025

Based on discussion surrounding #4326, this merges in the (very simple) discrete allocator that the MiriMachine will have. See the design document linked there for considerations, but in brief: we could pull in an off-the-shelf allocator for this, but performance isn't a massive worry, and doing it this way might make it easier to support multi-seeded runs in the future (without a lot more extraneous plumbing, at least).

@nia-e
Contributor Author

nia-e commented May 22, 2025

@rustbot ready

@rustbot rustbot added the S-waiting-on-review Status: Waiting for a review to complete label May 22, 2025
@nia-e nia-e force-pushed the discrete-allocator branch 3 times, most recently from 1520f81 to 6cbc283 Compare May 23, 2025 10:36
@nia-e nia-e force-pushed the discrete-allocator branch from 6cbc283 to b53ed38 Compare May 23, 2025 10:39
Member

@RalfJung RalfJung left a comment

Thanks for the PR! I left some first comments, but this is not a full review. I'd rather not reverse-engineer the invariants of MachineAlloc myself, so I'll wait for you to document them, which will make review a lot easier.

Furthermore, all pub fn in discrete_alloc should have proper doc comments, not just a safety comment. Please also add some basic unit tests -- we don't use them much in Miri, but this is one of the cases where they would make sense.

On Zulip you mentioned some benchmarks. Can you put benchmark results for the variant that you ended up going for here into the PR?

Co-authored-by: Ralf Jung <[email protected]>
@RalfJung
Member

@rustbot author

@rustbot rustbot removed the S-waiting-on-review Status: Waiting for a review to complete label May 24, 2025
@rustbot
Collaborator

rustbot commented May 24, 2025

Reminder, once the PR becomes ready for a review, use @rustbot ready.

@rustbot rustbot added the S-waiting-on-author Status: Waiting for the PR author to address review comments label May 24, 2025
@nia-e
Contributor Author

nia-e commented May 24, 2025

I'll post benchmarks in a bit! I realised there might be some speed gains to be made with very simple changes, so I'll just experiment a little first. Thanks for the comments ^^

@nia-e nia-e force-pushed the discrete-allocator branch from 4b1f50f to 9f1047e Compare May 24, 2025 15:13
@nia-e
Contributor Author

nia-e commented May 24, 2025

Baseline is set to having the allocator fully disabled. It's only marginally slower in most cases, though it struggles with large allocations it seems. I wonder how much work it would be to improve that, but if we go down the "only machines using this will touch it" path I hope it's not too bad? I got a slight (~4%) improvement from calling mmap directly instead of calling alloc::alloc() but I doubt that's worth it. Is Miri built with optimisations on by default when invoking ./miri bench?

Comparison with baseline (relative speed, lower is better for the new results):
  backtraces: 1.36 ± 0.08
  big-allocs: 32.71 ± 1.60
  mse: 1.05 ± 0.10
  range-iteration: 1.06 ± 0.03
  serde1: 1.00 ± 0.03
  serde2: 1.06 ± 0.03
  slice-chunked: 1.03 ± 0.04
  slice-get-unchecked: 0.93 ± 0.08
  string-replace: 1.06 ± 0.04
  unicode: 1.06 ± 0.05
  zip-equal: 1.08 ± 0.02

@RalfJung
Member

> Is Miri built with optimisations on by default when invoking ./miri bench?

Yes.

A 32x slowdown with big allocations is hefty.^^ Shouldn't those just forward to alloc::alloc anyway as part of the huge allocs treatment? Without looking at the code, what I'd assume happens when the allocation request is close to the page size is: round up to multiple of page size, and then just allocate that and use it directly without any kind of tracking of "which parts of this page are used" or so. That should have basically no overhead.
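As a rough illustration of the "round up and allocate whole pages" idea described above, here is a minimal sketch. It is not the code from this PR: `PAGE_SIZE`, `round_up_to_page`, `alloc_whole_pages`, and `dealloc_whole_pages` are hypothetical names, and the page size is hard-coded to 4096 for simplicity, whereas a real implementation would query it from the OS.

```rust
use std::alloc::{self, Layout};

/// Assumed page size, for illustration only (a real allocator would ask the OS).
const PAGE_SIZE: usize = 4096;

/// Rounds `size` up to the next multiple of `PAGE_SIZE`.
/// Returns `None` if the rounding would overflow.
fn round_up_to_page(size: usize) -> Option<usize> {
    size.checked_next_multiple_of(PAGE_SIZE)
}

/// Hypothetical "huge allocation" path: forward a page-rounded request
/// straight to the global allocator, with no tracking of which parts of
/// the pages are in use.
///
/// Returns the pointer together with the layout needed to free it, or
/// `None` for zero-sized or overflowing requests, or if allocation fails.
fn alloc_whole_pages(size: usize) -> Option<(*mut u8, Layout)> {
    if size == 0 {
        return None;
    }
    let rounded = round_up_to_page(size)?;
    let layout = Layout::from_size_align(rounded, PAGE_SIZE).ok()?;
    // SAFETY: `layout` has a non-zero size, as checked above.
    let ptr = unsafe { alloc::alloc(layout) };
    if ptr.is_null() { None } else { Some((ptr, layout)) }
}

/// Frees a block obtained from `alloc_whole_pages`.
///
/// # Safety
/// `ptr` and `layout` must come from the same `alloc_whole_pages` call,
/// and `ptr` must not be used afterwards.
unsafe fn dealloc_whole_pages(ptr: *mut u8, layout: Layout) {
    unsafe { alloc::dealloc(ptr, layout) }
}
```

Under these assumptions, a 5000-byte request becomes a single 8192-byte, page-aligned allocation, and the only extra work over calling the global allocator directly is the rounding.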

@nia-e
Contributor Author

nia-e commented May 24, 2025

It does mostly do that, which is what's confusing me... I'll try to fix it, I assume I just missed something really obvious.

@nia-e
Contributor Author

nia-e commented May 24, 2025

I checked; it seems the big-allocs test specifically calls alloc_zeroed, which was the reason for the slowdown (apparently ptr.write_bytes() is a really slow way to zero out bytes). I reworked the code a bit so it is generic over calling alloc::alloc() vs alloc::alloc_zeroed(), and these are the (much better!) results:

Comparison with baseline (relative speed, lower is better for the new results):
  backtraces: 1.23 ± 0.06
  big-allocs: 1.03 ± 0.05
  mse: 0.99 ± 0.09
  range-iteration: 1.01 ± 0.03
  serde1: 0.99 ± 0.02
  serde2: 1.03 ± 0.02
  slice-chunked: 0.97 ± 0.04
  slice-get-unchecked: 0.89 ± 0.08
  string-replace: 1.00 ± 0.03
  unicode: 1.01 ± 0.04
  zip-equal: 1.05 ± 0.04

I might be able to squeeze a bit more perf out by actually making the functions generic instead of just passing in a function pointer, but *shrug*, unsure if it's necessary.
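To make the two options concrete, here is a small sketch of what "generic over alloc::alloc() vs alloc::alloc_zeroed()" could look like, first with a function pointer and then with a generic parameter. The helper names (`alloc_with`, `alloc_with_generic`, `allocate`, `allocate_zeroed`) are made up for this example and are not taken from the PR.

```rust
use std::alloc::{self, Layout};

/// Function-pointer variant: the caller picks the raw allocation routine,
/// e.g. `alloc::alloc` or `alloc::alloc_zeroed`.
///
/// # Safety
/// `layout` must have a non-zero size, as the global allocator requires.
unsafe fn alloc_with(layout: Layout, raw_alloc: unsafe fn(Layout) -> *mut u8) -> *mut u8 {
    unsafe { raw_alloc(layout) }
}

/// Generic variant: monomorphised per caller, so the call can be inlined.
/// `unsafe fn` items do not implement the `Fn` traits, so callers pass a
/// closure that wraps the unsafe call.
///
/// # Safety
/// `layout` must have a non-zero size.
unsafe fn alloc_with_generic<F>(layout: Layout, raw_alloc: F) -> *mut u8
where
    F: FnOnce(Layout) -> *mut u8,
{
    raw_alloc(layout)
}

/// Uninitialised allocation via the shared helper.
///
/// # Safety
/// `layout` must have a non-zero size.
unsafe fn allocate(layout: Layout) -> *mut u8 {
    unsafe { alloc_with(layout, alloc::alloc) }
}

/// Zeroed allocation via the shared helper. Zeroing is left to the
/// allocator rather than done by hand with `ptr.write_bytes()`.
///
/// # Safety
/// `layout` must have a non-zero size.
unsafe fn allocate_zeroed(layout: Layout) -> *mut u8 {
    unsafe { alloc_with(layout, alloc::alloc_zeroed) }
}
```

Whether the generic version is measurably faster would have to be benchmarked; this sketch only shows the shape of the two approaches.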

@RalfJung
Member

Ah yes, that is exactly why we added that particular benchmark. :)

@nia-e
Contributor Author

nia-e commented May 24, 2025

What kind of unit tests do you think belong here? I assumed functionality is covered by the usual tests, but I'll happily add in some stuff if you think it's relevant

@RalfJung
Member

> What kind of unit tests do you think belong here? I assumed functionality is covered by the usual tests, but I'll happily add in some stuff if you think it's relevant

Similar to range_map: just add some functions with #[test] in the file with the allocator implementation to test some particular corner cases -- or at least a basic smoke test if there are no corner cases.
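For readers unfamiliar with the pattern, in-file `#[test]` functions for an allocator might look roughly like the sketch below. `ToyPageAlloc` and `TOY_PAGE_SIZE` are hypothetical stand-ins invented for this example (a bump allocator that hands out offsets within one page); they are not the allocator from this PR, and the real corner cases would depend on its invariants.

```rust
/// Assumed page size for the toy example.
const TOY_PAGE_SIZE: usize = 4096;

/// Toy stand-in, for illustration only: hands out byte offsets within a
/// single fixed-size page and never frees them.
struct ToyPageAlloc {
    /// Number of bytes already handed out from the start of the page.
    used: usize,
}

impl ToyPageAlloc {
    fn new() -> Self {
        Self { used: 0 }
    }

    /// Returns the offset of a fresh `size`-byte block aligned to `align`,
    /// or `None` if the block does not fit in the page.
    fn alloc(&mut self, size: usize, align: usize) -> Option<usize> {
        assert!(align.is_power_of_two());
        let start = self.used.checked_next_multiple_of(align)?;
        let end = start.checked_add(size)?;
        if end > TOY_PAGE_SIZE {
            return None;
        }
        self.used = end;
        Some(start)
    }
}

#[cfg(test)]
mod tests {
    use super::*;

    /// Basic smoke test: allocations succeed and respect alignment.
    #[test]
    fn smoke() {
        let mut a = ToyPageAlloc::new();
        assert_eq!(a.alloc(3, 1), Some(0));
        assert_eq!(a.alloc(8, 8), Some(8));
    }

    /// Corner case: a full page rejects any further allocation.
    #[test]
    fn page_exhaustion() {
        let mut a = ToyPageAlloc::new();
        assert_eq!(a.alloc(TOY_PAGE_SIZE, 1), Some(0));
        assert_eq!(a.alloc(1, 1), None);
    }
}
```

Tests of this shape run with `cargo test` and live next to the implementation, so they can exercise private helpers directly.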

@nia-e
Contributor Author

nia-e commented May 24, 2025

Opened the PR on the main repo. I'll get to adding tests if everything there is okay... it is not okay, I need to do more things, oops.

@nia-e
Contributor Author

nia-e commented May 24, 2025

Expecting the build to fail for now since it's adapted to the changes from the PR (but also Miri seems to be having trouble on the current upstream master commit, so I guess it's pending that being fixed too)

@nia-e
Contributor Author

nia-e commented May 24, 2025

Tests added :D let me know if there's anything more to do

@nia-e nia-e force-pushed the discrete-allocator branch from 87f2b3f to f7fe286 Compare May 24, 2025 22:52
@rustbot
Collaborator

rustbot commented May 25, 2025

☔ The latest upstream changes (possibly #4349) made this pull request unmergeable. Please resolve the merge conflicts.

Labels
S-waiting-on-author Status: Waiting for the PR author to address review comments
3 participants