
Remove a branch from try_alloc_layout? #234

Open
@overlookmotel

Description


After reading fitzgen's (very interesting) blog post about the rationale for bumping downwards, I had one thought:

try_alloc_layout_fast has 2 branches:

bumpalo/src/lib.rs, lines 1414 to 1442 in bb660a3:

fn try_alloc_layout_fast(&self, layout: Layout) -> Option<NonNull<u8>> {
    // We don't need to check for ZSTs here since they will automatically
    // be handled properly: the pointer will be bumped by zero bytes,
    // modulo alignment. This keeps the fast path optimized for non-ZSTs,
    // which are much more common.
    unsafe {
        let footer = self.current_chunk_footer.get();
        let footer = footer.as_ref();
        let ptr = footer.ptr.get().as_ptr();
        let start = footer.data.as_ptr();
        debug_assert!(start <= ptr);
        debug_assert!(ptr as *const u8 <= footer as *const _ as *const u8);

        if (ptr as usize) < layout.size() {
            return None;
        }

        let ptr = ptr.wrapping_sub(layout.size());
        let aligned_ptr = round_mut_ptr_down_to(ptr, layout.align());

        if aligned_ptr >= start {
            let aligned_ptr = NonNull::new_unchecked(aligned_ptr);
            footer.ptr.set(aligned_ptr);
            Some(aligned_ptr)
        } else {
            None
        }
    }
}

The first branch, (ptr as usize) < layout.size(), is there purely to ensure that ptr.wrapping_sub(layout.size()) cannot wrap around. Without it, a wrapped-around pointer could incorrectly pass the 2nd branch's condition aligned_ptr >= start.
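
To make that concrete, here is a minimal standalone sketch of the wraparound the check guards against. The addresses are made up, and plain integers stand in for real pointers:

fn main() {
    let start: usize = 0x1000; // chunk start
    let ptr: usize = 0x1010;   // current bump pointer (only 16 bytes free)
    let size: usize = 0x2000;  // requested allocation is larger than that

    // Without the `ptr < size` check, the subtraction wraps around...
    let wrapped = ptr.wrapping_sub(size);

    // ...and the wrapped value compares as "in bounds", so the second
    // branch alone would hand out a bogus pointer instead of None.
    assert!(wrapped >= start);
    println!("wrapped = {wrapped:#x}"); // huge address near usize::MAX
}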

Bumpalo already has a method Bump::set_allocation_limit to limit the size of the Bump. I imagine most users could impose a limit on the size of their Bumps; it would be an uncommon use case for a bump allocator to be handing out massive slabs of memory, since such allocations would probably also be long-lived (and so a poor fit for an arena anyway).
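
For reference, usage of that existing API looks roughly like this (the limit value here is arbitrary):

use bumpalo::Bump;

fn main() {
    let bump = Bump::new();
    // Cap how much memory this arena may request from the global
    // allocator. The exact limit chosen here is arbitrary.
    bump.set_allocation_limit(Some(u32::MAX as usize));

    let x = bump.alloc(42u64);
    assert_eq!(*x, 42);
}

If I remember the semantics right, requests that would push the arena past the limit simply fail: alloc panics and the try_ variants return an error.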

My thinking is this:

Take the example where the size limit is 4 GiB minus 1 byte (i.e. size <= u32::MAX):

If the total size of the bump is constrained to 4 GiB, then no single allocation can be larger than 4 GiB. So layout.size() of a successful allocation is always a valid u32.

Constrain T in fn alloc<T>(&self, val: T) to only allow types where mem::size_of::<T>() <= u32::MAX.
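
One way to express that constraint is a const assertion that fails at monomorphisation time. A minimal sketch; FitsInU32 and alloc_small are hypothetical names, not bumpalo APIs:

use core::marker::PhantomData;
use core::mem;

struct FitsInU32<T>(PhantomData<T>);

impl<T> FitsInU32<T> {
    // Evaluating this constant is a compile error (post-monomorphisation)
    // for any `T` larger than u32::MAX bytes.
    const CHECK: () = assert!(mem::size_of::<T>() <= u32::MAX as usize);
}

fn alloc_small<T>(val: T) -> Box<T> {
    // Force the compile-time size check to be evaluated for this `T`.
    let () = FitsInU32::<T>::CHECK;
    // Stand-in for the real, specialised bump-allocation path.
    Box::new(val)
}

fn main() {
    let x = alloc_small(1234u64);
    assert_eq!(*x, 1234);
}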

When Bump allocates a chunk from the global allocator, request a 4 GiB chunk. If my understanding is correct, this will only consume 4 GiB of virtual memory, not physical memory (though I may be wrong about that, in which case my whole theory here collapses!).

Check that the start pointer for that chunk satisfies start_ptr as usize > u32::MAX as usize (a rough sketch of this step follows the list below). In the unlikely event that it doesn't:

  • Allocate another 4 GiB chunk.
  • Because allocations can't overlap, the pointer to the 2nd allocation is guaranteed to be > u32::MAX.
  • Free the 1st allocation, and use the 2nd for the chunk.

Either way, we now have a guarantee that start_ptr > u32::MAX.
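
A rough sketch of that chunk-allocation step, assuming a 64-bit target and using std::alloc directly for illustration; the real logic would live inside bumpalo's chunk-allocation path, with proper error handling:

use std::alloc::{alloc, dealloc, Layout};

// Illustrative only: request a 4 GiB chunk, retrying once if the returned
// address is not above u32::MAX.
fn alloc_chunk_above_u32_max() -> Option<*mut u8> {
    let layout = Layout::from_size_align(4usize << 30, 16).ok()?;

    // SAFETY: `layout` has non-zero size.
    let first = unsafe { alloc(layout) };
    if first.is_null() {
        return None;
    }
    if first as usize > u32::MAX as usize {
        return Some(first);
    }

    // Unlikely case: the first chunk starts in the low 4 GiB. A second
    // 4 GiB chunk cannot overlap it, so its start must be > u32::MAX.
    let second = unsafe { alloc(layout) };
    // SAFETY: `first` was allocated above with this exact layout.
    unsafe { dealloc(first, layout) };
    if second.is_null() {
        None
    } else {
        debug_assert!(second as usize > u32::MAX as usize);
        Some(second)
    }
}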

Bump::alloc<T> can use a specialized version of alloc_layout where layout.size() is statically constrained to be <= u32::MAX.

Combining these 2 guarantees means that (ptr as usize) < layout.size() can never be true: ptr.wrapping_sub(layout.size()) can never wrap, so that branch can be removed.
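
For concreteness, the specialized fast path could then look something like this. It reuses the names from the snippet quoted above, the function name is hypothetical, and it's a sketch of the idea rather than a tested patch:

// Relies on the two invariants above:
//   1. layout.size() <= u32::MAX (statically enforced for Bump::alloc<T>)
//   2. the chunk's start pointer is > u32::MAX
// so ptr >= start > u32::MAX >= layout.size() and the subtraction cannot wrap.
fn try_alloc_layout_fast_small(&self, layout: Layout) -> Option<NonNull<u8>> {
    unsafe {
        let footer = self.current_chunk_footer.get();
        let footer = footer.as_ref();
        let ptr = footer.ptr.get().as_ptr();
        let start = footer.data.as_ptr();
        debug_assert!(start <= ptr);
        debug_assert!(start as usize > u32::MAX as usize);
        debug_assert!(layout.size() <= u32::MAX as usize);

        // No `(ptr as usize) < layout.size()` check needed: see above.
        let ptr = ptr.wrapping_sub(layout.size());
        let aligned_ptr = round_mut_ptr_down_to(ptr, layout.align());

        if aligned_ptr >= start {
            let aligned_ptr = NonNull::new_unchecked(aligned_ptr);
            footer.ptr.set(aligned_ptr);
            Some(aligned_ptr)
        } else {
            None
        }
    }
}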

NB: A size check would still be required when allocating &[T], as the size is not statically knowable. Nonetheless, making at least Bump::alloc a bit faster would probably be a worthwhile gain.
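
That runtime check for the slice path could be as simple as this (slice_layout_checked is a hypothetical helper, just to show where the check would live):

use core::alloc::Layout;

// The slice path still needs a runtime size check, since the total size
// depends on `len`.
fn slice_layout_checked<T>(len: usize) -> Option<Layout> {
    let layout = Layout::array::<T>(len).ok()?;
    // Reject anything that would break the `size <= u32::MAX` invariant.
    if layout.size() <= u32::MAX as usize {
        Some(layout)
    } else {
        None
    }
}

fn main() {
    assert!(slice_layout_checked::<u64>(8).is_some());
    assert!(slice_layout_checked::<u64>(usize::MAX / 8).is_none());
}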

NB 2: Some of the above is a little approximate (maybe I'm conflating 4 GiB and 4 GiB - 1 in some places), but hopefully the general idea is clear enough.

Do you think this would work? And if so, would it be a worthwhile optimization?
