Member

@rwatson commented Dec 19, 2025

This is a very early draft of requirements / guidance for memory-allocator APIs, for discussion / feedback / improvement.

It might be that we want to broaden it a little bit to also mention stack allocation using alloca(), and to add some specific words that better enable garbage collection / allow GC-enabled heap allocators to conform. And, more generally, to consider allocators that are not the C standard allocator.

@gvnn3 left a comment

Well, I at least think I understand it. I'd like to get this in soon so I can share it with an external consumer.

* Is unsealed
* Has bounds that permit access to the full requested range of the allocation
* Has bounds that do not permit access to any other current allocation,
implementing non-aliasing spatial safety
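
For illustration, the three properties quoted above can be modelled in plain C. This is only a sketch under stated assumptions: `struct toy_cap`, `ranges_overlap()` and `toy_cap_ok()` are hypothetical names, not CHERI types or intrinsics, and the check considers just one other live allocation.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model of a capability. NOT a real CHERI type; for discussion only. */
struct toy_cap {
    uintptr_t base;    /* lowest address the capability authorizes */
    size_t    length;  /* number of bytes authorized, starting at base */
    uintptr_t address; /* the integer address (cursor) */
    bool      sealed;  /* a sealed capability cannot be dereferenced */
};

/* True if [a, a+alen) and [b, b+blen) share at least one byte.
 * Address wrap-around is ignored in this toy model. */
static bool ranges_overlap(uintptr_t a, size_t alen, uintptr_t b, size_t blen)
{
    return a < b + blen && b < a + alen;
}

/* Check the quoted properties for a capability returned for a request of
 * `requested` bytes, against the bounds of one other live allocation. */
static bool toy_cap_ok(struct toy_cap c, size_t requested, struct toy_cap other)
{
    if (c.sealed)
        return false;                       /* must be unsealed */
    if (c.address < c.base || c.address - c.base > c.length)
        return false;                       /* address must lie within bounds */
    if (c.length - (c.address - c.base) < requested)
        return false;                       /* full requested range reachable */
    if (ranges_overlap(c.base, c.length, other.base, other.length))
        return false;                       /* must not reach another allocation */
    return true;
}
```

A real allocator would need the non-overlap property to hold against every current allocation, and (per the discussion below) against allocator metadata as well.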
Member

"This includes any allocator metadata about the current allocation" or such, perhaps?

Member Author

Now added.

Member Author

I've added this, but generalised it in a couple of ways: (1) on realloc(), make sure to clear newly available memory, and (2) don't allow access to any allocator metadata, whether for the current allocation or otherwise. Please check that you are happy with the revised words!
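
As a rough illustration of the realloc() point, here is a minimal sketch of clearing memory that becomes newly available when an allocation grows. `realloc_zeroed()` is a hypothetical wrapper, not part of the draft; it assumes the caller tracks the old size (standard C has no portable way to query it) and that `new_size` is nonzero.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical wrapper: grow or shrink an allocation and zero any bytes that
 * become newly available. The caller passes old_size (0 if ptr is NULL).
 * new_size is assumed to be nonzero. */
static void *realloc_zeroed(void *ptr, size_t old_size, size_t new_size)
{
    void *p = realloc(ptr, new_size);
    if (p == NULL)
        return NULL; /* on failure the original allocation is left untouched */
    if (new_size > old_size)
        memset((char *)p + old_size, 0, new_size - old_size);
    return p;
}
```

An allocator that implements this internally would use its own record of the old size rather than trusting a caller-supplied value.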


The allocator may implement fail-stop semantics if the call fails for one or
more of the above reasons.

Member

It is probably safe to leave up to the allocator whether a nonzero offset between address and base is acceptable.

The CHERIoT allocator overloads free() (well, heap_free()) to permit dropping claims on an object, and in that case it accepts interior, subset, and/or attenuated pointers. That is, claims may be (taken and) dropped on subobjects (or, really, arbitrary slices of objects), but deallocation requires the original pointer (à la C).

Member Author

Regarding the former: Are there use cases for non-zero offsets in heap allocators we are aware of beyond, perhaps, "deterministically trap on overflow rather than on underflow when imprecise bounds require padding"?


The allocator must not return a capability that:

* Has the same integer address as the passed argument but different bounds
Member

More generally, it may not overlap the passed-in object (it's equally bad to move the base down as it is to move the limit up). This probably deserves some expansion: we don't even allow returning the same object in the case of a realloc-to-be-smaller, because we want to treat each realloc result as having its own lifetime, as if we'd moved the object, regardless of whether we actually moved it.

But I think we actually don't want this constraint in a CHERI+MTE world: I think we can avoid moving the object so long as reallocation is within the original bounds (that is, we can make it smaller then take it back to its original size) (ncolors-1)/ncolors times by repainting, which is what we'd do if we were to move the object anyway.

Member Author

Part of the reason for adding this limitation was also about the fact that we define pointer comparison as considering only integer addresses, so if you return a pointer with different bounds, the caller cannot use an equality check to differentiate the two, and in common use will likely use the original pointer and not the one with modified bounds. Clearly we need to add some rationale notes, but does that change your view on the CHERI+MTE case?
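
To make the comparison hazard concrete, here is a small plain-C model. It is a sketch only: `struct model_cap`, `model_cap_eq()` and `model_cap_eq_exact()` are hypothetical names chosen for illustration, not CHERI C definitions. It shows a realloc-style result with the same address but narrower bounds comparing equal to the original under address-only equality.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-in for a capability; NOT a real CHERI type. */
struct model_cap {
    uintptr_t address; /* integer address */
    uintptr_t base;    /* lower bound */
    size_t    length;  /* bytes authorized from base */
};

/* Equality as described in the comment above: integer addresses only. */
static bool model_cap_eq(struct model_cap a, struct model_cap b)
{
    return a.address == b.address;
}

/* Stricter comparison that also looks at bounds, shown only for contrast. */
static bool model_cap_eq_exact(struct model_cap a, struct model_cap b)
{
    return a.address == b.address && a.base == b.base && a.length == b.length;
}
```

With `old = {0x1000, 0x1000, 64}` and `shrunk = {0x1000, 0x1000, 32}`, `model_cap_eq(old, shrunk)` is true even though the bounds differ, so a caller testing `new == old` cannot tell which capability it holds.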

The allocator must not:

* Reuse storage associated with the allocation until there are no outstanding
valid capabilities that authorize access to the memory
Member

I'm a bit surprised to see this separately from the revocation point below?

Member Author

My thought was that I wanted to avoid an assumption of revocation in the 'must' / 'must not' text, and simply state the invariants, leaving open the possibility of other techniques than revocation -- perhaps GC-related ones. I guess this is in the free() section, which is hardly a GC-centric API. But does the general argument for putting the invariant rather than the mechanism work from your perspective -- and, if so, is it the right invariant?
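
One way to picture the invariant independently of mechanism is a toy heap in which freed storage is quarantined until some unspecified process has established that no valid capabilities remain. This is a sketch under stated assumptions: `toy_heap`, `toy_alloc()`, `toy_free()` and `toy_sweep()` are hypothetical, and `toy_sweep()` merely stands in for revocation, GC, or any other technique.

```c
#include <stddef.h>

#define NSLOTS 4

enum slot_state { SLOT_FREE, SLOT_LIVE, SLOT_QUARANTINED };

/* Toy heap of fixed-size slots; only the per-slot state is modelled. */
struct toy_heap {
    enum slot_state state[NSLOTS];
};

/* Hand out the first slot that is safe to reuse, or -1 if there is none. */
static int toy_alloc(struct toy_heap *h)
{
    for (int i = 0; i < NSLOTS; i++) {
        if (h->state[i] == SLOT_FREE) {
            h->state[i] = SLOT_LIVE;
            return i;
        }
    }
    return -1;
}

/* Freeing does not make storage reusable; it only quarantines it. */
static void toy_free(struct toy_heap *h, int i)
{
    if (i >= 0 && i < NSLOTS && h->state[i] == SLOT_LIVE)
        h->state[i] = SLOT_QUARANTINED;
}

/* Placeholder for whatever establishes that no valid capabilities to
 * quarantined storage remain (revocation, GC, ...). Assumed to succeed. */
static void toy_sweep(struct toy_heap *h)
{
    for (int i = 0; i < NSLOTS; i++) {
        if (h->state[i] == SLOT_QUARANTINED)
            h->state[i] = SLOT_FREE;
    }
}
```

The invariant is that `toy_alloc()` never returns a quarantined slot; how `toy_sweep()` earns the right to release it is left open, which is the point of stating the invariant rather than the mechanism.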

The allocator may:

* Fill reachable memory within the bounds of the allocation with zeroes after
it has been freed
Member

FWIW, if revocation is not prompt (CHERIoT, MTE), these zeros are advisory (and the allocator should re-zero memory before returning it from malloc).
