Fix use-after-free in IOCP ASIO system #5091
Merged
SeanTAllen merged 1 commit into main on Mar 30, 2026
5c05591 to 3795e99
Also testing with the ad-hoc Windows TCP open/close stress test against this branch:
2c7ea9b to b1554b4
Updated the fix, so I will kick off another ad-hoc stress test run:
Two related races in the IOCP token mechanism allowed the event to be freed while something still held a raw pointer to it.

Race 1 (callback vs destroy): pony_asio_event_destroy freed the event immediately via POOL_FREE regardless of whether IOCP callbacks were still in flight on thread pool threads. A callback could check the token's dead flag, see the event alive, and then have the event freed and pool-recycled before accessing it. Once the pool allocator recycles the freed memory, the callback reads garbage, and eventually the scheduler jumps through a corrupted function pointer (DEP violation).

Race 2 (message vs destroy): when a callback passes the dead check and calls pony_asio_event_send, the message carries a raw asio_event_t* pointer. The callback then releases its refcount via iocp_destroy, which can free the event if it's the last releaser after destroy marked the token dead. The message is now in the actor's queue with a dangling pointer. When the actor processes it, _event_notify dereferences freed memory.

Fix: the token refcount now tracks all outstanding references to the event, not just in-flight IOCP operations:

- The event itself: refcount starts at 1 (was 0). Destroy releases this reference instead of freeing the event directly.
- Each in-flight IOCP operation: incremented when posted, decremented when the callback completes (unchanged).
- Each in-flight message: incremented in pony_asio_event_send, decremented in handle_message after the behavior dispatch returns.

Whoever decrements the refcount to zero (whether destroy, the last callback, or the last message dispatch) frees both the event and the token. The event stays alive until every callback has finished and every message has been processed.

The previous token mechanism (dead flag + refcount) was designed to prevent exactly this class of bug but had TOCTOU flaws: checking dead and then accessing the event was not atomic, and the message pointer was not covered by the refcount at all.

Surfaced by the Windows TCP open/close stress test (run https://github.com/ponylang/ponyc/actions/runs/23730792545/job/69123934918), release-compiled + --ponynoblock only, ~1 in 10 daily runs.

Closes #5092
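For readers who want the ownership rule in concrete form, here is a minimal sketch of the "last releaser frees" scheme described above. It is not the ponyc implementation: `token_t`, `event_t`, `token_create`, `token_acquire`, `token_release`, `event_destroy` and `g_frees` are hypothetical names, and plain `malloc`/`free` stand in for the runtime's pool allocator.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the runtime's event and token types. */
typedef struct event_t { int payload; } event_t;

typedef struct token_t {
  event_t* event;
  atomic_bool dead;      /* set once destroy has been requested */
  atomic_uint refcount;  /* event itself + in-flight ops + in-flight messages */
} token_t;

int g_frees = 0;         /* counts frees so the sketch can be checked */

token_t* token_create(int payload) {
  token_t* t = malloc(sizeof(token_t));
  assert(t != NULL);
  t->event = malloc(sizeof(event_t));
  assert(t->event != NULL);
  t->event->payload = payload;
  atomic_init(&t->dead, false);
  atomic_init(&t->refcount, 1);  /* the event's own reference */
  return t;
}

/* Taken when an operation is posted or a message is sent. */
void token_acquire(token_t* t) {
  atomic_fetch_add_explicit(&t->refcount, 1, memory_order_relaxed);
}

/* Returns true if this call dropped the last reference and freed
 * both the event and the token. */
bool token_release(token_t* t) {
  if (atomic_fetch_sub_explicit(&t->refcount, 1, memory_order_acq_rel) == 1) {
    free(t->event);
    free(t);
    g_frees++;
    return true;
  }
  return false;
}

/* Destroy no longer frees directly: it marks the token dead and drops
 * the event's own reference like any other holder. */
bool event_destroy(token_t* t) {
  atomic_store_explicit(&t->dead, true, memory_order_release);
  return token_release(t);
}
```

In this model, calling `event_destroy` while an operation and a message are still outstanding only marks the token dead. The memory is released by whichever of the remaining holders calls `token_release` last.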
b1554b4 to d484c48
Two related use-after-free races in the IOCP token mechanism, both allowing the event to be freed while something still held a raw pointer to it.

Race 1 (callback vs destroy): `pony_asio_event_destroy` freed the event immediately via `POOL_FREE` regardless of whether IOCP callbacks were still in flight. A callback could check the token's dead flag, see the event alive, and then have the event freed and pool-recycled before accessing it.

Race 2 (message vs destroy, #5092): When a callback passes the dead check and sends a message to the owning actor, that message carries a raw `asio_event_t*` pointer. The callback then releases its refcount, which can free the event. The message is now in the actor's queue with a dangling pointer.

Fix: The token refcount now tracks all outstanding references to the event:

- The event itself (refcount starts at 1, released by destroy)
- Each in-flight IOCP operation (incremented when posted, decremented when the callback completes)
- Each in-flight message (incremented in `pony_asio_event_send`, decremented after the actor's behavior dispatch returns in `handle_message`)

Destroy marks the token dead and releases the event's own reference. Whoever decrements to zero frees both the event and the token.

Surfaced by the Windows TCP open/close stress test (run), release-compiled + `--ponynoblock` only.

Closes #5092
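The message-side reference is the less obvious half of the fix, so here is a small sketch of it in isolation. All names (`ev_t`, `msg_t`, `ev_create`, `ev_send`, `ev_release`, `dispatch`, `read_flags`, `g_ev_freed`) are hypothetical, `malloc`/`free` replace the pool allocator, and the queue is reduced to a single value. It only illustrates the ordering: the sender takes a reference for the queued pointer, and the receiver drops it after the behavior has returned.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical event with an embedded reference count. */
typedef struct ev_t {
  atomic_uint refcount;
  int flags;
} ev_t;

/* Hypothetical message carrying a raw pointer to the event. */
typedef struct msg_t { ev_t* ev; } msg_t;

bool g_ev_freed = false;  /* lets the sketch observe when the free happens */

ev_t* ev_create(int flags) {
  ev_t* ev = malloc(sizeof(ev_t));
  assert(ev != NULL);
  atomic_init(&ev->refcount, 1);  /* the event's own reference */
  ev->flags = flags;
  return ev;
}

/* Drop one reference; the last releaser frees the event. */
void ev_release(ev_t* ev) {
  if (atomic_fetch_sub(&ev->refcount, 1) == 1) {
    free(ev);
    g_ev_freed = true;
  }
}

/* Send side: take a reference on behalf of the queued message, so the
 * raw pointer inside it cannot dangle while the message is in flight. */
msg_t ev_send(ev_t* ev) {
  atomic_fetch_add(&ev->refcount, 1);
  return (msg_t){ .ev = ev };
}

/* Receive side: run the behavior first, release the message's
 * reference only after it has returned. */
int dispatch(msg_t m, int (*behavior)(ev_t*)) {
  int result = behavior(m.ev);
  ev_release(m.ev);
  return result;
}

/* Example behavior that dereferences the event. */
int read_flags(ev_t* ev) { return ev->flags; }
```

If destroy drops the event's own reference between `ev_send` and `dispatch`, the behavior still sees a live event, and the free happens inside `dispatch` once the behavior is done.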