
Fix use-after-free in IOCP ASIO system #5091

Merged
SeanTAllen merged 1 commit into main from fix-iocp-token-toctou on Mar 30, 2026

@SeanTAllen (Member) commented Mar 30, 2026

This PR fixes two related use-after-free races in the IOCP token mechanism. Both allowed the event to be freed while something still held a raw pointer to it.

Race 1 — callback vs destroy: pony_asio_event_destroy freed the event immediately via POOL_FREE regardless of whether IOCP callbacks were still in flight. A callback could check the token's dead flag, see the event alive, and then have the event freed and pool-recycled before accessing it.

Race 2 — message vs destroy (#5092): When a callback passes the dead check and sends a message to the owning actor, that message carries a raw asio_event_t* pointer. The callback then releases its refcount, which can free the event. The message is now in the actor's queue with a dangling pointer.

Fix: The token refcount now tracks all outstanding references to the event:

  • The event itself (refcount starts at 1 instead of 0)
  • Each in-flight IOCP operation (incremented on post, decremented on callback completion)
  • Each in-flight message (incremented in pony_asio_event_send, decremented after the actor's behavior dispatch returns in handle_message)

Destroy marks the token dead and releases the event's own reference. Whoever decrements to zero frees both the event and the token.
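
For readers who want the shape of the scheme without opening the diff, here is a minimal sketch of the three kinds of reference sharing one count. It is not the runtime code: the `sketch_*` names and types are hypothetical, it uses portable C11 atomics and `malloc`/`free` in place of the runtime's own atomics and pool allocator, and `sketch_frees` exists only so the behaviour can be checked.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

// Hypothetical stand-ins for the runtime's event and token types.
typedef struct sketch_event_t
{
  int payload;
} sketch_event_t;

typedef struct sketch_token_t
{
  sketch_event_t* ev;
  atomic_bool dead;     // set by destroy; late callbacks skip their work
  atomic_uint refcount; // event's own ref + in-flight ops + in-flight messages
} sketch_token_t;

// Counts frees so the sketch can be checked (single-threaded use only).
static int sketch_frees = 0;

// Creates an event and its token; the count starts at 1 for the event itself.
sketch_token_t* sketch_token_create(void)
{
  sketch_token_t* t = malloc(sizeof(*t));
  sketch_event_t* ev = malloc(sizeof(*ev));
  if(t == NULL || ev == NULL)
  {
    free(t);
    free(ev);
    return NULL;
  }
  ev->payload = 0;
  t->ev = ev;
  atomic_init(&t->dead, false);
  atomic_init(&t->refcount, 1);
  return t;
}

// Called when an operation is posted or a message is sent.
// The caller must already hold a reference.
void sketch_token_acquire(sketch_token_t* t)
{
  atomic_fetch_add_explicit(&t->refcount, 1, memory_order_relaxed);
}

// Drops one reference. Returns true if this call freed the event and token.
bool sketch_token_release(sketch_token_t* t)
{
  unsigned prev =
    atomic_fetch_sub_explicit(&t->refcount, 1, memory_order_acq_rel);
  assert(prev > 0);
  if(prev != 1)
    return false;

  free(t->ev);
  free(t);
  sketch_frees++;
  return true;
}

// Destroy marks the token dead and gives up the event's own reference.
void sketch_event_destroy(sketch_token_t* t)
{
  atomic_store_explicit(&t->dead, true, memory_order_release);
  sketch_token_release(t);
}
```

In this sketch, calling `sketch_event_destroy` while an operation and a message are still outstanding frees nothing. The memory goes away only when the last of those two references is released.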

Surfaced by the Windows TCP open/close stress test (run: https://github.com/ponylang/ponyc/actions/runs/23730792545/job/69123934918), in the release-compiled + --ponynoblock configuration only.

Closes #5092

@SeanTAllen added the "changelog - fixed" label (Automatically add "Fixed" CHANGELOG entry on merge) Mar 30, 2026
@ponylang-main added the "discuss during sync" label (Should be discussed during an upcoming sync) Mar 30, 2026
@SeanTAllen force-pushed the fix-iocp-token-toctou branch from 5c05591 to 3795e99 on March 30, 2026 14:39
@SeanTAllen (Member, Author) commented

Also testing this branch with an ad-hoc Windows TCP open/close stress run:

https://github.com/ponylang/ponyc/actions/runs/23752133559

@SeanTAllen changed the title from "Fix rare crash in Windows TCP stress tests caused by IOCP use-after-free" to "Fix use-after-free in IOCP ASIO system" Mar 30, 2026
@SeanTAllen force-pushed the fix-iocp-token-toctou branch 3 times, most recently from 2c7ea9b to b1554b4 on March 30, 2026 15:51
@SeanTAllen (Member, Author) commented Mar 30, 2026

I updated the fix, so I'm kicking off another ad-hoc stress test run:

https://github.com/ponylang/ponyc/actions/runs/23756220974

Two related races in the IOCP token mechanism allowed the event to be
freed while something still held a raw pointer to it.

Race 1 — callback vs destroy: pony_asio_event_destroy freed the event
immediately via POOL_FREE regardless of whether IOCP callbacks were
still in flight on thread pool threads. A callback could check the
token's dead flag, see the event alive, and then have the event freed
and pool-recycled before accessing it. The freed memory gets recycled
by the pool allocator, the callback reads garbage, and eventually the
scheduler jumps through a corrupted function pointer (DEP violation).

Race 2 — message vs destroy: when a callback passes the dead check and
calls pony_asio_event_send, the message carries a raw asio_event_t*
pointer. The callback then releases its refcount via iocp_destroy,
which can free the event if it's the last releaser after destroy marked
the token dead. The message is now in the actor's queue with a dangling
pointer. When the actor processes it, _event_notify dereferences freed
memory.

Fix: the token refcount now tracks all outstanding references to the
event, not just in-flight IOCP operations:

- The event itself: refcount starts at 1 (was 0). Destroy releases
  this reference instead of freeing the event directly.
- Each in-flight IOCP operation: incremented when posted, decremented
  when the callback completes (unchanged).
- Each in-flight message: incremented in pony_asio_event_send,
  decremented in handle_message after the behavior dispatch returns.

Whoever decrements the refcount to zero (whether destroy, the last
callback, or the last message dispatch) frees both the event and the
token. The event stays alive until every callback has finished and
every message has been processed.
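
The message leg of the fix can be illustrated with a toy, single-threaded model. This is a sketch under stated assumptions, not the runtime's code: every `demo_*` name is hypothetical, the "actor queue" is a small ring buffer, and `malloc`/`free` stand in for the pool allocator. The point it shows is the ordering: the send side takes a reference before the raw pointer enters the queue, and the receive side releases it only after the behavior has returned.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

// Hypothetical event; the real asio_event_t has many more fields.
typedef struct demo_event_t
{
  atomic_uint refcount;
  atomic_bool dead;
} demo_event_t;

// A queued message carries a raw pointer, so it must own a reference.
typedef struct demo_msg_t
{
  demo_event_t* ev;
} demo_msg_t;

#define DEMO_QUEUE_CAP 8
static demo_msg_t demo_queue[DEMO_QUEUE_CAP]; // toy single-threaded queue
static size_t demo_head = 0;
static size_t demo_tail = 0;
static int demo_dispatched = 0; // behaviors run so far
static int demo_freed = 0;      // events freed so far

demo_event_t* demo_event_create(void)
{
  demo_event_t* ev = malloc(sizeof(*ev));
  if(ev == NULL)
    return NULL;
  atomic_init(&ev->refcount, 1); // the event's own reference
  atomic_init(&ev->dead, false);
  return ev;
}

// Drops one reference and frees the event if it was the last one.
static void demo_release(demo_event_t* ev)
{
  unsigned prev = atomic_fetch_sub(&ev->refcount, 1);
  assert(prev > 0);
  if(prev == 1)
  {
    free(ev);
    demo_freed++;
  }
}

// Send side: take a reference before the pointer enters the queue.
bool demo_event_send(demo_event_t* ev)
{
  if(demo_tail - demo_head == DEMO_QUEUE_CAP)
    return false;
  atomic_fetch_add(&ev->refcount, 1);
  demo_queue[demo_tail % DEMO_QUEUE_CAP].ev = ev;
  demo_tail++;
  return true;
}

// Destroy marks the event dead and releases the event's own reference.
void demo_event_destroy(demo_event_t* ev)
{
  atomic_store(&ev->dead, true);
  demo_release(ev);
}

// Stand-in for the actor's behavior: it dereferences the event pointer.
static void demo_behavior(demo_event_t* ev)
{
  (void)atomic_load(&ev->dead);
  demo_dispatched++;
}

// Receive side: release only after the behavior has returned.
bool demo_handle_message(void)
{
  if(demo_head == demo_tail)
    return false;
  demo_event_t* ev = demo_queue[demo_head % DEMO_QUEUE_CAP].ev;
  demo_head++;
  demo_behavior(ev);
  demo_release(ev);
  return true;
}
```

With this ordering, a destroy that lands between `demo_event_send` and `demo_handle_message` leaves the event allocated, and the dispatch that drains the last message is the one that frees it.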

The previous token mechanism (dead flag + refcount) was designed to
prevent exactly this class of bug but had TOCTOU flaws: checking dead
and then accessing the event was not atomic, and the message pointer
was not covered by the refcount at all.
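
The check-then-use window can be replayed deterministically in one thread. The sketch below is a model only: "freed" is a flag rather than a real free, so the replay never touches released memory, and all names are hypothetical. `old_destroy` follows the description of the previous behaviour (free immediately), while `new_destroy` and `new_release` follow the refcount rule described above.

```c
#include <assert.h>
#include <stdbool.h>

// Hypothetical model of an event; 'freed' is a flag, not a real free.
typedef struct model_event_t
{
  bool dead;
  bool freed;
  unsigned refcount;
} model_event_t;

// Old scheme as described: destroy frees at once, ignoring in-flight callbacks.
static void old_destroy(model_event_t* ev)
{
  ev->dead = true;
  ev->freed = true;
}

// New scheme: a release frees only when it drops the last reference.
static void new_release(model_event_t* ev)
{
  assert(ev->refcount > 0);
  if(--ev->refcount == 0)
    ev->freed = true;
}

static void new_destroy(model_event_t* ev)
{
  ev->dead = true;
  new_release(ev);
}

// Replays: callback checks dead, destroy runs, callback touches the event.
// Returns true if the callback's access would hit freed memory.
bool replay_old_scheme(void)
{
  model_event_t ev = { .dead = false, .freed = false, .refcount = 0 };
  bool saw_alive = !ev.dead;     // callback: time of check
  old_destroy(&ev);              // destroy wins the race here
  return saw_alive && ev.freed;  // callback: time of use
}

bool replay_new_scheme(void)
{
  model_event_t ev = { .dead = false, .freed = false, .refcount = 1 };
  ev.refcount++;                 // in-flight IOCP operation posted
  bool saw_alive = !ev.dead;     // callback: time of check
  new_destroy(&ev);              // drops only the event's own reference
  bool use_after_free = saw_alive && ev.freed;
  new_release(&ev);              // callback completes and frees the event
  assert(ev.freed);
  return use_after_free;
}
```

Under the same interleaving, the old model reports an access to freed memory and the new one does not, because the callback's reference keeps the event alive until the callback has finished.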

Surfaced by the Windows TCP open/close stress test (run
https://github.com/ponylang/ponyc/actions/runs/23730792545/job/69123934918)
— release-compiled + --ponynoblock only, ~1 in 10 daily runs.

Closes #5092
@SeanTAllen force-pushed the fix-iocp-token-toctou branch from b1554b4 to d484c48 on March 30, 2026 16:25
@SeanTAllen merged commit acbeb7e into main Mar 30, 2026
16 checks passed
@SeanTAllen deleted the fix-iocp-token-toctou branch March 30, 2026 17:37
@ponylang-main removed the "discuss during sync" label (Should be discussed during an upcoming sync) Mar 30, 2026
github-actions bot pushed a commit that referenced this pull request Mar 30, 2026
github-actions bot pushed a commit that referenced this pull request Mar 30, 2026

Labels

changelog - fixed (Automatically add "Fixed" CHANGELOG entry on merge)

Development

Successfully merging this pull request may close these issues.

IOCP late message can deliver dangling event pointer to actor after event freed

2 participants