Race condition: assertion failure in stream_engine_base.cpp: out_event() when _io_error is set #4841

@p4l1ly

Description

Problem:
With ZMQ_HEARTBEAT_IVL and ZMQ_HEARTBEAT_TIMEOUT enabled (e.g. ROUTER/DEALER over TCP), the process can crash with:

Assertion failed: !_io_error (src/stream_engine_base.cpp:316)

Cause: in_event_internal() sets _io_error = true and removes the fd from the poll set when the receive pipe hits backpressure (e.g. RCVHWM) or on other input-stop paths. The I/O thread’s poller can still deliver a POLLOUT event and call out_event() before the engine is torn down. out_event() asserts !_io_error, so the process aborts. This is a race between teardown and a stale or speculative out_event() callback.

Reproduction is more likely when the application stops reading from the socket (e.g. under load or in a “stuck” state): the receive pipe fills, backpressure sets _input_stopped then _io_error, and the poller may still invoke out_event().

Solution:
In stream_engine_base.cpp, in out_event(), replace the assert with an early return so that when _io_error is already set we no-op and let teardown proceed:

void zmq::stream_engine_base_t::out_event ()
{
    if (_io_error)
        return;
    // ... rest unchanged
}

The line zmq_assert (!_io_error); is removed. Whenever _io_error is true, the rest of out_event() should not run; the assert encoded an invariant that this race violates.

Environment:

Steps to reproduce:
See #4364 (PUB/SUB with small heartbeat timeout). In our case: ROUTER with ZMQ_HEARTBEAT_IVL and ZMQ_HEARTBEAT_TIMEOUT set; multiple DEALER clients; stop calling recv on the ROUTER for several seconds (simulating overload). The receive pipe hits HWM, backpressure triggers the path that sets _io_error, and the assertion in out_event() can fire.

Expected result:
No crash. When _io_error is set, out_event() should return immediately; teardown continues without aborting.
