Publisher / DataWriter write() blocks when subscriber joins a topic. max_blocking_time is violated despite STRICT_REALTIME. #3566

Open
@briansoe66

Is there an already existing issue for this?

  • I have searched the existing issues

Expected behavior

A publisher (DataWriter) configured with non-blocking QoS, in a build compiled with STRICT_REALTIME defined, should block for at most max_blocking_time during write().

Current behavior

A publisher (DataWriter) configured with non-blocking QoS, in a build compiled with STRICT_REALTIME defined, blocks for up to 20 ms during write() when a new subscriber joins the topic.

Steps to reproduce

  • compile with -DSTRICT_REALTIME
  • configure the Publisher / DataWriter with max_blocking_time
  • create a Subscriber
  • continuously call DataWriter::write(), and record the execution time of each call
  • create a new Subscriber while the Publisher / DataWriter is running

eprosima::fastdds::dds::Publisher* publisher = my_participant->create_publisher(eprosima::fastdds::dds::PUBLISHER_QOS_DEFAULT);
// Start from the publisher's default DataWriter QoS (the writer does not exist yet).
auto qos = publisher->get_default_datawriter_qos();
qos.history().kind = eprosima::fastrtps::KEEP_LAST_HISTORY_QOS;
qos.endpoint().history_memory_policy = eprosima::fastrtps::rtps::PREALLOCATED_MEMORY_MODE;
qos.reliability().kind = eprosima::fastrtps::BEST_EFFORT_RELIABILITY_QOS;
// Upper bound on how long write() may block.
qos.reliability().max_blocking_time.seconds = 0;
qos.reliability().max_blocking_time.nanosec = 50000;  // 50 microseconds
qos.durability().kind = eprosima::fastrtps::VOLATILE_DURABILITY_QOS;
qos.publish_mode().kind = eprosima::fastrtps::ASYNCHRONOUS_PUBLISH_MODE;
qos.data_sharing().on("");
eprosima::fastdds::dds::DataWriter* writer = publisher->create_datawriter(my_topic, qos);
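
For the "record execution time" step, something like the following could be used. It is only a minimal sketch and not part of Fast DDS: max_call_duration is a hypothetical helper name, and the callable stands in for the actual write() call.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <functional>

// Hypothetical helper (not a Fast DDS API): runs `call` `iterations` times
// and returns the longest single invocation, measured with a monotonic clock.
std::chrono::nanoseconds max_call_duration(
        const std::function<void()>& call,
        std::size_t iterations)
{
    assert(call && "call must be a non-empty callable");
    using clock = std::chrono::steady_clock;

    std::chrono::nanoseconds worst{0};
    for (std::size_t i = 0; i < iterations; ++i)
    {
        const auto start = clock::now();
        call();  // placeholder for the operation under test
        const auto elapsed =
                std::chrono::duration_cast<std::chrono::nanoseconds>(clock::now() - start);
        worst = std::max(worst, elapsed);
    }
    return worst;
}
```

In the reproduction this would be called as max_call_duration([&] { writer->write(&sample); }, N), where sample and N are placeholders, and the result compared against the configured 50 microseconds while a new subscriber is created.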

Fast DDS version/commit

2.6.X

Platform/Architecture

Other. Please specify in Additional context section.

Transport layer

UDPv4, Shared Memory Transport (SHM), Zero copy

Additional context

Platform: QNX7

This may be caused by FlowControllerImpl::enqueue_new_sample_impl() not honoring max_blocking_time. The deadline parameter is currently unnamed and marked as a TODO:
https://github.com/eProsima/Fast-DDS/blob/master/src/cpp/rtps/flowcontrol/FlowControllerImpl.hpp

enqueue_new_sample_impl(
            fastrtps::rtps::RTPSWriter* writer,
            fastrtps::rtps::CacheChange_t* change,
            const std::chrono::time_point<std::chrono::steady_clock>& /* TODO max_blocking_time*/)

Would replacing the plain mutex lock with a RecursiveTimedMutex and try_lock_until() fix this issue?

FlowControllerImpl::enqueue_new_sample_impl(
        fastrtps::rtps::RTPSWriter* writer,
        fastrtps::rtps::CacheChange_t* change,
        const std::chrono::time_point<std::chrono::steady_clock>& max_blocking_time)
{
    assert(!change->writer_info.is_linked.load());
    // Sync delivery failed. Try to store for asynchronous delivery.
    std::unique_lock<RecursiveTimedMutex> lock(async_mode.changes_interested_mutex, std::defer_lock);
    if (!lock.try_lock_until(max_blocking_time))
    {
        // Deadline passed without the lock: report failure instead of claiming the sample was queued.
        return false;
    }
    sched.add_new_sample(writer, change);
    async_mode.cv.notify_one();
    return true;
}
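
To show the behavior I am asking about in isolation, here is a self-contained sketch using only the standard library. It is not Fast DDS code: PendingSamples, try_enqueue and contended_enqueue_times_out are hypothetical names, std::recursive_timed_mutex stands in for RecursiveTimedMutex, and an int stands in for the queued change.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <deque>
#include <mutex>
#include <thread>

// Hypothetical stand-in for the flow controller's queue of pending samples.
struct PendingSamples
{
    std::recursive_timed_mutex mutex;  // plays the role of RecursiveTimedMutex
    std::deque<int> samples;           // plays the role of the queued changes

    // Enqueue only if the lock is obtained before `deadline`.
    // Returns false on timeout, so the caller never waits past the deadline.
    bool try_enqueue(int sample, std::chrono::steady_clock::time_point deadline)
    {
        std::unique_lock<std::recursive_timed_mutex> lock(mutex, std::defer_lock);
        if (!lock.try_lock_until(deadline))
        {
            return false;
        }
        samples.push_back(sample);
        return true;
    }
};

// Holds the mutex on a second thread and checks that an enqueue attempt
// with a 5 ms deadline gives up instead of blocking until the lock is free.
bool contended_enqueue_times_out()
{
    PendingSamples pending;
    std::atomic<bool> held{false};
    std::atomic<bool> release{false};

    std::thread holder([&]
            {
                std::lock_guard<std::recursive_timed_mutex> guard(pending.mutex);
                held = true;
                while (!release)
                {
                    std::this_thread::sleep_for(std::chrono::milliseconds(1));
                }
            });
    while (!held)
    {
        std::this_thread::yield();
    }

    const auto deadline = std::chrono::steady_clock::now() + std::chrono::milliseconds(5);
    const bool enqueued = pending.try_enqueue(42, deadline);

    release = true;
    holder.join();
    assert(pending.samples.empty() == !enqueued);
    return !enqueued;
}
```

With this pattern the caller gets false back when the deadline expires and can decide what to do with the sample, rather than waiting for whichever thread currently holds the lock.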

XML configuration file

No response

Relevant log output

No response

Network traffic capture

No response

    Labels

    bug (Issue to report a bug)
