Subscriber does not include locators for new interfaces in the list of subscriptions [14659] #2676

Open
@jsnykan

Description

Is there an already existing issue for this?

  • I have searched the existing issues

Expected behavior

When using the dynamic network interfaces feature (https://fast-dds.docs.eprosima.com/en/latest/fastdds/use_cases/dynamic_network_interfaces/dynamic_network_interfaces.html), the Fast DDS participant should successfully subscribe to and receive data from a publisher via a newly added network interface.

Current behavior

The Fast DDS participants discover each other, but the subscriber does not receive topic messages from the publisher, because the subscriber does not include a locator for the newly added interface in its list of subscriptions.

Steps to reproduce

Use the attached project (example_project.tar.gz).

example_project.tar.gz

Project description:

Publisher (pub executable) publishes one topic ("HelloWorldTopic") whose data is a single uint32_t, and writes data to this topic every 100 ms using a DataWriter. It also calls DomainParticipant::set_qos() every 100 ms.

Subscriber (sub executable) subscribes to one topic ("HelloWorldTopic", the same topic that pub publishes) and periodically tries to read topic data with a DataReader. If there is no data, it calls DomainParticipant::set_qos() and Subscriber::set_qos().
(Calling Subscriber::set_qos() is part of the hack I describe in the "Additional context" section. The issue occurs without that call as well.)

I have the following test setup:

Running publisher (pub executable) on computer1 (ip address 192.168.20.160)
Running subscriber (sub executable) on computer2 (ip address 192.168.20.12)
The network cables of computer1 and computer2 are connected to the same dumb switch

Steps to reproduce:

  1. run pub executable on computer1
  2. disconnect network cable from computer2
  3. run sub executable on computer2
  4. reconnect network cable to computer2
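
To confirm at step 3 that the address is really absent on computer2, and at step 4 that it has come back, the local IPv4 addresses can be listed with the POSIX getifaddrs() call. This is a minimal sketch, assuming a Linux host where unplugging the cable removes the address from the interface (as it does in this setup). The helper names local_ipv4_addresses and has_local_address are illustrative and are not part of the attached project.

```cpp
#include <arpa/inet.h>
#include <ifaddrs.h>
#include <netinet/in.h>
#include <sys/socket.h>

#include <string>
#include <vector>

// Returns the IPv4 addresses currently assigned to local interfaces,
// or an empty vector if the interface list cannot be read.
std::vector<std::string> local_ipv4_addresses()
{
    std::vector<std::string> result;
    ifaddrs* list = nullptr;
    if (getifaddrs(&list) != 0)
    {
        return result;
    }
    for (const ifaddrs* it = list; it != nullptr; it = it->ifa_next)
    {
        // Skip entries without an address and non-IPv4 entries.
        if (it->ifa_addr == nullptr || it->ifa_addr->sa_family != AF_INET)
        {
            continue;
        }
        const auto* sin = reinterpret_cast<const sockaddr_in*>(it->ifa_addr);
        char text[INET_ADDRSTRLEN] = {};
        if (inet_ntop(AF_INET, &sin->sin_addr, text, sizeof(text)) != nullptr)
        {
            result.emplace_back(text);
        }
    }
    freeifaddrs(list);
    return result;
}

// True if `address` (dotted-quad text) is currently assigned locally.
bool has_local_address(const std::string& address)
{
    for (const std::string& candidate : local_ipv4_addresses())
    {
        if (candidate == address)
        {
            return true;
        }
    }
    return false;
}
```

In this setup, has_local_address("192.168.20.12") would be expected to return false on computer2 while the cable is unplugged and true after it is reconnected.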

Note that if the roles are swapped, so that pub runs on computer2 and sub on computer1, and the same steps are followed (the network cable is disconnected from and reconnected to computer2), the feature seems to work as documented, with no modifications to the Fast DDS library and no extra Subscriber::set_qos() call.

Fast DDS version/commit

Same problem exists with both v2.6.0 and master (c8a9f19)

Platform/Architecture

Ubuntu Focal 20.04 amd64

Transport layer

UDPv4

Additional context

I captured network traffic while doing the steps listed in "steps to reproduce". This network capture is attached to this issue (tcpdump_without_fix.zip).

tcpdump_without_fix.zip

Based on that capture, it looks to me that computer2 (subscriber, 192.168.20.12) sends its participant info (DATA(p)) to computer1 (publisher, 192.168.20.160) with the IP address of the new interface (192.168.20.12) in the list of unicast locators (capture packet 2). However, when computer2 later sends its subscriptions (DATA(r)) to computer1, that IP address is missing from the list of unicast locators (capture packet 12). When I noticed this, I checked where those IP addresses might come from. It seems that, for example, ReaderProxyData::writeToCDRMessage() is called only at initialization time, so if computer2 has no interface with address 192.168.20.12 at that moment, the address is not written to the CDR message.
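
The suspected mechanism can be illustrated with a toy model: the participant-level locator list is rebuilt whenever the participant is re-announced, while each reader keeps the copy taken when it was created. This is only a sketch of the behavior seen in the capture. Every type and function in the stale_model namespace is hypothetical, none of it is Fast DDS code, and "127.0.0.1" is just a stand-in for whatever locator exists before the cable is plugged in.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

namespace stale_model {  // hypothetical toy model, not Fast DDS classes

using LocatorList = std::vector<std::string>;

struct Reader
{
    LocatorList data_r;  // unicast locators serialized into DATA(r)
};

struct Participant
{
    LocatorList interfaces;  // addresses currently available on the host
    LocatorList data_p;      // unicast locators serialized into DATA(p)
    std::vector<Reader> readers;

    // Participant announcements are rebuilt from the current interfaces.
    void announce_participant() { data_p = interfaces; }

    // Reader creation copies the locators that exist at that moment.
    void create_reader() { readers.push_back(Reader{interfaces}); }
};

bool contains(const LocatorList& list, const std::string& address)
{
    return std::find(list.begin(), list.end(), address) != list.end();
}

struct Observation
{
    bool in_data_p;  // new address present in the participant announcement
    bool in_data_r;  // new address present in the reader announcement
};

// Replays the reproduction steps on the subscriber side.
Observation replay_issue_scenario()
{
    const std::string new_address = "192.168.20.12";

    Participant sub;
    sub.interfaces = {"127.0.0.1"};  // cable unplugged: placeholder locator only
    sub.announce_participant();
    sub.create_reader();             // locators are snapshotted here

    sub.interfaces.push_back(new_address);  // cable reconnected
    sub.announce_participant();             // participant list is refreshed

    assert(sub.readers.size() == 1);
    return Observation{contains(sub.data_p, new_address),
                       contains(sub.readers.front().data_r, new_address)};
}

}  // namespace stale_model
```

In this model, replay_issue_scenario() reports the new address as present at participant level and absent at reader level, which matches what capture packets 2 and 12 show.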

Then I investigated what happens when DomainParticipant::set_qos() is called and noticed that, while RTPSParticipantImpl::update_attributes() updates a lot of things, it does not touch the objects in m_userWriterList and m_userReaderList. I added my own code that calls RTPSParticipantImpl::createAndAssociateReceiversWithEndPoint() for those objects, mimicking the way it is called when readers and writers are created. This seems to work if my test program (sub executable) also calls Subscriber::set_qos() in addition to DomainParticipant::set_qos(). I have attached my hack for the RTPSParticipantImpl class (eprosima-dynamic-networks-hack.zip) to this issue, together with a new tcpdump capture taken with the hack included (tcpdump_with_fix.zip). I am quite certain that my hack is not the correct way to fix this, but it seems to work at least with my simple test project.
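
The shape of the hack, and one possible reason why both calls are needed, can be sketched with a toy model. The assumption here, which the attached patch does not confirm, is that the patched participant update only refreshes the locators each reader is associated with, and that Subscriber::set_qos() is what causes the reader announcement (DATA(r)) to be sent again. All names in the hack_model namespace are hypothetical and do not correspond to Fast DDS classes.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

namespace hack_model {  // hypothetical toy model, not Fast DDS classes

using LocatorList = std::vector<std::string>;

struct Reader
{
    LocatorList locators;  // locators the reader is currently associated with
    LocatorList data_r;    // locators last sent to the remote side in DATA(r)
};

struct Participant
{
    LocatorList interfaces;  // addresses currently available on the host
    std::vector<Reader> readers;

    void create_reader() { readers.push_back(Reader{interfaces, interfaces}); }

    // Stand-in for the patched participant update: unlike the unpatched
    // version, it also walks the reader list and refreshes each reader.
    void update_attributes_with_hack()
    {
        for (Reader& reader : readers)
        {
            reader.locators = interfaces;
        }
    }

    // Stand-in for the ASSUMED effect of Subscriber::set_qos():
    // each reader re-sends its announcement with its current locators.
    void reannounce_readers()
    {
        for (Reader& reader : readers)
        {
            reader.data_r = reader.locators;
        }
    }
};

// True if the first reader's last DATA(r) contained `address`.
bool announces(const Participant& participant, const std::string& address)
{
    assert(!participant.readers.empty());
    const LocatorList& sent = participant.readers.front().data_r;
    return std::find(sent.begin(), sent.end(), address) != sent.end();
}

}  // namespace hack_model
```

Under this assumption, calling only reannounce_readers() re-sends the stale list, calling only update_attributes_with_hack() refreshes the reader without telling the publisher, and the new address reaches DATA(r) only when both run in that order.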

eprosima-dynamic-networks-hack.zip
tcpdump_with_fix.zip

XML configuration file

No response

Relevant log output

No response

Network traffic capture

No response

Labels

bug