
DDS-related processes crash after system time adjustment #3836

Open
@xjzer

Description


Is there an already existing issue for this?

  • I have searched the existing issues

Expected behavior

The DDS-related processes should keep running normally after the system time is adjusted.

Current behavior

1. After the controller powers up, the DDS-related processes start normally. When the system time is subsequently adjusted, the processes occasionally crash.
2. Analyzing the core files with gdb shows that in each case the program crashes after an exception is thrown from the destructor ~RTPSMessageGroup().
3. After removing the try/catch/throw from ~RTPSMessageGroup() and adding logging, the problem occurred a few more times. The modified code, the logs, and the gdb backtraces follow.
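For context on point 2: in C++ a destructor is implicitly noexcept unless it is declared noexcept(false), and an exception that leaves a destructor while another exception is already unwinding the stack calls std::terminate. The sketch below illustrates that general language rule only. `FlushOnDestroy` and both helper functions are hypothetical names, not Fast DDS code, and the sketch is not a diagnosis of this crash. It assumes C++17 for `std::uncaught_exceptions()`.

```cpp
#include <exception>
#include <iostream>
#include <stdexcept>

// Hypothetical stand-in for an object that flushes pending data in its destructor.
struct FlushOnDestroy
{
    explicit FlushOnDestroy(bool fail)
        : fail_(fail)
        , exceptions_at_ctor_(std::uncaught_exceptions())
    {
    }

    // noexcept(false) is required for the exception to leave the destructor at all.
    ~FlushOnDestroy() noexcept(false)
    {
        if (!fail_)
        {
            return;
        }
        if (std::uncaught_exceptions() > exceptions_at_ctor_)
        {
            // Another exception is unwinding the stack: throwing now would call std::terminate.
            std::cerr << "flush failed during unwinding; not throwing" << std::endl;
            return;
        }
        throw std::runtime_error("flush timed out");
    }

    bool fail_;
    int exceptions_at_ctor_;
};

// Returns true when the exception thrown by the destructor reaches the caller.
bool destructor_exception_reaches_caller()
{
    try
    {
        FlushOnDestroy guard(true);
    }
    catch (const std::runtime_error&)
    {
        return true;
    }
    return false;
}

// Returns true when the original exception survives because the destructor stayed silent.
bool suppressed_during_unwinding()
{
    try
    {
        FlushOnDestroy guard(true);
        throw std::logic_error("unrelated failure");
    }
    catch (const std::logic_error&)
    {
        return true;
    }
    catch (...)
    {
        return false;
    }
}
```

Both helpers are expected to return true. Without the `std::uncaught_exceptions()` check, the second case would terminate the process instead of reaching the catch block.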

  • Destructor with the try/catch/throw removed
RTPSMessageGroup::~RTPSMessageGroup() noexcept(false)
{
    //try
    //{
        send();
    //}
    //catch (...)
    //{
    //    if (!internal_buffer_)
    //   {
    //         participant_->return_send_buffer(std::move(send_buffer_));
    //     }
    //     throw;
    // }

    if (!internal_buffer_)
    {
        participant_->return_send_buffer(std::move(send_buffer_));
    }
}
  • Logging added around the send call
// Requires <chrono>, <ctime>, <iomanip>, <iostream>, and <typeinfo>.
// Sample both clocks before the send so a system time jump can be told apart from real elapsed time.
std::chrono::time_point<std::chrono::steady_clock> start_steady_clock = std::chrono::steady_clock::now();
std::chrono::time_point<std::chrono::system_clock> start_system_clock = std::chrono::system_clock::now();
if (!sender_->send(msgToSend, max_blocking_time_is_set_ ? max_blocking_time_point_
                                                        : (std::chrono::steady_clock::now() + std::chrono::hours(24))))
{
    std::chrono::time_point<std::chrono::steady_clock> end_steady_clock = std::chrono::steady_clock::now();
    std::chrono::time_point<std::chrono::system_clock> end_system_clock = std::chrono::system_clock::now();

    std::time_t start_c = std::chrono::system_clock::to_time_t(start_system_clock);
    std::time_t end_c = std::chrono::system_clock::to_time_t(end_system_clock);
    std::cerr << "max_blocking_time_is_set_ = " << max_blocking_time_is_set_ << std::endl;
    std::cerr
        << "max_blocking_time_point_ = "
        << std::chrono::duration_cast<std::chrono::milliseconds>(max_blocking_time_point_.time_since_epoch()).count()
        << std::endl;
    // Steady-clock readings in milliseconds since the clock's epoch.
    std::cerr << "start steady_clock = "
              << std::chrono::duration_cast<std::chrono::milliseconds>(start_steady_clock.time_since_epoch()).count()
              << ", end steady_clock = "
              << std::chrono::duration_cast<std::chrono::milliseconds>(end_steady_clock.time_since_epoch()).count()
              << std::endl;
    // Wall-clock readings as local date and time, followed by the millisecond fraction.
    std::cerr << "start system_clock = " << std::put_time(std::localtime(&start_c), "%F %T") << "."
              << std::setfill('0') << std::setw(3)
              << (std::chrono::duration_cast<std::chrono::milliseconds>(start_system_clock.time_since_epoch()) -
                  std::chrono::duration_cast<std::chrono::seconds>(start_system_clock.time_since_epoch()))
                     .count()
              << std::endl;

    std::cerr << "end system_clock = " << std::put_time(std::localtime(&end_c), "%F %T") << "." << std::setfill('0')
              << std::setw(3)
              << (std::chrono::duration_cast<std::chrono::milliseconds>(end_system_clock.time_since_epoch()) -
                  std::chrono::duration_cast<std::chrono::seconds>(end_system_clock.time_since_epoch()))
                     .count()
              << std::endl;

    // Identify the concrete sender type behind sender_.
    std::cerr << "== typeid(*sender_)" << typeid(*sender_).name() << std::endl;
    std::cout << "dynamic_cast<LocatorSelectorSender *> " << dynamic_cast<LocatorSelectorSender *>(sender_)
              << std::endl;
    std::cout << "dynamic_cast<ReaderLocator *> " << dynamic_cast<ReaderLocator *>(sender_) << std::endl;
    std::cout << "dynamic_cast<DirectMessageSender *> " << dynamic_cast<DirectMessageSender *>(sender_) << std::endl;
    std::cout << "dynamic_cast<WriterProxy *> " << dynamic_cast<WriterProxy *>(sender_) << std::endl;
    throw timeout();
}
  • log1
    image

  • log2
    image

  • gdb bt1
    image

  • gdb bt2
    image

  • gdb bt3
    image
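The logging code above passes send() a deadline expressed on std::chrono::steady_clock, falling back to "now + 24 hours" when no maximum blocking time is set. As a minimal sketch of that pattern, the helpers below compute such a deadline and the time left until it. `make_deadline`, `deadline_expired`, and `remaining` are hypothetical names introduced for illustration, not part of Fast DDS. The current time is passed in as a parameter so the arithmetic can be checked with fixed values.

```cpp
#include <chrono>

using SteadyPoint = std::chrono::steady_clock::time_point;

// Hypothetical helper: use the configured deadline if one is set,
// otherwise fall back to 24 hours after `now`.
SteadyPoint make_deadline(bool is_set, SteadyPoint configured, SteadyPoint now)
{
    return is_set ? configured : now + std::chrono::hours(24);
}

// Hypothetical helper: true once `now` has reached or passed the deadline.
bool deadline_expired(SteadyPoint deadline, SteadyPoint now)
{
    return now >= deadline;
}

// Hypothetical helper: time left until the deadline, clamped at zero.
std::chrono::milliseconds remaining(SteadyPoint deadline, SteadyPoint now)
{
    if (now >= deadline)
    {
        return std::chrono::milliseconds(0);
    }
    return std::chrono::duration_cast<std::chrono::milliseconds>(deadline - now);
}
```

In real code `now` would come from `std::chrono::steady_clock::now()`. The standard specifies that steady_clock is monotonic and is not adjusted, so a deadline computed this way does not move when the wall clock is changed. Whether every wait inside the send path is measured against that same clock is exactly what the log output is meant to help establish.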

Steps to reproduce

  1. Power up the controller and let the DDS-related processes start normally, then adjust the system time. The processes occasionally crash at that point.
  2. The problem is intermittent; no reliable steps to reproduce it have been found.
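Because the crash is intermittent, it may help to record when a system time adjustment actually took place relative to a failure. The sketch below is one possible way to do that, under the assumption that comparing elapsed wall-clock time with elapsed monotonic time is good enough for this purpose. `ClockSample`, `sample_clocks`, and `wall_clock_jump` are hypothetical names, not Fast DDS APIs.

```cpp
#include <chrono>

// Hypothetical pair of readings taken at (nearly) the same moment.
struct ClockSample
{
    std::chrono::system_clock::time_point wall;
    std::chrono::steady_clock::time_point mono;
};

// Reads both clocks back to back.
ClockSample sample_clocks()
{
    return ClockSample{std::chrono::system_clock::now(), std::chrono::steady_clock::now()};
}

// Estimates how far the wall clock was moved between two samples:
// wall-clock elapsed time minus monotonic elapsed time.
// Positive means the system time jumped forward, negative means backward.
std::chrono::milliseconds wall_clock_jump(const ClockSample& before, const ClockSample& after)
{
    const auto wall_elapsed =
        std::chrono::duration_cast<std::chrono::milliseconds>(after.wall - before.wall);
    const auto mono_elapsed =
        std::chrono::duration_cast<std::chrono::milliseconds>(after.mono - before.mono);
    return wall_elapsed - mono_elapsed;
}
```

Taking one sample before the send and another after it fails, and logging `wall_clock_jump(before, after).count()`, would show directly whether a time adjustment happened during the blocked call. Small non-zero values are expected even without an adjustment, since the two clocks are not read atomically.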

Fast DDS version/commit

2.5.0

Platform/Architecture

Other. Please specify in Additional context section.

Transport layer

Default configuration, UDPv4 & SHM

Additional context

  1. aarch64
  2. Linux
  3. UDP Transport

XML configuration file

use fastddsgen

Relevant log output

No response

Network traffic capture

No response

Labels

bug (Issue to report a bug)
