
Dragonfly crash during replication: v1.30.0 #5135

Open
@andydunstall

Dragonfly crashed during replication on v1.30.0:

  • Environment: staging
  • Datastore ID: dst_u1xe53xd3
  • Node ID: node_plg7e8tdu
1747226665146	2025-05-14T12:44:25.146Z	PC: @     0x7ffff7c9eb2c  (unknown)  pthread_kill
1747226665146	2025-05-14T12:44:25.146Z	    @     0x555555748dd6  dfly::RdbSaver::Impl::~Impl()
1747226665146	2025-05-14T12:44:25.146Z	    @     0x555555748766  dfly::RdbSaver::Impl::CleanShardSnapshots()
1747226665146	2025-05-14T12:44:25.146Z	    @     0x555555748766  dfly::RdbSaver::Impl::CleanShardSnapshots()
1747226665146	2025-05-14T12:44:25.146Z	    @     0x555555748c37  dfly::RdbSaver::Impl::~Impl()
1747226665146	2025-05-14T12:44:25.146Z	*** SIGABRT received at time=1747226665 on cpu 6 ***
1747226665146	2025-05-14T12:44:25.146Z	    @     0x555555ff5f3f  make_fcontext
1747226665146	2025-05-14T12:44:25.146Z	    @     0x555555748c37  dfly::RdbSaver::Impl::~Impl()
1747226665146	2025-05-14T12:44:25.146Z	    @     0x555555748766  dfly::RdbSaver::Impl::CleanShardSnapshots()
1747226665146	2025-05-14T12:44:25.146Z	    @     0x555555748c37  dfly::RdbSaver::Impl::~Impl()
1747226665145	2025-05-14T12:44:25.145Z	    @     0x555555979375  _ZN5boost7context6detail11fiber_entryINS1_12fiber_recordINS0_5fiberEN4util3fb219FixedStackAllocatorEZNS6_6detail15WorkerFiberImplIZN4dfly14EngineShardSet21RunBlockingInParallelIZNSA_7DflyCmd11ReplicaInfo6CancelEvEUlPNSA_11EngineShardEE_ZNSB_21RunBlockingInParallelISH_EEvOT_EUlSJ_E_EEvSK_OT0_EUlvE_JEEC4IS7_EESt17basic_string_viewIcSt11char_traitsIcEENS6_13FiberPriorityERKNS0_12preallocatedESK_OSO_EUlOS4_E_EEEEvNS1_10transfer_tE
1747226665145	2025-05-14T12:44:25.145Z	    @     0x555555748766  dfly::RdbSaver::Impl::CleanShardSnapshots()
1747226665145	2025-05-14T12:44:25.145Z	    @     0x5555557595f6  dfly::SliceSnapshot::~SliceSnapshot()
1747226665145	2025-05-14T12:44:25.145Z	    @     0x5555557595f6  dfly::SliceSnapshot::~SliceSnapshot()
1747226665145	2025-05-14T12:44:25.145Z	    @     0x555555977242  _ZNSt17_Function_handlerIFvvEZN4dfly7DflyCmd21StartFullSyncInThreadEPNS1_8FlowInfoEPNS1_14ExecutionStateEPNS1_11EngineShardEEUlvE_E9_M_invokeERKSt9_Any_data
1747226665145	2025-05-14T12:44:25.145Z	    @     0x5555557595f6  dfly::SliceSnapshot::~SliceSnapshot()
1747226665145	2025-05-14T12:44:25.145Z	    @     0x555555748dd6  dfly::RdbSaver::Impl::~Impl()
1747226665145	2025-05-14T12:44:25.145Z	    @     0x555555748c37  dfly::RdbSaver::Impl::~Impl()
1747226665145	2025-05-14T12:44:25.145Z	    @     0x5555557595f6  dfly::SliceSnapshot::~SliceSnapshot()
1747226665145	2025-05-14T12:44:25.145Z	    @     0x555555748766  dfly::RdbSaver::Impl::CleanShardSnapshots()
1747226665145	2025-05-14T12:44:25.145Z	    @     0x55555575931b  dfly::SliceSnapshot::~SliceSnapshot()
1747226665145	2025-05-14T12:44:25.145Z	    @     0x555555748766  dfly::RdbSaver::Impl::CleanShardSnapshots()
1747226665145	2025-05-14T12:44:25.145Z	    @     0x55555575931b  dfly::SliceSnapshot::~SliceSnapshot()
1747226665145	2025-05-14T12:44:25.145Z	    @     0x55555575931b  dfly::SliceSnapshot::~SliceSnapshot()
1747226665145	2025-05-14T12:44:25.145Z	    @     0x555555748c37  dfly::RdbSaver::Impl::~Impl()
1747226665145	2025-05-14T12:44:25.145Z	    @     0x555555748766  dfly::RdbSaver::Impl::CleanShardSnapshots()
1747226665145	2025-05-14T12:44:25.145Z	    @     0x5555557595f6  dfly::SliceSnapshot::~SliceSnapshot()
1747226665145	2025-05-14T12:44:25.145Z	    @     0x55555575931b  dfly::SliceSnapshot::~SliceSnapshot()
1747226665144	2025-05-14T12:44:25.144Z	    @     0x555555748766  dfly::RdbSaver::Impl::CleanShardSnapshots()
1747226665144	2025-05-14T12:44:25.144Z	    @     0x55555575931b  dfly::SliceSnapshot::~SliceSnapshot()
1747226665144	2025-05-14T12:44:25.144Z	    @     0x5555557595f6  dfly::SliceSnapshot::~SliceSnapshot()
1747226665144	2025-05-14T12:44:25.144Z	    @     0x555555748766  dfly::RdbSaver::Impl::CleanShardSnapshots()
1747226665144	2025-05-14T12:44:25.144Z	    @     0x5555557595f6  dfly::SliceSnapshot::~SliceSnapshot()
1747226665144	2025-05-14T12:44:25.144Z	    @     0x555555f8f46c  util::fb2::Fiber::~Fiber()
1747226665144	2025-05-14T12:44:25.144Z	    @     0x555555f8f46c  util::fb2::Fiber::~Fiber()
1747226665144	2025-05-14T12:44:25.144Z	    @     0x555555f8f46c  util::fb2::Fiber::~Fiber()
1747226665144	2025-05-14T12:44:25.144Z	    @     0x5555557595f6  dfly::SliceSnapshot::~SliceSnapshot()
1747226665144	2025-05-14T12:44:25.144Z	    @     0x555555f8f46c  util::fb2::Fiber::~Fiber()
1747226665144	2025-05-14T12:44:25.144Z	    @     0x5555557595f6  dfly::SliceSnapshot::~SliceSnapshot()
1747226665144	2025-05-14T12:44:25.144Z	    @     0x55555575931b  dfly::SliceSnapshot::~SliceSnapshot()
1747226665144	2025-05-14T12:44:25.144Z	    @     0x5555557595f6  dfly::SliceSnapshot::~SliceSnapshot()
1747226665144	2025-05-14T12:44:25.144Z	    @     0x55555575931b  dfly::SliceSnapshot::~SliceSnapshot()
1747226665143	2025-05-14T12:44:25.143Z	    @     0x55555575931b  dfly::SliceSnapshot::~SliceSnapshot()
1747226665143	2025-05-14T12:44:25.143Z	    @     0x5555561eaccf  google::LogMessageFatal::~LogMessageFatal()
1747226665143	2025-05-14T12:44:25.143Z	    @     0x55555575931b  dfly::SliceSnapshot::~SliceSnapshot()
1747226665143	2025-05-14T12:44:25.143Z	    @     0x555555f8f46c  util::fb2::Fiber::~Fiber()
1747226665143	2025-05-14T12:44:25.143Z	    @     0x5555561eaccf  google::LogMessageFatal::~LogMessageFatal()
1747226665143	2025-05-14T12:44:25.143Z	    @     0x55555575931b  dfly::SliceSnapshot::~SliceSnapshot()
1747226665143	2025-05-14T12:44:25.143Z	    @     0x5555561e9347  google::LogMessage::Flush()
1747226665143	2025-05-14T12:44:25.143Z	    @     0x5555561e9347  google::LogMessage::Flush()
1747226665143	2025-05-14T12:44:25.143Z	    @     0x555555f8f46c  util::fb2::Fiber::~Fiber()
1747226665143	2025-05-14T12:44:25.143Z	    @     0x5555561e9347  google::LogMessage::Flush()
1747226665143	2025-05-14T12:44:25.143Z	    @     0x5555561eaccf  google::LogMessageFatal::~LogMessageFatal()
1747226665143	2025-05-14T12:44:25.143Z	    @     0x555555f8f46c  util::fb2::Fiber::~Fiber()
1747226665143	2025-05-14T12:44:25.143Z	    @     0x555555f8f46c  util::fb2::Fiber::~Fiber()
1747226665143	2025-05-14T12:44:25.143Z	    @     0x5555561f0ea3  google::LogMessage::SendToLog()
1747226665143	2025-05-14T12:44:25.143Z	    @     0x5555561f0ea3  google::LogMessage::SendToLog()
1747226665143	2025-05-14T12:44:25.143Z	    @     0x555555f8f46c  util::fb2::Fiber::~Fiber()
1747226665143	2025-05-14T12:44:25.143Z	    @     0x5555561eaccf  google::LogMessageFatal::~LogMessageFatal()
1747226665143	2025-05-14T12:44:25.143Z	    @     0x5555561f0ea3  google::LogMessage::SendToLog()
1747226665142	2025-05-14T12:44:25.142Z	    @     0x5555561e9347  google::LogMessage::Flush()
1747226665142	2025-05-14T12:44:25.142Z	    @     0x555555f8f46c  util::fb2::Fiber::~Fiber()
1747226665142	2025-05-14T12:44:25.142Z	    @     0x5555561e9347  google::LogMessage::Flush()
1747226665142	2025-05-14T12:44:25.142Z	    @     0x5555561eaccf  google::LogMessageFatal::~LogMessageFatal()
1747226665142	2025-05-14T12:44:25.142Z	*** Check failure stack trace: ***
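Going by the timestamps, the trace above appears to be exported newest-first, and the repeated frames suggest that output from several threads is interleaved. The following is a minimal sketch for reordering such an export oldest-first and counting how often each frame occurs. It assumes each line is `<epoch-ms> <ISO timestamp> <message>` separated by whitespace. The helper names `oldest_first` and `frame_counts` are hypothetical and not part of Dragonfly or its tooling. Lines sharing the same millisecond cannot be ordered reliably, so the result is only a reading aid.

```python
import re
from collections import Counter

# Matches a frame such as "@     0x555555f8f46c  util::fb2::Fiber::~Fiber()".
FRAME = re.compile(r"@\s+(0x[0-9a-f]+)\s+(.+)$")


def oldest_first(lines):
    """Return [epoch_ms, iso_timestamp, message] rows ordered oldest-first.

    Assumption: the export is newest-first, so it is reversed before a
    stable sort on the epoch-ms field. Lines without a numeric first
    field are dropped.
    """
    rows = [ln.strip().split(None, 2) for ln in lines if ln.strip()]
    rows = [r for r in rows if len(r) == 3 and r[0].isdigit()]
    rows.reverse()
    rows.sort(key=lambda r: int(r[0]))  # list.sort is stable
    return rows


def frame_counts(rows):
    """Count occurrences of each (address, symbol) frame."""
    counts = Counter()
    for _, _, msg in rows:
        m = FRAME.search(msg)
        if m:
            counts[(m.group(1), m.group(2).strip())] += 1
    return counts


# A few lines copied from the trace above, used as sample input.
SAMPLE = [
    "1747226665146\t2025-05-14T12:44:25.146Z\t*** SIGABRT received at time=1747226665 on cpu 6 ***",
    "1747226665145\t2025-05-14T12:44:25.145Z\t    @     0x5555557595f6  dfly::SliceSnapshot::~SliceSnapshot()",
    "1747226665144\t2025-05-14T12:44:25.144Z\t    @     0x555555f8f46c  util::fb2::Fiber::~Fiber()",
    "1747226665144\t2025-05-14T12:44:25.144Z\t    @     0x555555f8f46c  util::fb2::Fiber::~Fiber()",
    "1747226665142\t2025-05-14T12:44:25.142Z\t*** Check failure stack trace: ***",
]

if __name__ == "__main__":
    rows = oldest_first(SAMPLE)
    for row in rows:
        print(row[1], row[2])
    for (addr, symbol), n in frame_counts(rows).most_common():
        print(n, addr, symbol)
```

Running this over the attached log file instead of `SAMPLE` (for example with `open(path).readlines()`) should show the check failure header first and the `SIGABRT` line last, which may make it easier to see which frames belong to which crashing thread.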

I've attached the error logs for the node:
Logs-2025-05-15 07_37_16.txt

I'm still trying to work out exactly when it crashed and will add the details to this issue.

Labels: bug (Something isn't working), important (higher priority than the usual ongoing development tasks)
