[BUG] CN crash with SIGSEGV in publish_primary_compaction on shared-data cluster (3.5.10, aarch64) #66832

@EdwardArchive

Steps to reproduce the behavior (Required)

Environment:

  • StarRocks Version: 3.5.10 RELEASE (build 9848c7f)
  • Architecture: aarch64 (ARM64)
  • OS: Ubuntu
  • Cluster Mode: Shared-data (lake mode)
  • Table Type: Primary Key table

Reproduction scenario:


Expected behavior (Required)

Compaction transactions for Primary Key tables should be published successfully without crashing the CN node.


Real behavior (Required)

CN node crashes with SIGSEGV (Segmentation Fault) during the publish phase of Primary Key table compaction.

Stack Trace:

3.5.10 RELEASE (build 9848c7f distro ubuntu arch aarch64)
query_id:00000000-0000-0000-0000-000000000000, fragment_instance:00000000-0000-0000-0000-000000000000, plan_node_id:-1
*** Aborted at 1765943709 (unix time) try "date -d @1765943709" if you are using GNU date ***
PC: @          0xf119ab0 google::protobuf::RepeatedPtrField<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >::empty() const
*** SIGSEGV (@0x120) received by PID 27 (TID 0xffff1111fe80) LWP(410) from PID 288; stack trace: ***
    @     0xffffbdcb53dc (/usr/lib/aarch64-linux-gnu/libc.so.6+0x853db)
    @          0xed09718 google::(anonymous namespace)::FailureSignalHandler(int, siginfo_t*, void*)
    @     0xffffbf36b838 ([vdso]+0x837)
    @          0xf119ab0 google::protobuf::RepeatedPtrField<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >::empty() const
    @          0x99600e8 starrocks::lake::UpdateManager::publish_primary_compaction(starrocks::TxnLogPB_OpCompaction const&, long, starrocks::TabletMetadataPB const&, starrocks::lake::Tablet const&, starrocks::DynamicCache<unsigned long, starrocks::lake::LakePrimaryIndex, std::mut
    @          0xb4e1e38 std::_Function_handler<starrocks::Status (), starrocks::lake::PrimaryKeyTxnLogApplier::apply(starrocks::TxnLogPB const&)::{lambda()#2}>::_M_invoke(std::_Any_data const&)
    @          0xb4e28f0 starrocks::lake::PrimaryKeyTxnLogApplier::check_and_recover(std::function<starrocks::Status ()> const&)
    @          0xb4e5248 starrocks::lake::PrimaryKeyTxnLogApplier::apply(starrocks::TxnLogPB const&)
    @          0xb4dd4c0 starrocks::lake::publish_version(starrocks::lake::TabletManager*, long, long, long, std::span<starrocks::TxnInfoPB const, 18446744073709551615ul>)
    @          0xb4cf218 starrocks::LakeServiceImpl::publish_version(google::protobuf::RpcController*, starrocks::PublishVersionRequest const*, starrocks::PublishVersionResponse*, google::protobuf::Closure*)::{lambda()#1}::operator()() const
    @          0xb6426e0 std::_Function_handler<void (), starrocks::ConcurrencyLimitedThreadPoolToken::submit(std::shared_ptr<starrocks::Runnable>, std::chrono::time_point<std::chrono::_V2::system_clock, std::chrono::duration<long, std::ratio<1l, 1000000000l> > >)::{lambda()#1}>::
    @          0xb649b94 starrocks::ThreadPool::dispatch_thread()
    @          0xb64017c starrocks::Thread::supervise_thread(void*)
    @     0xffffbdcb0398 (/usr/lib/aarch64-linux-gnu/libc.so.6+0x80397)
    @     0xffffbdd19e9c (/usr/lib/aarch64-linux-gnu/libc.so.6+0xe9e9b)
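
For triage, the two header lines of the trace above can be decoded mechanically. The following is a minimal sketch, not part of StarRocks or glog: the helper name `decode_crash_header` and the 4096-byte threshold are assumptions made for illustration. It extracts the signal name, the fault address, and the crash time in UTC. A fault address as small as 0x120 typically points to a member access through a null or nearly null pointer, though the trace alone does not confirm that.

```python
import re
from datetime import datetime, timezone

# The two header lines copied verbatim from the stack trace in this report.
HEADER = (
    '*** Aborted at 1765943709 (unix time) try "date -d @1765943709" if you are using GNU date ***\n'
    '*** SIGSEGV (@0x120) received by PID 27 (TID 0xffff1111fe80) LWP(410) from PID 288; stack trace: ***'
)

# Assumption: treat any address below 4096 bytes as "near null".
# The threshold is a heuristic chosen for this sketch, not a StarRocks setting.
NEAR_NULL_LIMIT = 4096


def decode_crash_header(text):
    """Extract signal name, fault address, and UTC crash time from a glog-style crash header."""
    ts = int(re.search(r"Aborted at (\d+) \(unix time\)", text).group(1))
    sig = re.search(r"\*\*\* (SIG\w+) \(@(0x[0-9a-fA-F]+)\)", text)
    fault_address = int(sig.group(2), 16)
    return {
        "signal": sig.group(1),
        "fault_address": fault_address,
        "near_null": fault_address < NEAR_NULL_LIMIT,
        "crash_time_utc": datetime.fromtimestamp(ts, tz=timezone.utc).isoformat(),
    }


info = decode_crash_header(HEADER)
for key, value in info.items():
    print(f"{key}: {value}")
```

The decoded UTC time can be used to line up the crash with CN and FE log entries, keeping in mind that log timestamps may be written in the node's local time zone.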

Memory statistics before crash:

I20251217 12:55:06.497229 281473749810816 daemon.cpp:139] Current memory statistics: 
process(1048808704), query_pool(348896), load(0), metadata(396326), compaction(0), 
schema_change(0), page_cache(91456), update(786560), chunk_allocator(0), passthrough(0), 
clone(0), consistency(0), datacache(496770243), jit(0)
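
To make the memory line above easier to compare across crashes, it can be parsed into named counters. This is a minimal sketch: the helper name `parse_memory_stats` is hypothetical, and the values are assumed to be bytes, which the log line itself does not state.

```python
import re

# Counter list copied from the daemon.cpp log line in this report.
STATS = (
    "process(1048808704), query_pool(348896), load(0), metadata(396326), compaction(0), "
    "schema_change(0), page_cache(91456), update(786560), chunk_allocator(0), passthrough(0), "
    "clone(0), consistency(0), datacache(496770243), jit(0)"
)


def parse_memory_stats(line):
    """Turn 'name(value), name(value), ...' into a dict of integer counters."""
    return {name: int(value) for name, value in re.findall(r"(\w+)\((\d+)\)", line)}


stats = parse_memory_stats(STATS)

# Assumption: counters are in bytes, so divide by 2**20 to show MiB.
for name, value in sorted(stats.items(), key=lambda item: item[1], reverse=True):
    print(f"{name:16s} {value / 2**20:10.1f} MiB")
```

In this snapshot the `compaction` tracker reads 0 and `datacache` is the largest named counter after `process`.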

StarRocks version (Required)

3.5.10 RELEASE (build 9848c7f distro ubuntu arch aarch64)

Labels: type/bug
