
Regression tests hang on macOS arm64 #1002

@qsliu2017

What happens?

make installcheck intermittently hangs. Setting duckdb.threads = 1 in test/regression/regression.conf appears to avoid the hang, so I suspect a race condition.
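
For reference, the workaround is a single extra line in the regression config. A minimal sketch, assuming it is run from the pg_duckdb source root and that the file does not already set duckdb.threads (the mkdir is a no-op in a real checkout):

```shell
# Path taken from the report; assumes the current directory is the pg_duckdb source root.
conf=test/regression/regression.conf
mkdir -p "$(dirname "$conf")"
# Append the setting only if no duckdb.threads line is present yet.
grep -q '^duckdb\.threads' "$conf" 2>/dev/null || echo 'duckdb.threads = 1' >> "$conf"
```

Then rerun make installcheck. This only sidesteps the hang by limiting DuckDB to one thread; it does not address the underlying cause.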

When it hangs, the call stack of the main thread looks like this:

(lldb) bt
* thread #1, queue = 'com.apple.main-thread', stop reason = signal SIGSTOP
  * frame #0: 0x000000019e6a8c20 libsystem_malloc.dylib`xzm_malloc
    frame #1: 0x000000019e841a78 libc++abi.dylib`operator new(unsigned long) + 52
    frame #2: 0x000000010be4069c libduckdb.dylib`duckdb::PendingQueryResult::LockContext() + 56
    frame #3: 0x000000010be40a60 libduckdb.dylib`duckdb::PendingQueryResult::ExecuteTask() + 28
    frame #4: 0x000000010315348c pg_duckdb.dylib`ExecuteQuery(DuckdbScanState*) + 824
    frame #5: 0x0000000103153ad8 pg_duckdb.dylib`Duckdb_ExecCustomScan(CustomScanState*) + 112
    frame #6: 0x00000001028ecd94 postgres`standard_ExecutorRun + 288
    frame #7: 0x0000000102a96b98 postgres`PortalRunSelect + 220
    frame #8: 0x0000000102a96860 postgres`PortalRun + 472
    frame #9: 0x00000001028a6d7c postgres`ExecuteQuery + 388
    frame #10: 0x0000000102a97dec postgres`standard_ProcessUtility + 1468
    frame #11: 0x0000000103146200 pg_duckdb.dylib`DuckdbUtilityHook(PlannedStmt*, char const*, bool, ProcessUtilityContext, ParamListInfoData*, QueryEnvironment*, _DestReceiver*, QueryCompletion*) + 4252
    frame #12: 0x0000000102a97520 postgres`PortalRunUtility + 136
    frame #13: 0x0000000102a96aa8 postgres`FillPortalStore + 224
    frame #14: 0x0000000102a96810 postgres`PortalRun + 392
    frame #15: 0x0000000102a95a6c postgres`exec_simple_query + 1448
    frame #16: 0x0000000102a9303c postgres`PostgresMain + 1152
    frame #17: 0x0000000102a8f57c postgres`BackendMain + 80
    frame #18: 0x0000000102a037ec postgres`postmaster_child_launch + 228
    frame #19: 0x0000000102a078f0 postgres`ServerLoop + 6784
    frame #20: 0x0000000102a05314 postgres`PostmasterMain + 3436
    frame #21: 0x0000000102933efc postgres`main + 816
    frame #22: 0x000000019e4c1d54 dyld`start + 7184

Meanwhile, the DuckDB worker threads look like this:

(lldb) th li
Process 23090 stopped
* thread #1: tid = 0x106b496, 0x000000019e6a9c8c libsystem_malloc.dylib`_xzm_free + 1352, queue = 'com.apple.main-thread', stop reason = signal SIGSTOP
  thread #2: tid = 0x106b497, 0x000000019e846bb0 libsystem_kernel.dylib`semaphore_wait_trap + 8
  thread #3: tid = 0x106b498, 0x000000019e846bb0 libsystem_kernel.dylib`semaphore_wait_trap + 8
  thread #4: tid = 0x106b499, 0x000000019e846bb0 libsystem_kernel.dylib`semaphore_wait_trap + 8
  thread #5: tid = 0x106b49a, 0x000000019e84cf30 libsystem_kernel.dylib`kevent + 8
  thread #6: tid = 0x106b49b, 0x000000019e846bb0 libsystem_kernel.dylib`semaphore_wait_trap + 8
  thread #7: tid = 0x106b49c, 0x000000019e846bb0 libsystem_kernel.dylib`semaphore_wait_trap + 8
  thread #8: tid = 0x106b49d, 0x000000019e846bb0 libsystem_kernel.dylib`semaphore_wait_trap + 8
  thread #9: tid = 0x106b49e, 0x000000019e846bb0 libsystem_kernel.dylib`semaphore_wait_trap + 8
  thread #10: tid = 0x106b49f, 0x000000019e846bb0 libsystem_kernel.dylib`semaphore_wait_trap + 8
  thread #11: tid = 0x106b4a0, 0x000000019e846bb0 libsystem_kernel.dylib`semaphore_wait_trap + 8
  thread #12: tid = 0x106b4a1, 0x000000019e846bb0 libsystem_kernel.dylib`semaphore_wait_trap + 8
(lldb) t 5
* thread #5
    frame #0: 0x000000019e84cf30 libsystem_kernel.dylib`kevent + 8
libsystem_kernel.dylib`kevent:
->  0x19e84cf30 <+8>:  b.lo   0x19e84cf50    ; <+40>
    0x19e84cf34 <+12>: pacibsp
    0x19e84cf38 <+16>: stp    x29, x30, [sp, #-0x10]!
    0x19e84cf3c <+20>: mov    x29, sp
(lldb) bt
* thread #5
  * frame #0: 0x000000019e84cf30 libsystem_kernel.dylib`kevent + 8
    frame #1: 0x0000000102a6f22c postgres`WaitEventSetWait + 408
    frame #2: 0x0000000102a6efe0 postgres`WaitLatch + 96
    frame #3: 0x00000001027e0234 postgres`WaitForParallelWorkersToFinish + 72
    frame #4: 0x00000001028f00d0 postgres`ExecParallelFinish + 164
    frame #5: 0x000000010318a278 pg_duckdb.dylib`pgduckdb::PostgresTableReader::CleanupUnsafe() + 44
    frame #6: 0x000000010318a068 pg_duckdb.dylib`void pgduckdb::__PostgresMemberGuard__<pgduckdb::PostgresTableReader, void>(void (pgduckdb::PostgresTableReader::*)(), pgduckdb::PostgresTableReader*, char const*) + 160
    frame #7: 0x0000000103189f90 pg_duckdb.dylib`pgduckdb::PostgresTableReader::Cleanup() + 68
    frame #8: 0x0000000103185ce0 pg_duckdb.dylib`pgduckdb::PostgresScanGlobalState::UnregisterLocalState() + 80
    frame #9: 0x00000001031867c0 pg_duckdb.dylib`pgduckdb::PostgresScanTableFunction::PostgresScanFunction(duckdb::ClientContext&, duckdb::TableFunctionInput&, duckdb::DataChunk&) + 568
    frame #10: 0x000000010bce3e58 libduckdb.dylib`duckdb::PhysicalTableScan::GetData(duckdb::ExecutionContext&, duckdb::DataChunk&, duckdb::OperatorSourceInput&) const + 80
    frame #11: 0x000000010bea9808 libduckdb.dylib`duckdb::PipelineExecutor::FetchFromSource(duckdb::DataChunk&) + 152
    frame #12: 0x000000010bea5dd0 libduckdb.dylib`duckdb::PipelineExecutor::Execute(unsigned long long) + 260
    frame #13: 0x000000010bea5b1c libduckdb.dylib`duckdb::PipelineTask::ExecuteTask(duckdb::TaskExecutionMode) + 328
    frame #14: 0x000000010be9e1dc libduckdb.dylib`duckdb::ExecutorTask::Execute(duckdb::TaskExecutionMode) + 288
    frame #15: 0x000000010bead19c libduckdb.dylib`duckdb::TaskScheduler::ExecuteForever(std::__1::atomic<bool>*) + 612
    frame #16: 0x000000010beb5318 libduckdb.dylib`void* std::__1::__thread_proxy[abi:ne200100]<std::__1::tuple<std::__1::unique_ptr<std::__1::__thread_struct, std::__1::default_delete<std::__1::__thread_struct>>, void (*)(duckdb::TaskScheduler*, std::__1::atomic<bool>*), duckdb::TaskScheduler*, std::__1::atomic<bool>*>>(void*) + 56
    frame #17: 0x000000019e889c08 libsystem_pthread.dylib`_pthread_start + 136

To Reproduce

make installcheck

OS:

macOS 26.2

pg_duckdb Version (if built from source use commit hash):

6c43a1a

Postgres Version (if built from source use commit hash):

18.1

Hardware:

No response

Full Name:

qsliu

Affiliation:

zbyte inc.

What is the latest build you tested with? If possible, we recommend testing with the latest nightly build.

I have tested with a source build

Did you include all relevant data sets for reproducing the issue?

Not applicable - the reproduction does not require a data set

Did you include all code required to reproduce the issue?

  • Yes, I have

Did you include all relevant configuration (e.g., CPU architecture, Linux distribution) to reproduce the issue?

  • Yes, I have
