Description
What happens?
make installcheck randomly hangs. Setting duckdb.threads = 1 in test/regression/regression.conf appears to fix it, so I suspect a race condition.
When it hangs, the call stack of the main thread looks as follows:
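The workaround mentioned above is a one-line setting. A minimal sketch, assuming test/regression/regression.conf uses the usual postgresql.conf `name = value` syntax (the comment line is illustrative and not part of the original file):

```conf
# Workaround only: force DuckDB to run single-threaded during regression tests
duckdb.threads = 1
```

This only hides the hang by removing concurrency between DuckDB worker threads; it does not address the underlying cause.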
(lldb) bt
* thread #1, queue = 'com.apple.main-thread', stop reason = signal SIGSTOP
* frame #0: 0x000000019e6a8c20 libsystem_malloc.dylib`xzm_malloc
frame #1: 0x000000019e841a78 libc++abi.dylib`operator new(unsigned long) + 52
frame #2: 0x000000010be4069c libduckdb.dylib`duckdb::PendingQueryResult::LockContext() + 56
frame #3: 0x000000010be40a60 libduckdb.dylib`duckdb::PendingQueryResult::ExecuteTask() + 28
frame #4: 0x000000010315348c pg_duckdb.dylib`ExecuteQuery(DuckdbScanState*) + 824
frame #5: 0x0000000103153ad8 pg_duckdb.dylib`Duckdb_ExecCustomScan(CustomScanState*) + 112
frame #6: 0x00000001028ecd94 postgres`standard_ExecutorRun + 288
frame #7: 0x0000000102a96b98 postgres`PortalRunSelect + 220
frame #8: 0x0000000102a96860 postgres`PortalRun + 472
frame #9: 0x00000001028a6d7c postgres`ExecuteQuery + 388
frame #10: 0x0000000102a97dec postgres`standard_ProcessUtility + 1468
frame #11: 0x0000000103146200 pg_duckdb.dylib`DuckdbUtilityHook(PlannedStmt*, char const*, bool, ProcessUtilityContext, ParamListInfoData*, QueryEnvironment*, _DestReceiver*, QueryCompletion*) + 4252
frame #12: 0x0000000102a97520 postgres`PortalRunUtility + 136
frame #13: 0x0000000102a96aa8 postgres`FillPortalStore + 224
frame #14: 0x0000000102a96810 postgres`PortalRun + 392
frame #15: 0x0000000102a95a6c postgres`exec_simple_query + 1448
frame #16: 0x0000000102a9303c postgres`PostgresMain + 1152
frame #17: 0x0000000102a8f57c postgres`BackendMain + 80
frame #18: 0x0000000102a037ec postgres`postmaster_child_launch + 228
frame #19: 0x0000000102a078f0 postgres`ServerLoop + 6784
frame #20: 0x0000000102a05314 postgres`PostmasterMain + 3436
frame #21: 0x0000000102933efc postgres`main + 816
frame #22: 0x000000019e4c1d54 dyld`start + 7184
Meanwhile, the DuckDB worker threads look as follows:
(lldb) th li
Process 23090 stopped
* thread #1: tid = 0x106b496, 0x000000019e6a9c8c libsystem_malloc.dylib`_xzm_free + 1352, queue = 'com.apple.main-thread', stop reason = signal SIGSTOP
thread #2: tid = 0x106b497, 0x000000019e846bb0 libsystem_kernel.dylib`semaphore_wait_trap + 8
thread #3: tid = 0x106b498, 0x000000019e846bb0 libsystem_kernel.dylib`semaphore_wait_trap + 8
thread #4: tid = 0x106b499, 0x000000019e846bb0 libsystem_kernel.dylib`semaphore_wait_trap + 8
thread #5: tid = 0x106b49a, 0x000000019e84cf30 libsystem_kernel.dylib`kevent + 8
thread #6: tid = 0x106b49b, 0x000000019e846bb0 libsystem_kernel.dylib`semaphore_wait_trap + 8
thread #7: tid = 0x106b49c, 0x000000019e846bb0 libsystem_kernel.dylib`semaphore_wait_trap + 8
thread #8: tid = 0x106b49d, 0x000000019e846bb0 libsystem_kernel.dylib`semaphore_wait_trap + 8
thread #9: tid = 0x106b49e, 0x000000019e846bb0 libsystem_kernel.dylib`semaphore_wait_trap + 8
thread #10: tid = 0x106b49f, 0x000000019e846bb0 libsystem_kernel.dylib`semaphore_wait_trap + 8
thread #11: tid = 0x106b4a0, 0x000000019e846bb0 libsystem_kernel.dylib`semaphore_wait_trap + 8
thread #12: tid = 0x106b4a1, 0x000000019e846bb0 libsystem_kernel.dylib`semaphore_wait_trap + 8
(lldb) t 5
* thread #5
frame #0: 0x000000019e84cf30 libsystem_kernel.dylib`kevent + 8
libsystem_kernel.dylib`kevent:
-> 0x19e84cf30 <+8>: b.lo 0x19e84cf50 ; <+40>
0x19e84cf34 <+12>: pacibsp
0x19e84cf38 <+16>: stp x29, x30, [sp, #-0x10]!
0x19e84cf3c <+20>: mov x29, sp
(lldb) bt
* thread #5
* frame #0: 0x000000019e84cf30 libsystem_kernel.dylib`kevent + 8
frame #1: 0x0000000102a6f22c postgres`WaitEventSetWait + 408
frame #2: 0x0000000102a6efe0 postgres`WaitLatch + 96
frame #3: 0x00000001027e0234 postgres`WaitForParallelWorkersToFinish + 72
frame #4: 0x00000001028f00d0 postgres`ExecParallelFinish + 164
frame #5: 0x000000010318a278 pg_duckdb.dylib`pgduckdb::PostgresTableReader::CleanupUnsafe() + 44
frame #6: 0x000000010318a068 pg_duckdb.dylib`void pgduckdb::__PostgresMemberGuard__<pgduckdb::PostgresTableReader, void>(void (pgduckdb::PostgresTableReader::*)(), pgduckdb::PostgresTableReader*, char const*) + 160
frame #7: 0x0000000103189f90 pg_duckdb.dylib`pgduckdb::PostgresTableReader::Cleanup() + 68
frame #8: 0x0000000103185ce0 pg_duckdb.dylib`pgduckdb::PostgresScanGlobalState::UnregisterLocalState() + 80
frame #9: 0x00000001031867c0 pg_duckdb.dylib`pgduckdb::PostgresScanTableFunction::PostgresScanFunction(duckdb::ClientContext&, duckdb::TableFunctionInput&, duckdb::DataChunk&) + 568
frame #10: 0x000000010bce3e58 libduckdb.dylib`duckdb::PhysicalTableScan::GetData(duckdb::ExecutionContext&, duckdb::DataChunk&, duckdb::OperatorSourceInput&) const + 80
frame #11: 0x000000010bea9808 libduckdb.dylib`duckdb::PipelineExecutor::FetchFromSource(duckdb::DataChunk&) + 152
frame #12: 0x000000010bea5dd0 libduckdb.dylib`duckdb::PipelineExecutor::Execute(unsigned long long) + 260
frame #13: 0x000000010bea5b1c libduckdb.dylib`duckdb::PipelineTask::ExecuteTask(duckdb::TaskExecutionMode) + 328
frame #14: 0x000000010be9e1dc libduckdb.dylib`duckdb::ExecutorTask::Execute(duckdb::TaskExecutionMode) + 288
frame #15: 0x000000010bead19c libduckdb.dylib`duckdb::TaskScheduler::ExecuteForever(std::__1::atomic<bool>*) + 612
frame #16: 0x000000010beb5318 libduckdb.dylib`void* std::__1::__thread_proxy[abi:ne200100]<std::__1::tuple<std::__1::unique_ptr<std::__1::__thread_struct, std::__1::default_delete<std::__1::__thread_struct>>, void (*)(duckdb::TaskScheduler*, std::__1::atomic<bool>*), duckdb::TaskScheduler*, std::__1::atomic<bool>*>>(void*) + 56
frame #17: 0x000000019e889c08 libsystem_pthread.dylib`_pthread_start + 136
To Reproduce
make installcheck
OS:
macOS 26.2
pg_duckdb Version (if built from source use commit hash):
Postgres Version (if built from source use commit hash):
18.1
Hardware:
No response
Full Name:
qsliu
Affiliation:
zbyte inc.
What is the latest build you tested with? If possible, we recommend testing with the latest nightly build.
I have tested with a source build
Did you include all relevant data sets for reproducing the issue?
Not applicable - the reproduction does not require a data set
Did you include all code required to reproduce the issue?
- Yes, I have
Did you include all relevant configuration (e.g., CPU architecture, Linux distribution) to reproduce the issue?
- Yes, I have