macOS's libc calls calloc() after fork(), even in multi-threaded programs #1699

Open
@tavianator

The macOS libc implementation runs its own atfork handlers in the child after fork(), and those handlers can allocate memory. I discovered this when my macOS CI hung while running tests with -fsanitize=thread. I was able to SSH in and get a backtrace like this:

* thread #1, queue = 'com.apple.main-thread', stop reason = signal SIGSTOP
  * frame #0: 0x00007ff8036a05f6 libsystem_kernel.dylib`swtch_pri + 10
    frame #1: 0x00007ff8036dc8b6 libsystem_pthread.dylib`cthread_yield + 20
    frame #2: 0x0000000106977a09 libclang_rt.tsan_osx_dynamic.dylib`__sanitizer::internal_sched_yield() + 9
    frame #3: 0x000000010697a5d5 libclang_rt.tsan_osx_dynamic.dylib`__sanitizer::StaticSpinMutex::LockSlow() + 53
    frame #4: 0x00000001069eaf06 libclang_rt.tsan_osx_dynamic.dylib`__tsan::DenseSlabAlloc<__tsan::MBlock, 262144ul, 4096ul, 3221225472ull>::Refill(__tsan::DenseSlabAllocCache*) + 470
    frame #5: 0x00000001069e9a37 libclang_rt.tsan_osx_dynamic.dylib`__tsan::MetaMap::AllocBlock(__tsan::ThreadState*, unsigned long, unsigned long, unsigned long) + 71
    frame #6: 0x00000001069c9f9c libclang_rt.tsan_osx_dynamic.dylib`__tsan::user_alloc_internal(__tsan::ThreadState*, unsigned long, unsigned long, unsigned long, bool) + 156
    frame #7: 0x00000001069caa42 libclang_rt.tsan_osx_dynamic.dylib`__tsan::user_calloc(__tsan::ThreadState*, unsigned long, unsigned long, unsigned long) + 66
    frame #8: 0x00000001069c7f63 libclang_rt.tsan_osx_dynamic.dylib`wrap_calloc + 115
    frame #9: 0x00007ff80f195097 libsystem_coreservices.dylib`_dirhelper_init + 49
    frame #10: 0x00007ff803709f86 libsystem_platform.dylib`_os_once_callout + 18
    frame #11: 0x00007ff80f549c73 libSystem.B.dylib`libSystem_atfork_child + 48
    frame #12: 0x00007ff8035ac598 libsystem_c.dylib`fork + 84
    frame #13: 0x000000010699da26 libclang_rt.tsan_osx_dynamic.dylib`wrap_fork + 70
    frame #14: 0x00000001064ff9f7 bfs`bfs_spawn(exe="/bin/echo", ctx=0x00007ff7b9ab96d8, argv=0x00007b0800001240, envp=0x00007ff7b9abad18) at xspawn.c:217:14
...

A multi-threaded process should only call async-signal-safe functions in the child between fork() and exec(). But even when my own code follows this rule, hangs like the one above are still possible, because the platform itself breaks it.
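
For reference, here is a minimal sketch of what following that rule looks like in user code. The helper name `spawn_and_wait` is hypothetical and is not taken from bfs; the point is only that the child branch calls nothing except execve() and _exit(), both of which are async-signal-safe.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from bfs): fork, exec `path`, and return the
 * child's exit status, or -1 if fork()/waitpid() fails or the child is
 * killed by a signal.
 */
int spawn_and_wait(const char *path, char *const argv[], char *const envp[]) {
    pid_t pid = fork();
    if (pid < 0) {
        return -1;
    }

    if (pid == 0) {
        /* Child: only async-signal-safe calls from here on. */
        execve(path, argv, envp);
        /* exec failed; _exit() skips atexit handlers and stdio flushing. */
        _exit(127);
    }

    /* Parent: wait for the child, retrying if interrupted by a signal. */
    int status;
    while (waitpid(pid, &status, 0) < 0) {
        if (errno != EINTR) {
            return -1;
        }
    }

    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Even with a child branch this small, the backtrace above shows the allocation happening inside fork() itself, before control ever returns to the caller.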

Why am I reporting this here instead of to Apple? Well, first of all, I'm not an Apple customer. But mainly, since this is all within the implementation of libc, it's possible that their own allocator implementation guarantees that these calls are safe. TSan's replacement implementation, though, can't handle them. Since there's nothing a user can really do about this race, it would be great if TSan could work around it, perhaps by installing its own pthread_atfork() handlers.
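
To illustrate the general idea, here is a rough sketch of the usual pthread_atfork() locking pattern. It is not TSan's code: `alloc_lock` is a hypothetical stand-in for whatever internal lock the allocator spins on, and all function names are made up for this example. It also assumes the handlers run before the libc's own child-side work, which I have not verified on macOS.

```c
#include <pthread.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-in for an allocator's internal lock. */
static pthread_mutex_t alloc_lock = PTHREAD_MUTEX_INITIALIZER;

/* Runs in the parent before fork(): no other thread can hold the lock. */
static void atfork_prepare(void) { pthread_mutex_lock(&alloc_lock); }

/* Runs in the parent after fork(): release the lock again. */
static void atfork_parent(void) { pthread_mutex_unlock(&alloc_lock); }

/* Runs in the child after fork(): the forking thread owns the lock. */
static void atfork_child(void) { pthread_mutex_unlock(&alloc_lock); }

/* Register the handlers; returns 0 on success, like pthread_atfork(). */
int install_atfork_handlers(void) {
    return pthread_atfork(atfork_prepare, atfork_parent, atfork_child);
}

/*
 * Demonstration only: fork and check that the child can take the lock.
 * pthread_mutex_lock() is not async-signal-safe; it is called in the child
 * here purely to show the lock was left in a usable state.
 * Returns the child's exit status (0 if the lock was usable), or -1.
 */
int alloc_lock_usable_after_fork(void) {
    pid_t pid = fork();
    if (pid < 0) {
        return -1;
    }

    if (pid == 0) {
        int rc = pthread_mutex_lock(&alloc_lock);
        if (rc == 0) {
            pthread_mutex_unlock(&alloc_lock);
        }
        _exit(rc == 0 ? 0 : 1);
    }

    int status;
    if (waitpid(pid, &status, 0) != pid) {
        return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Because the prepare handler takes the lock before the fork, the child never inherits it in a state held by a thread that no longer exists, which is the situation the StaticSpinMutex::LockSlow() frame above appears to be stuck in.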
