Description
Background: when linking macOS binaries with chained fixups, we need to transform initializers stored in __mod_init_func
(an array of pointers rebased through the usual means at runtime) to __init_offsets
(an array of 32-bit offsets to initializers).
Normally, we only need to care about the relocations in the input __mod_init_func
sections. A problem arises when there are also symbols defined inside it. We currently ignore them in LLD -- that is, we don't add them to the symbol table (since the location they point to don't exist anymore). This doesn't happen in regular binaries created by Clang, swiftc
, rustc
, etc., but there have been a few instances, where this led to crashes:
-
In #94716, we see a go-generated binary (the repro file is broken and doesn't include a bunch of swift stuff from the SDK -- TODO!). Here, the symbol (
__rt0_arm64_ios_lib.ptr
) is defined inside__mod_init_func
as a non-exported symbol; we crash when trying to add it to the symbol table (it has no corresponding output section, so we can't setn_sect
).$ lipo -thin arm64 Users/snqu/Desktop/chromium/chromium/src/ios/chrome/vpn_extension/Tun2socks.framework/Tun2socks -o Tun2socks $ $(brew --prefix llvm)/bin/llvm-ar x Tun2socks go.o $ nm go.o | grep __rt0_arm64_ios_lib\.ptr 0000000001af3a60 s __rt0_arm64_ios_lib.ptr
Backtrace excerpt:
* thread #6, stop reason = EXC_BAD_ACCESS (code=1, address=0x34) * frame #0: 0x00000001002a0894 ld64.lld`SymtabSectionImpl<lld::macho::LP64>::writeTo(this=<unavailable>, buf=<unavailable>) const at SyntheticSections.cpp:1415:50 [opt]
-
This Chromium bug is related to the
curl
Rust crate, which deliberately defines a symbol among the initializers, apparently, to sidestep an old linker/compiler dead-stripping issue. Here, the symbol (__RNvCsiLjxBhyzEAX_4curl9INIT_CTOR
;curl::INIT_CTOR
) is externally visible, we crash when we try to query its address when adding it to the exports trie.Backtrace excerpt:
frame #3: 0x000000019cf14d20 libsystem_c.dylib`__assert_rtn + 284 frame #4: 0x00000001003053e8 ld64.lld`lld::macho::Defined::getVA() const (.cold.2) at Symbols.cpp:97:5 [opt] * frame #5: 0x0000000100289ee0 ld64.lld`lld::macho::Defined::getVA(this=0x000000014b01ca58) const at Symbols.cpp:97:5 [opt] frame #6: 0x0000000100255fe4 ld64.lld`lld::macho::TrieBuilder::sortAndBuild(llvm::MutableArrayRef<lld::macho::Symbol const*>, lld::macho::TrieNode*, unsigned long, unsigned long) [inlined] (anonymous namespace)::ExportInfo::ExportInfo(this=<unavailable>, sym=0x000000014b01ca58, imageBase=4294967296) at ExportTrie.cpp:64:21 [opt]
$ ar x Users/hwennborg/chromium/src/third_party/rust-src/build/x86_64-apple-darwin/stage2-tools/x86_64-apple-darwin/release/deps/libcurl-f5fa2775c6309a20.rlib curl-f5fa2775c6309a20.curl.da8bc63371d5233d-cgu.09.rcgu.o $ nm curl-f5fa2775c6309a20.curl.da8bc63371d5233d-cgu.09.rcgu.o -s __DATA __mod_init_func 0000000000000d40 S __RNvCsiLjxBhyzEAX_4curl9INIT_CTOR
Fix ideas
-
Completely remove
__mod_init_func
from the list of input sectionsI thought my original patch would have this effect: we do not create an OutputSection for it and don't even include it in the global
inputSections
list.In reality, this is not enough; they are still added to the symbol table (as
__mod_init_func
is present in thesymbols
array,ObjFile::parseSymbols
will reach it).(+) if we encounter a
Defined
symbol during the program's execution, we know for sure that it has an address, no need to check for a poison flag.(-) sounds a bit hackish; currently there is a one-on-one correspondence between
ObjFile::sections
and the input file's contents -
Create a "poisoned" state for
Defined
Symbols(+) Least amount of modification for the existing code
(+) There will still be an entry (though poisoned) in the symbol table, so we'll be able to emit useful warnings if someone actually refers to the symbol.
NOTE: this is basically the current workaround I ended up going for, except that I use the
isLive()
mechanism from dead-stripping. -
Some other idea?
- maybe just add a few extra checks to the known crashy places to see if the symbols refer to a deleted section? sounds very fragile
These mentioned workarounds only work if the symbols are not actually referenced in relocations. If they are, we get different (but equally undesirable) behaviors.
; test.s
.globl _main
.text
_main:
leaq _init_slot(%rip), %rax
.section __DATA,__mod_init_func,mod_init_funcs
_init_slot:
.quad _main
ld64 crash
❯ clang test.s -Wl,-ld_classic
ld: warning: alignment (1) of atom '_init_slot' is too small and may result in unaligned pointers
0 0x10659e807 __assert_rtn + 137
1 0x1065a79e3 ld::tool::OutputFile::addressOf(ld::Internal const&, ld::Fixup const*, ld::Atom const**) (.cold.1) + 35
2 0x1063fc5e4 ld::tool::OutputFile::addressOf(ld::Internal const&, ld::Fixup const*, ld::Atom const**) + 116
3 0x1063fd087 ld::tool::OutputFile::applyFixUps(ld::Internal&, unsigned long long, ld::Atom const*, unsigned char*) + 599
4 0x106405818 ___ZN2ld4tool10OutputFile10writeAtomsERNS_8InternalEPh_block_invoke + 504
5 0x7ff815881def _dispatch_client_callout2 + 8
6 0x7ff815893547 _dispatch_apply_invoke3 + 431
7 0x7ff815881dbc _dispatch_client_callout + 8
8 0x7ff81588304e _dispatch_once_callout + 20
9 0x7ff815892740 _dispatch_apply_invoke + 184
10 0x7ff815881dbc _dispatch_client_callout + 8
11 0x7ff8158912ca _dispatch_root_queue_drain + 871
12 0x7ff81589184f _dispatch_worker_thread2 + 152
13 0x7ff815a1fb43 _pthread_wqthread + 262
A linker snapshot was created at:
/tmp/a.out-2024-06-25-091629.ld-snapshot
ld: Assertion failed: (_mode == modeFinalAddress), function finalAddress, file ld.hpp, line 1413.
ld_prime broken (?) binary
❯ clang test.s -Wl,-ld_new
❯ objdump -d a.out
a.out: file format mach-o 64-bit x86-64
Disassembly of section __TEXT,__text:
0000000100000f9d <_main>:
100000f9d: 48 8d 05 5c f0 ff ff leaq -4004(%rip), %rax ## 0x100000000
# Relocation refers to the beginning of the file, `__mh_execute_header`???
Additional questions
There have been other similar transformations added recently: ObjC relative method lists, (etc?). Could we theoretically encounter a similar scenario there?