
Running signal handler that uses thread local variables in statically linked executables crashes #1278

Open

Description

@wkozaczuk

This can be reproduced by running iperf3 (as a server, -s) and then pressing Ctrl-C, which causes this:

OSv v0.57.0-91-g98c90f03
eth0: 192.168.122.15
Booted up in 126.00 ms
Cmdline: /lib64/ld-linux-x86-64.so.2 /iperf3 -s
syscall(): unimplemented system call 334
warning: this system does not seem to support IPv6 - trying IPv4
-----------------------------------------------------------
Server listening on 5201
-----------------------------------------------------------
Aborted

[backtrace]
0x000000004022133d <???+1075974973>
0x00000000403eace2 <osv::handle_mmap_fault(unsigned long, int, exception_frame*)+34>
0x00000000402b950a <mmu::vm_fault(unsigned long, exception_frame*)+298>
0x000000004030bd53 <page_fault+147>
0x000000004030aa82 <???+1076931202>
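
For context, iperf3's Ctrl-C handling (sigend_handler()/sigend_jmp_buf in main.c, as seen in the gdb backtrace below) boils down to a setjmp()/longjmp() pair. A minimal sketch of that pattern, using the names from the backtrace, is below; this is an approximation for illustration, not iperf3's actual source:

/* repro.c -- approximation of iperf3's Ctrl-C handling; sigend_handler and
 * sigend_jmp_buf are the names seen in the gdb backtrace, not copied from
 * iperf3. Build e.g.: gcc -static -o repro repro.c */
#include <setjmp.h>
#include <signal.h>
#include <stdio.h>
#include <unistd.h>

static jmp_buf sigend_jmp_buf;

static void sigend_handler(int sig)
{
    (void)sig;
    /* In glibc, longjmp()/siglongjmp() call _longjmp_unwind(), which calls
     * __pthread_cleanup_upto() to walk the current thread's thread-local
     * cleanup list -- the frame that faults in the gdb backtrace below. */
    longjmp(sigend_jmp_buf, 1);
}

int main(void)
{
    signal(SIGINT, sigend_handler);
    if (setjmp(sigend_jmp_buf)) {
        printf("interrupted\n");
        return 0;
    }
    for (;;)
        pause();   /* wait for Ctrl-C, like the iperf3 server loop */
}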

Connecting to gdb yields this:

(gdb) bt
#0  0x0000000040312942 in processor::cli_hlt () at arch/x64/processor.hh:247
#1  arch::halt_no_interrupts () at arch/x64/arch.hh:61
#2  osv::halt () at arch/x64/power.cc:29
#3  0x0000000040245344 in abort (fmt=fmt@entry=0x405cd413 "Aborted\n") at runtime.cc:145
#4  0x0000000040202ee5 in abort () at runtime.cc:106
#5  0x000000004022133e in osv::generate_signal (siginfo=..., ef=0x40007fe83088) at libc/signal.cc:141
#6  0x00000000403eace3 in osv::handle_mmap_fault (addr=addr@entry=0, sig=sig@entry=11, ef=ef@entry=0x40007fe83088) at libc/signal.cc:156
#7  0x00000000402b950b in mmu::vm_sigsegv (ef=0x40007fe83088, addr=0) at core/mmu.cc:1428
#8  mmu::vm_fault (addr=0, addr@entry=1688, ef=ef@entry=0x40007fe83088) at core/mmu.cc:1456
#9  0x000000004030bd54 in page_fault (ef=0x40007fe83088) at arch/x64/mmu.cc:42
#10 <signal handler called>
#11 __GI___pthread_cleanup_upto (target=target@entry=0x200000205080 <sigend_jmp_buf>, targetframe=targetframe@entry=0x40007fe93fc8 "\005\"D\177") at ./nptl/pthread_cleanup_upto.c:32
#12 0x000020007f44232c in _longjmp_unwind (env=env@entry=0x200000205080 <sigend_jmp_buf>, val=val@entry=1) at ../sysdeps/nptl/jmp-unwind.c:27
#13 0x000020007f442205 in __libc_siglongjmp (env=0x200000205080 <sigend_jmp_buf>, val=1) at ../setjmp/longjmp.c:30
#14 0x0000200000202543 in sigend_handler (sig=2) at main.c:121
#15 0x000000004037f0ee in sched::thread::main (this=0x40007fe7e040) at core/sched.cc:1415
#16 sched::thread_main_c (t=0x40007fe7e040) at arch/x64/arch-switch.hh:377
#17 0x000000004030bc52 in thread_main () at arch/x64/entry.S:161
(gdb) osv info threads
  id  address             name            cpu  status         vruntime  total_cpu_time location
   1 (0x400000019040) reclaimer       cpu0 waiting     4.52364e-25             799 condvar::wait(lockfree::mutex*, sched::timer*) at core/condvar.cc:51
   2 (0x400000056040) kvm_wall_clock_ cpu0 waiting     9.36834e-22           12151 std::_Function_handler<void(), kvmclock::kvmclock()::<lambda()> >::_M_invoke(const std::_Any_data &) at /usr/include/c++/11/bits/std_function.h:290
   3 (0x400000074040) page_pool_l2    cpu0 waiting     8.80017e-22         1135716 memory::page_pool::l2::fill_thread() at core/mempool.cc:1496
   4 (0x400000094040) itimer-real     cpu? unstarted             0               0 ?? at arch/x64/entry.S:160
   5 (0x400000111040) itimer-virt     cpu? unstarted             0               0 ?? at arch/x64/entry.S:160
   6 (0x400000146040) balancer0       cpu0 waiting     2.23752e-21          108849 ?? at arch/x64/entry.S:161
   7 (0x40000015e040) rcu0            cpu0 waiting     9.56428e-25            1723 osv::rcu::cpu_quiescent_state_thread::do_work() at core/rcu.cc:181
   8 (0x400000177040) page_pool_l1_0  cpu0 waiting     8.98783e-23           42318 memory::page_pool::l1::fill_thread() at core/mempool.cc:1381
   9 (0x40000018f040) percpu0         cpu0 waiting     3.23117e-25             584 workman::call_of_duty() at core/percpu-worker.cc:92
  10 (0x4000001a7040) async_worker0   cpu0 waiting      4.3944e-25             760 async::async_worker::run() at core/async.cc:176
  11 (0x4000001bf040) idle0           cpu0 queued              inf      1015916699 std::_Function_handler<void(), sched::cpu::init_idle_thread()::<lambda()> >::_M_invoke(const std::_Any_data &) at /usr/include/c++/11/bits/std_function.h:290
  12 (0x400000a6e040) >init           cpu0 waiting     5.84196e-24             493 osv::acpi_interrupt::process_interrupts() at drivers/acpi.cc:250
  13 (0x400000a85040) >init           cpu0 waiting     5.97121e-24             689 std::_Function_handler<void(), sched::thread::reaper::reaper()::<lambda()> >::_M_invoke(const std::_Any_data &) at /usr/include/c++/11/bits/std_function.h:290
  14 (0x400000a9c040) thread taskq    cpu0 waiting     6.16508e-24            1019 synch_port::_msleep(void*, mtx*, int, char const*, int) at bsd/porting/synch.cc:102
  15 (0x400000aa7040) callout         cpu0 waiting     2.44065e-21          102382 _callout_thread() at bsd/porting/callout.cc:138
  16 (0x400000adc040) netisr          cpu0 waiting     1.64014e-23            2852 netisr_osv_thread_wrapper(void (*)(void*), void*) at bsd/sys/net/netisr1.cc:26
  17 (0x400000c4c040) >init           cpu0 waiting     5.45383e-20        79119367 sched::thread::join() at core/sched.cc:1562
	joining on <optimized out>
  18 (0x400000c66040) virtio-net-rx   cpu0 waiting     2.44023e-28           19225 virtio::virtio_driver::wait_for_queue(virtio::vring*, bool (virtio::vring::*)() const) at drivers/virtio.cc:158
  19 (0x400000c8e040) virtio-tx-0     cpu0 waiting     8.90519e-29             851 osv::xmitter<virtio::net::txq, 4096u, std::function<bool ()>, boost::iterators::function_output_iterator<osv::xmitter_functor<virtio::net::txq> > >::poll_until() at include/osv/percpu_xmit.hh:399
  20 (0x400000da9040) virtio-blk      cpu0 waiting     2.34544e-22          239819 virtio::virtio_driver::wait_for_queue(virtio::vring*, bool (virtio::vring::*)() const) at drivers/virtio.cc:158
  21 (0x400000dc0040) virtio-rng      cpu0 waiting     2.46307e-20          803801 condvar::wait(lockfree::mutex*, sched::timer*) at core/condvar.cc:51
  22 (0x400000dda040) isa-serial-inpu cpu0 waiting      3.5655e-20          251753 console::LineDiscipline::read_poll(console::console_driver*) at drivers/line-discipline.cc:86
  23 (0x400000df1040) kbd-input       cpu0 waiting     1.45144e-23            7330 console::LineDiscipline::read_poll(console::console_driver*) at drivers/line-discipline.cc:86
  24 (0x400000e13040) rand_harvestq   cpu0 waiting     1.80029e-20          681103 synch_port::_msleep(void*, mtx*, int, char const*, int) at bsd/porting/synch.cc:102
  25 (0x400000e2d040) dhcp            cpu0 waiting     7.69433e-22          899366 dhcp::dhcp_worker::dhcp_worker_fn() at core/dhcp.cc:828
  26 (0x400000f53040) /lib64/ld-linux cpu0 waiting     1.21231e-20        12121800 do_poll(std::vector<poll_file, std::allocator<poll_file> >&, boost::optional<std::chrono::time_point<osv::clock::uptime, std::chrono::duration<long, std::ratio<1l, 1000000000l> > > >) at core/poll.cc:275
  27 (0x40007fe7e040) signal_handler  cpu0 running    -7.17013e-19               0 ?? at arch/x64/entry.S:160
Number of threads: 27

One can see that OSv executes the signal handler on a dedicated thread (signal_handler above) that does not share the application thread's thread-local variables. The handler's longjmp() goes through _longjmp_unwind() into glibc's __pthread_cleanup_upto(), which walks the per-thread cleanup list through TLS (see https://github.com/lattera/glibc/blob/895ef79e04a953cac1493863bcae29ad85657ee1/nptl/pt-cleanup.c#L24C1-L38), so it ends up dereferencing the wrong thread's data and faults.
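
To make the TLS aspect concrete, here is a small illustration (not OSv or glibc code): a __thread variable set on the application thread is a different object when read from another thread, which is effectively what happens when __pthread_cleanup_upto() runs on OSv's signal_handler thread:

/* tls-illustration.c -- build e.g.: gcc -pthread -o tls tls-illustration.c */
#include <pthread.h>
#include <stdio.h>

struct cleanup_entry { struct cleanup_entry *next; };

/* Stand-in for glibc's per-thread cleanup list head (THREAD_SELF->cleanup). */
static __thread struct cleanup_entry *cleanup_head;

static void *handler_thread(void *arg)
{
    (void)arg;
    /* This thread has its own zero-initialized copy of cleanup_head; code
     * that expects the application thread's value (as __pthread_cleanup_upto
     * does via THREAD_SELF) would dereference the wrong thread's data here. */
    printf("handler thread sees cleanup_head = %p\n", (void *)cleanup_head);
    return NULL;
}

int main(void)
{
    static struct cleanup_entry e;
    cleanup_head = &e;   /* set only in the application (main) thread's TLS */
    printf("main thread sees    cleanup_head = %p\n", (void *)cleanup_head);

    pthread_t t;
    pthread_create(&t, NULL, handler_thread, NULL);
    pthread_join(t, NULL);
    return 0;
}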
