Infinite layout oscillation loop (hang) in set_osec_offsets with --pack-dyn-relocs=android #1595

@Jwata

Description

When linking a shared library targeting ARM64 Android with --pack-dyn-relocs=android (Android Packed Relocations / APS2) and a large page size (e.g. -z max-page-size=16384), mold can enter an infinite layout oscillation loop inside set_osec_offsets(), causing the linker to hang indefinitely at 100% CPU on a single thread.

(The repro steps and analysis were assisted by Gemini.)

Steps to Reproduce

1. Save the following assembly as repro.S:

.section .custom_rodata_a, "a", %progbits
.hidden A
.global A
A:
    .space 8

.section .custom_rodata_b_huge, "a", %progbits
.balign 16384
.hidden B
.global B
B:
    .space 8

.section .data, "aw", %progbits
# Generate exactly 7439 dummy relocations to pad .rela.dyn size
.rept 7439
.quad A
.endr

.global P_1
P_1:
    .quad A

.global P_2
P_2:
    .quad B

2. Run the compiler and linker:

# Compile the assembly
clang++ --target=aarch64-linux-android29 -c repro.S -o repro.o

# Link with mold (using pack-dyn-relocs=android)
# This command hangs indefinitely (100% CPU on a single thread)
mold -shared -o repro.so repro.o --pack-dyn-relocs=android -z max-page-size=16384

Expected Behavior

The linker should successfully resolve the layout, terminate within a second, and generate repro.so.

Actual Behavior

mold hangs indefinitely. If you capture a stack trace or core dump while it is hanging, the top frame is almost always std::__insertion_sort, called from mold::encode_android inside mold::RelDynSection<E>::update_shdr.

Technical Analysis & Root Cause

The hang occurs because of an interaction between variable-length SLEB128 encoding used by Android packed relocations and alignment constraints in subsequent sections, leading to an infinite feedback loop with no iteration cap in set_osec_offsets():

// src/passes.cc
template <typename E>
i64 set_osec_offsets(Context<E> &ctx) {
  for (;;) {
    if (ctx.arg.section_order.empty())
      set_virtual_addresses_regular(ctx);
    ...
    if (ctx.arg.pack_dyn_relocs_relr || ctx.arg.pack_dyn_relocs_android) {
      i64 x = ctx.reldyn->shdr.sh_size;
      ctx.reldyn->update_shdr(ctx);
      if (x != (i64)ctx.reldyn->shdr.sh_size || ...)
        continue; // <--- Loops back endlessly if size oscillates
    }
    ...
  }
}

Detailed Oscillation Trace (at N = 7439):

  1. Iteration 1: .rela.dyn size is estimated at 7393 bytes.
    • This places .custom_rodata_a (containing A) at address 0x2001 (8193).
    • .custom_rodata_b_huge (containing B, aligned to 16KB) remains fixed at 0x4000 (16384) due to alignment padding.
    • The addend delta B - A is 16384 - 8193 = 8191.
    • 8191 is the largest value that fits in 2 bytes under SLEB128 (a 2-byte encoding covers -8192 to 8191).
    • encode_android() encodes the relocation table, and because the delta fits in 2 bytes, the final calculated size of .rela.dyn is 7392 bytes.
    • Since $7393 \ne 7392$, the size changed $\implies$ continue (loop repeats).
  2. Iteration 2: .rela.dyn size is now estimated at 7392 bytes (1 byte smaller).
    • This shifts .custom_rodata_a (and A's address) down by 1 byte to 0x2000 (8192).
    • B still remains fixed at 0x4000 (16384) due to the 16KB alignment padding.
    • The addend delta B - A becomes 16384 - 8192 = 8192.
    • 8192 crosses the boundary and now requires 3 bytes in SLEB128.
    • encode_android() encodes the table, and because the delta requires 3 bytes, the final calculated size of .rela.dyn becomes 7393 bytes.
    • Since $7392 \ne 7393$, the size changed $\implies$ continue (loop repeats).
  3. Iteration 3: The size estimate goes back to 7393 bytes, matching Iteration 1. The layout oscillates between 7392 and 7393 bytes indefinitely.
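
The 2-byte versus 3-byte boundary in the trace above can be checked with a small sketch of SLEB128 length computation. The helper name sleb128_size is hypothetical and is not taken from mold's source; it only mirrors the standard SLEB128 rule of 7 payload bits per byte with sign extension.

```cpp
#include <cstdint>

// Hypothetical helper (not mold's API): number of bytes needed to encode
// `value` as SLEB128, which stores 7 payload bits per byte.
constexpr int sleb128_size(int64_t value) {
  int bytes = 0;
  for (;;) {
    uint8_t low7 = static_cast<uint8_t>(value & 0x7f);
    value >>= 7;  // arithmetic shift, so negative values sign-extend
    bytes++;
    bool sign_bit = (low7 & 0x40) != 0;
    // Stop once the remaining bits are pure sign extension of this byte.
    if ((value == 0 && !sign_bit) || (value == -1 && sign_bit))
      return bytes;
  }
}

// The two deltas from the trace: B - A with A at 0x2001 and at 0x2000.
static_assert(sleb128_size(16384 - 8193) == 2, "8191 fits in 2 bytes");
static_assert(sleb128_size(16384 - 8192) == 3, "8192 needs 3 bytes");
```

Moving A by a single byte therefore changes the encoded size by a single byte, which is exactly the amount that moves A back.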

Note on the Stack Trace:

Because encode_android() is called continuously in this infinite loop, and it performs a CPU-heavy ranges::sort over the entire dynamic relocation array in every iteration, the process spends >99% of its CPU cycles inside the sorting code. As a result, post-mortem dumps or debugger samples almost always capture it inside std::__insertion_sort.
──────

Proposed Fix

The layout convergence loop in set_osec_offsets() should have a maximum iteration cap (e.g., 10 or 20 iterations). If the sizes fail to converge after the limit:

  1. Break the loop.
  2. Pad the section size (sh_size) to the maximum observed size during the iterations to ensure it is safely oversized and stable, rather than looping indefinitely.
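
As a rough illustration of the two steps above, here is a minimal sketch of a capped convergence loop. It is not mold's code: converge_with_cap, LayoutResult, and toy_recompute are hypothetical names, and the recompute callback stands in for "lay out sections with this size estimate, then re-encode .rela.dyn and return its new size". The toy callback simply reproduces the 7392/7393 flip from the trace.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical result type for this sketch.
struct LayoutResult {
  int64_t size;     // size to commit as sh_size
  int iterations;   // how many recomputations were performed
  bool converged;   // false if the cap was hit
};

// Repeats `recompute` until the size stops changing or `max_iters` is
// reached. On a cap hit, falls back to the largest size observed so the
// section is oversized but stable.
template <typename F>
LayoutResult converge_with_cap(int64_t initial, F recompute,
                               int max_iters = 20) {
  int64_t size = initial;
  int64_t max_seen = initial;
  for (int i = 1; i <= max_iters; i++) {
    int64_t next = recompute(size);
    max_seen = std::max(max_seen, next);
    if (next == size)
      return {size, i, true};
    size = next;
  }
  return {max_seen, max_iters, false};
}

// Toy stand-in for the real layout step: flips between the two sizes
// from the trace and is a fixed point everywhere else.
inline int64_t toy_recompute(int64_t size) {
  if (size == 7393) return 7392;
  if (size == 7392) return 7393;
  return size;
}
```

With the toy callback, converge_with_cap(7393, toy_recompute) gives up after 20 iterations and reports 7393 bytes. A real fix would also have to decide how the unused trailing byte of the padded section is filled, which this sketch does not address.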
