Infinite layout oscillation loop (hang) in `set_osec_offsets` with `--pack-dyn-relocs=android` #1595

When linking a shared library for ARM64 Android with `--pack-dyn-relocs=android` (Android packed relocations, APS2) and a large page size (e.g. `-z max-page-size=16384`), `mold` can enter an infinite layout oscillation loop inside `set_osec_offsets()`. The linker then hangs indefinitely at 100% CPU on a single thread.

(The repro steps and analysis were assisted by Gemini.)

  ## Steps to Reproduce

  ### 1. Save this assembly file as `repro.S`:

    .section .custom_rodata_a, "a", %progbits
    .hidden A
    .global A
    A:
        .space 8

    .section .custom_rodata_b_huge, "a", %progbits
    .balign 16384
    .hidden B
    .global B
    B:
        .space 8

    .section .data, "aw", %progbits
    # Generate exactly 7439 dummy relocations to pad .rela.dyn size
    .rept 7439
        .quad A
    .endr

    .global P_1
    P_1:
        .quad A

    .global P_2
    P_2:
        .quad B

  ### 2. Run the compiler and linker:

    # Compile the assembly
    clang++ --target=aarch64-linux-android29 -c repro.S -o repro.o

    # Link with mold (using pack-dyn-relocs=android)
    # This command hangs indefinitely (100% CPU on a single thread)
    mold -shared -o repro.so repro.o --pack-dyn-relocs=android -z max-page-size=16384

  ## Expected Behavior

  The linker should successfully resolve the layout, terminate within a second, and generate `repro.so`.

  ## Actual Behavior

  `mold` hangs indefinitely. A stack trace or core dump captured during the hang almost always lands in `std::__insertion_sort`, called by `mold::encode_android` from `mold::RelDynSection<E>::update_shdr`.

  ## Technical Analysis & Root Cause

  The hang is caused by an interaction between the variable-length SLEB128 encoding used by Android packed relocations and the alignment constraints of subsequent sections. Together they form a feedback loop, and `set_osec_offsets()` has no iteration cap to break it:

    // src/passes.cc
    template <typename E>
    i64 set_osec_offsets(Context<E> &ctx) {
      for (;;) {
        if (ctx.arg.section_order.empty())
          set_virtual_addresses_regular(ctx);
        ...
        if (ctx.arg.pack_dyn_relocs_relr || ctx.arg.pack_dyn_relocs_android) {
          i64 x = ctx.reldyn->shdr.sh_size;
          ctx.reldyn->update_shdr(ctx);
          if (x != (i64)ctx.reldyn->shdr.sh_size || ...)
            continue; // <--- Loops back endlessly if size oscillates
        }
        ...
      }
    }

  ### Detailed Oscillation Trace (at N = 7439):

  1. Iteration 1: the `.rela.dyn` size is estimated at 7393 bytes.
      • This places `.custom_rodata_a` (containing `A`) at address `0x2001` (8193).
      • `.custom_rodata_b_huge` (containing `B`, aligned to 16 KiB) remains fixed at `0x4000` (16384) due to alignment padding.
      • The addend delta `B - A` is `16384 - 8193 = 8191`.
      • `8191` is the largest value that fits in 2 bytes under SLEB128.
      • `encode_android()` encodes the relocation table, and because the delta fits in 2 bytes, the final calculated size of `.rela.dyn` is 7392 bytes.
      • Since $7393 \ne 7392$, the size changed $\implies$ `continue` (loop repeats).
  2. Iteration 2: the `.rela.dyn` size is now estimated at 7392 bytes (1 byte smaller).
      • This shifts `.custom_rodata_a` (and the address of `A`) down by 1 byte to `0x2000` (8192).
      • `B` still remains fixed at `0x4000` (16384) due to the 16 KiB alignment padding.
      • The addend delta `B - A` becomes `16384 - 8192 = 8192`.
      • `8192` crosses the boundary and now requires 3 bytes in SLEB128.
      • `encode_android()` encodes the table, and because the delta requires 3 bytes, the final calculated size of `.rela.dyn` becomes 7393 bytes.
      • Since $7392 \ne 7393$, the size changed $\implies$ `continue` (loop repeats).
  3. Iteration 3: the size estimate goes back to 7393 bytes, matching Iteration 1. The layout oscillates between 7392 and 7393 bytes forever.
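
The boundary behaviour in the trace can be checked with a small standalone sketch. This is not mold's code: `sleb128_size` is a plain SLEB128 byte counter, and `recompute_size` is a hypothetical toy model whose constants (`kAddrBase`, `kFixedBytes`) are back-derived from the numbers in the trace, assuming the `B - A` delta is the only field whose encoded width changes.

```cpp
#include <cstdint>

// Number of bytes needed to encode `val` as SLEB128 (7 payload bits per byte).
constexpr int sleb128_size(int64_t val) {
  int n = 0;
  for (;;) {
    uint8_t byte = val & 0x7f;
    val >>= 7; // arithmetic shift on mainstream compilers, keeps the sign
    n++;
    // Done once the remaining bits are only the sign extension of bit 6.
    if ((val == 0 && !(byte & 0x40)) || (val == -1 && (byte & 0x40)))
      return n;
  }
}

// Toy model of one layout iteration (assumed constants, taken from the trace).
constexpr int64_t kAddrB = 0x4000;           // B is pinned by its 16 KiB alignment
constexpr int64_t kAddrBase = 0x2001 - 7393; // address of A minus the .rela.dyn size
constexpr int64_t kFixedBytes = 7390;        // assumed size of everything but the delta

// Given the size assumed for .rela.dyn, return the size the encoder would produce.
constexpr int64_t recompute_size(int64_t estimate) {
  int64_t addr_a = kAddrBase + estimate;
  return kFixedBytes + sleb128_size(kAddrB - addr_a);
}

// Compile-time checks that mirror the trace.
static_assert(sleb128_size(8191) == 2, "8191 fits in 2 bytes");
static_assert(sleb128_size(8192) == 3, "8192 needs 3 bytes");
static_assert(recompute_size(7393) == 7392, "iteration 1 shrinks by one byte");
static_assert(recompute_size(7392) == 7393, "iteration 2 grows by one byte");
```

Because `recompute_size` maps 7393 to 7392 and 7392 back to 7393, it has no fixed point in this model, which is the oscillation described above.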

  ### Note on the Stack Trace:

  Because `encode_android()` is called on every pass of this infinite loop, and each call performs a CPU-heavy `ranges::sort` over the entire dynamic relocation array, the process spends >99% of its CPU cycles inside the sorting code. As a result, post-mortem dumps or debugger samples almost always capture it inside `std::__insertion_sort`.

  ## Proposed Fix

  The layout convergence loop in `set_osec_offsets()` should have a maximum iteration cap (e.g., 10 or 20 iterations). If the sizes fail to converge within the limit:

  1. Break the loop.
  2. Pad the section size (`sh_size`) to the maximum size observed during the iterations, so that it is safely oversized and stable, rather than looping indefinitely.
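
A minimal sketch of the capped loop, under stated assumptions: `settle_size`, `LayoutResult`, and the two toy functors are hypothetical names for illustration and are not part of mold. The `recompute` callable stands in for "lay out with this assumed size, then re-encode and report the new size".

```cpp
#include <algorithm>
#include <cstdint>

// Outcome of the capped fixed-point search.
struct LayoutResult {
  int64_t size;   // section size to use
  int iterations; // number of recomputations performed
  bool converged; // false if the cap was hit and the size was padded
};

// Hypothetical helper: iterate until the size stops changing or the cap is hit.
// On failure, fall back to the largest size seen so the section is oversized.
template <typename F>
constexpr LayoutResult settle_size(F recompute, int64_t initial, int max_iters) {
  int64_t cur = initial;
  int64_t max_seen = initial;
  for (int i = 1; i <= max_iters; i++) {
    int64_t next = recompute(cur);
    max_seen = std::max(max_seen, next);
    if (next == cur)
      return {cur, i, true}; // fixed point reached
    cur = next;
  }
  return {max_seen, max_iters, false}; // give up and pad to the maximum
}

// Toy stand-in that oscillates like the trace: 7393 -> 7392 -> 7393 -> ...
struct Oscillating {
  constexpr int64_t operator()(int64_t size) const {
    return size == 7393 ? 7392 : 7393;
  }
};

// Toy stand-in that settles immediately at 7392 bytes.
struct Stable {
  constexpr int64_t operator()(int64_t) const { return 7392; }
};

static_assert(!settle_size(Oscillating{}, 7393, 10).converged, "cap is hit");
static_assert(settle_size(Oscillating{}, 7393, 10).size == 7393, "padded to max");
static_assert(settle_size(Stable{}, 7393, 10).converged, "normal case converges");
static_assert(settle_size(Stable{}, 7393, 10).size == 7392, "keeps exact size");
```

With the size pinned at 7393, the layout matches Iteration 1 of the trace, which needs only 7392 bytes, so the content fits with one spare byte. A real fix would still need to confirm that the trailing byte is harmless in the packed relocation format.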
