Code huge pages on lld-style PIE binaries (sublime, Discord, slack, libjvm) #5

Merged
anadav merged 4 commits into main from feat/uniform-segment-shift on Apr 29, 2026

@anadav anadav commented Apr 29, 2026

Summary

Three commits on top of main that add real code-huge-page support across all the LOAD-segment shapes hugifyr currently sees:

  • b4fa5f8 — regression test for chunk-isolation of LOAD RE's last 2MB chunk.
  • ceec465 — first cut at lld-style PIE: a padding-only path that establishes p_offset % 2MB == p_vaddr % 2MB on the exec LOAD without changing any vaddr. Safe but does not actually enable code huge pages.
  • 943d280 — replaces the padding-only path with an end-aligned section-aware shift. lld-style binaries (sublime, Discord, slack, libjvm-shape) now end LOAD RE on a 2MB boundary and become real code-huge-page candidates.

The motivation is the lld-style layout: .rodata / .eh_frame* / .gcc_except_table live in seg0 (the first read-only LOAD), and .text reaches them through RIP-relative references. The original main path shifted .text without shifting that data, breaking RIP-relative LEAs at runtime.

What changed in 943d280

  • AdjInfo now carries the seg0 movable vaddr ranges (everything in seg0 that's SHF_ALLOC, non-SHT_NOBITS, and not in relocatable_section_types). calc_adjusted_addr shifts addresses inside those ranges by the same vaddr_delta as everything at-or-after the exec LOAD — relocations / symbols / dynamic-section pointers / PT_GNU_EH_FRAME / DWARF references stay consistent.
  • adjust_program_headers extends seg0 LOAD R's filesz/memsz to cover the shifted contents and clamps LOAD RE's p_vaddr to max(round_down(p_vaddr, 2MB), seg0_end_after_shift) so seg0 LOAD R and LOAD RE never overlap in vaddr space.
  • segment_offset_delta and section_offset_delta are recomputed against the clamped p_vaddr so sh_offset - sh_addr == p_offset - p_vaddr holds for every section in LOAD RE (kernel constraint for a single file-backed mapping).
  • pad_segment_start rewritten to pad only the gap between the last non-executable section in LOAD RE and the first executable section — never over the ELF header / PHDR / .interp / .note*. This fixes the 2-LOAD R+E first ("combined" / -z noseparate-code) layout.
  • pad_offset_to_match_vaddr removed.
  • Test harness: new check_exec_load_end_aligned; check_re_chunk_isolation now requires only that fully-covered 2MB chunks be exclusive code; both checks wired into every test_load_layouts variant (default, combined, lld).
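The address remapping in the first bullet can be illustrated with a small Python sketch. It is not hugifyr's C implementation: the function name `adjust_addr`, the tuple-based range list, and the sample addresses are assumptions made for this example.

```python
from typing import List, Tuple


def adjust_addr(addr: int,
                old_exec_vaddr: int,
                vaddr_delta: int,
                movable_ranges: List[Tuple[int, int]]) -> int:
    """Return the post-shift address for `addr` (illustrative only).

    Addresses at or after the old exec LOAD start, and addresses inside a
    movable seg0 range [start, end), move by the same vaddr_delta.
    Everything else (ELF header, .dynsym, ...) keeps its address.
    """
    if addr >= old_exec_vaddr:
        return addr + vaddr_delta
    for start, end in movable_ranges:
        if start <= addr < end:
            return addr + vaddr_delta
    return addr


# Hypothetical layout: .rodata occupies 0x1000..0x3000 in seg0, .text starts at 0x5000.
ranges = [(0x1000, 0x3000)]
delta = 0x1FB000
rodata_ref = adjust_addr(0x1800, 0x5000, delta, ranges)
text_ref = adjust_addr(0x5040, 0x5000, delta, ranges)
```

Because both addresses move by the same delta, `text_ref - rodata_ref` equals the original distance, which is the property a RIP-relative displacement depends on.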

For modern 4-LOAD PIE and 2-LOAD R+E first ("combined") the new code is a no-op — seg0_end_after_shift = 0 collapses the clamp into the existing round_down.
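As a rough illustration of why a zero seg0_end_after_shift makes the clamp a no-op, here is a minimal Python sketch of the max/round_down expression from the description. The helper names and the sample addresses are assumptions for the example, not code from hugifyr.c.

```python
HUGE = 0x200000  # 2 MB


def round_down(value: int, align: int = HUGE) -> int:
    """Round `value` down to a multiple of `align` (a power of two)."""
    return value & ~(align - 1)


def clamp_exec_vaddr(p_vaddr: int, seg0_end_after_shift: int) -> int:
    """max(round_down(p_vaddr, 2MB), seg0_end_after_shift), as in the description."""
    return max(round_down(p_vaddr), seg0_end_after_shift)


# No movable seg0 content: the clamp reduces to plain round_down.
modern = clamp_exec_vaddr(0x3FB000, 0)
# Hypothetical lld-style case: shifted seg0 contents end at 0x2A1000,
# so LOAD RE cannot start below that address.
lld_style = clamp_exec_vaddr(0x3FB000, 0x2A1000)
```

In the first call the result is the 2MB-aligned start; in the second the start is pushed up to the end of the shifted seg0 contents so the two LOADs do not overlap.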

Test plan

  • make clean.
  • cd tests && python3 test.py: test_basic, test_load_layouts (default / combined / lld), TLS, and TLS relocations all pass; every layout variant satisfies offset%2MB == vaddr%2MB, end-aligned, and chunk-isolation.
  • Real lld-style PIE smoke test: sublime_text Build 4180, Discord 0.0.135, slack 4.42.117 — --version output identical to original; LOAD RE ends on 2MB; full-chunk isolation OK.
  • Modern PIE non-regression (megasync / cisco-cef equivalents) — output unchanged.
  • ET_EXEC fallback non-regression (cloudflared / anydesk / terraform / ngrok) — output unchanged.
  • libjvm.so end-to-end (tmp/jdk-21.0.11/lib/server/libjvm.so, 2-LOAD R+E first / ~20 MB of code): a Java workload exercising JIT, GC, parallel Streams, ConcurrentHashMap, ExecutorService, recursion, and 12 MB of allocation churn returns bit-identical output through the hugified library on the host.
  • /boot/vmlinuz-6.14.11rothp VM with READ_ONLY_THP_FOR_FS=y, hugified libjvm.so, THP=always: after running the workload, the libjvm.so r-xp mapping reports THPeligible: 1 (was 0 on host) and FilePmdMapped: 16384 kB — i.e. khugepaged collapsed eight 2 MB file PMDs over the 20 MB exec mapping. Workload output matches host.
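The THPeligible and FilePmdMapped figures in the last item come from the per-mapping fields in /proc/<pid>/smaps. A minimal sketch of how such a check could be scripted follows; the function name `thp_report`, the sample path, and the sample text are made up for illustration and only mirror the shape of the reported numbers.

```python
import re
from typing import Dict, List

# Mapping header line: "start-end perms offset dev inode [path]"
HEADER = re.compile(r"^[0-9a-f]+-[0-9a-f]+\s+(\S+)\s+\S+\s+\S+\s+\S+\s*(.*)$")


def thp_report(smaps_text: str, needle: str) -> List[Dict[str, int]]:
    """Collect THPeligible / FilePmdMapped for r-xp mappings whose path contains `needle`."""
    results: List[Dict[str, int]] = []
    current = None
    for line in smaps_text.splitlines():
        m = HEADER.match(line)
        if m:
            perms, path = m.group(1), m.group(2)
            current = None
            if perms == "r-xp" and needle in path:
                current = {"THPeligible": 0, "FilePmdMapped": 0}
                results.append(current)
            continue
        if current is None:
            continue
        key, _, rest = line.partition(":")
        if key in current:
            current[key] = int(rest.split()[0])  # value in kB, or 0/1 for THPeligible
    return results


# Illustrative smaps excerpt (not captured from a real run).
sample = """\
7f5a00000000-7f5a01400000 r-xp 00000000 08:01 131 /tmp/jdk/lib/server/libjvm.so
Size:              20480 kB
FilePmdMapped:     16384 kB
THPeligible:           1
7f5a01400000-7f5a01600000 rw-p 01400000 08:01 131 /tmp/jdk/lib/server/libjvm.so
Size:               2048 kB
FilePmdMapped:         0 kB
THPeligible:           0
"""
report = thp_report(sample, "libjvm.so")
pmds = report[0]["FilePmdMapped"] // 2048  # number of 2 MB PMD mappings
```

On a live system the text would come from reading /proc/<pid>/smaps for the running JVM; dividing FilePmdMapped by 2048 kB gives the number of collapsed 2 MB pages.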

🤖 Generated with Claude Code

anadav and others added 2 commits April 29, 2026 17:14
The hugifyr transformation aims to make the kernel grant code huge
pages on the binary's executable LOAD. For that to work, every 2MB
chunk that LOAD RE touches must be exclusively RE — if a non-exec
LOAD's vaddr range overlaps any of those chunks, mmap-order overlay
mixes protections and the kernel can't issue a code huge page on it.

Add a parser for readelf -lW LOAD entries plus check_re_chunk_isolation
which asserts no other LOAD's vaddr range intersects an RE 2MB chunk.
Wire it into test_basic. The check fires loudly if a future change to
the layout pass picks a vaddr_delta that's just large enough to land
.text on a 2MB boundary but not large enough to push subsequent LOADs
out of RE's last chunk — i.e. start-aligning instead of end-aligning
the executable segment.
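The strict form of the check described above can be sketched as follows. This is an assumption-laden illustration, not the code in tests/test.py: the helper names are invented, and the LOAD dictionaries only reuse the 'vaddr', 'memsz', and 'flags' keys visible in the test snippets quoted in the review.

```python
HUGE = 0x200000  # 2 MB


def chunks_touched(vaddr, memsz, huge=HUGE):
    """Indices of every 2MB chunk that [vaddr, vaddr + memsz) touches."""
    first = vaddr // huge
    last = (vaddr + memsz - 1) // huge
    return set(range(first, last + 1))


def strict_isolation_violations(loads, huge=HUGE):
    """Return (vaddr, shared_chunks) for each non-exec LOAD sharing a chunk with LOAD RE."""
    exec_load = next(l for l in loads if "E" in l["flags"])
    re_chunks = chunks_touched(exec_load["vaddr"], exec_load["memsz"], huge)
    bad = []
    for l in loads:
        if l is exec_load or l["memsz"] == 0:
            continue
        shared = re_chunks & chunks_touched(l["vaddr"], l["memsz"], huge)
        if shared:
            bad.append((l["vaddr"], sorted(shared)))
    return bad


# Start-aligned layout: .text lands on 2MB, but LOAD RW still sits in RE's last chunk.
start_aligned = [
    {"vaddr": 0x0, "memsz": 0x1000, "flags": "R"},
    {"vaddr": 0x200000, "memsz": 0x250000, "flags": "RE"},
    {"vaddr": 0x451000, "memsz": 0x2000, "flags": "RW"},
]
# Layout where RE ends on a 2MB boundary and the next LOAD starts in a new chunk.
isolated = [
    {"vaddr": 0x0, "memsz": 0x1000, "flags": "R"},
    {"vaddr": 0x200000, "memsz": 0x200000, "flags": "RE"},
    {"vaddr": 0x400000, "memsz": 0x2000, "flags": "RW"},
]
```

The first layout reports the RW segment as sharing chunk 2 with LOAD RE, which is the start-aligning failure the commit message warns about; the second reports nothing.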

Also add test_load_layouts that builds test1.c with default ld and with
-Wl,-z,noseparate-code (Oracle JDK-style combined R+E first segment)
and verifies hugifyr produces a runnable binary for each. The lld-style
layout (rodata in seg0, used by Chromium-based apps and Sublime Text)
isn't covered here because hugifyr's main path doesn't currently handle
it: shifting .text without also shifting seg0's rodata breaks
RIP-relative LEAs from code into rodata. Fixing that while keeping
end-alignment is separate work.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The main shifting path crashes lld-style PIE binaries (Sublime Text,
Discord, Slack, MS Edge, Chrome, MongoDB) because their first read-only
LOAD ("seg0") carries .rodata / .eh_frame_hdr / .eh_frame /
.gcc_except_table — sections that .text RIP-references via direct LEA
displacements with no relocation entries. Shifting .text without
shifting those sections invalidates every cross-segment LEA and the
binary segfaults during dl_main / unwinder init.

This commit doesn't fix that fully — moving seg0's rodata into a
shifted segment with end-alignment preserved is structurally bigger
work. It establishes the necessary precondition: a safe transformation
that runs on lld-style binaries, leaves them runnable, and ensures the
exec LOAD's p_offset and p_vaddr have the same residue modulo 2MB.

Detection: seg0_has_movable_sections() walks sections at vaddrs below
the first PF_X LOAD's p_vaddr. Anything SHF_ALLOC, not SHT_NOBITS, and
not in the existing relocatable_section_types whitelist (which already
covers .dynsym, .gnu.hash, .rela.*, .dynamic, .interp, .note.*) is
considered RIP-referenced from code => the binary is lld-style. The
whitelist is conservative; unknown section types route to the safe
padding-only path rather than to the shifting path.
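The detection rule can be sketched in Python as below. The section dictionaries, the function names, and the use of name patterns to stand in for the relocatable_section_types whitelist are assumptions for illustration; the real check in hugifyr.c may represent the whitelist differently.

```python
from fnmatch import fnmatch

SHF_ALLOC = 0x2   # ELF section flag: occupies memory at run time
SHT_NOBITS = 8    # ELF section type: no bytes in the file (e.g. .bss)

# Name patterns standing in for the whitelist listed in the commit message.
RELOCATABLE_PATTERNS = [".dynsym", ".gnu.hash", ".rela.*", ".dynamic",
                        ".interp", ".note.*"]


def is_movable(sec, exec_vaddr):
    """True if `sec` sits below the exec LOAD and is not known to be safely fixed up."""
    if sec["addr"] >= exec_vaddr:
        return False
    if not sec["flags"] & SHF_ALLOC:
        return False
    if sec["type"] == SHT_NOBITS:
        return False
    return not any(fnmatch(sec["name"], p) for p in RELOCATABLE_PATTERNS)


def movable_seg0_sections(sections, exec_vaddr):
    return [s["name"] for s in sections if is_movable(s, exec_vaddr)]


# Hypothetical lld-style section table (type 1 = SHT_PROGBITS, 11 = SHT_DYNSYM).
sections = [
    {"name": ".interp", "addr": 0x2A8, "type": 1, "flags": 0x2},
    {"name": ".dynsym", "addr": 0x2C8, "type": 11, "flags": 0x2},
    {"name": ".rodata", "addr": 0x1000, "type": 1, "flags": 0x2},
    {"name": ".eh_frame", "addr": 0x2000, "type": 1, "flags": 0x2},
    {"name": ".text", "addr": 0x5000, "type": 1, "flags": 0x6},
    {"name": ".comment", "addr": 0x0, "type": 1, "flags": 0x0},
]
names = movable_seg0_sections(sections, exec_vaddr=0x5000)
is_lld_style = bool(names)
```

Any unknown allocated section below the exec LOAD ends up in `names`, which matches the conservative behaviour the commit message describes.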

Padding-only path: pad_offset_to_match_vaddr() computes
delta = (p_vaddr_RE - p_offset_RE) mod 2MB, bumps p_offset of every
phdr at-or-after the original exec offset by delta, bumps every
section's sh_offset similarly, bumps e_shoff, and stamps the first
LOAD's p_align to 2MB. It does NOT touch any p_vaddr / sh_addr /
relocations / symbols / DWARF / build-id. The output is byte-identical
to the input except for the inserted file padding and the updated
offset fields. The transformed binary runs identically to the
original.
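The delta arithmetic of the padding-only path is small enough to show directly. The following Python sketch uses invented helper names and sample offsets; it only illustrates the modular arithmetic, not the file rewriting itself.

```python
HUGE = 0x200000  # 2 MB


def padding_delta(p_vaddr, p_offset, huge=HUGE):
    """File padding needed so that p_offset % huge == p_vaddr % huge."""
    return (p_vaddr - p_offset) % huge  # Python's % is non-negative for huge > 0


def bump_offsets(offsets, exec_offset, delta):
    """Shift every file offset at or after the original exec offset by delta."""
    return [o + delta if o >= exec_offset else o for o in offsets]


# Hypothetical exec LOAD: vaddr 0x1000, file offset 0x3000.
delta = padding_delta(0x1000, 0x3000)
new_offsets = bump_offsets([0x0, 0x3000, 0x8000], exec_offset=0x3000, delta=delta)
residues_match = new_offsets[1] % HUGE == 0x1000 % HUGE
```

Note that when p_offset is already larger than p_vaddr modulo 2MB, the delta wraps around and the padding can approach a full 2 MB.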

Tests:
- check_offset_vaddr_mod_2mb_match: asserts p_offset%2MB ==
  p_vaddr%2MB on the exec LOAD. Wired into test_basic and every
  test_load_layouts variant.
- test_load_layouts gets the lld variant back (built with
  -fuse-ld=lld); it now exercises the new padding-only path.

Verified on real-world closed-source PIE binaries we already had
downloaded:
  - Sublime Text 4180:    --version → "Sublime Text Build 4180" matches
  - Discord 0.0.135:      matches
  - Slack 4.42.117:       matches
  - MEGAsync (modern):    main path, matches
  - Cisco Webex CEF:      main path, matches
  - cloudflared/terraform: ET_EXEC fallback, unchanged

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings April 29, 2026 17:40
… path)

The padding-only path added in ceec465 only fixed the file-side
mod-2MB alignment of LOAD RE without changing any vaddr — so lld-style
binaries became correct but never huge-page-eligible. This commit
replaces it with a transformation that does enable code huge pages on
lld-style PIE.

What's new:
- AdjInfo carries a list of "movable seg0" vaddr ranges: sections in
  seg0 that are SHF_ALLOC, non-NOBITS, and NOT in
  relocatable_section_types, i.e. .rodata, .eh_frame, .eh_frame_hdr, and
  .gcc_except_table. calc_adjusted_addr remaps addresses inside those
  ranges by the same vaddr_delta as everything at-or-after old_exec_vaddr,
  so RIP-relative LEAs from .text into .rodata stay valid after the
  shift. Empty for non-lld binaries (the existing behavior).
- adjust_program_headers extends seg0 LOAD R's filesz/memsz to cover
  the shifted seg0 contents, clamps LOAD RE's p_vaddr to
  max(round_down(p_vaddr,2MB), seg0_end_after_shift) so seg0 LOAD R
  and LOAD RE never overlap in vaddr space, and shifts PT_GNU_EH_FRAME
  (which targets a movable .eh_frame_hdr).
- adjust_section_headers shifts sh_offset for movable seg0 sections;
  seg0 has p_vaddr == p_offset == 0, so the file delta equals
  vaddr_delta.
- segment_offset_delta for lld-style is exec_p_vaddr_clamped -
  old_p_offset (LOAD RE's file region starts where extended seg0 ends);
  section_offset_delta accounts for the clamp so every section in
  LOAD RE has sh_offset_new - sh_addr_new == p_offset_new -
  p_vaddr_clamped (kernel constraint for a single LOAD's file mapping).
- pad_segment_start now fills the gap between the last non-exec section
  and the first executable section in LOAD RE — never below p_vaddr or
  over metadata. This avoids clobbering ELF header / PHDR / .interp /
  .note in the 2-LOAD R+E first ("combined" -z noseparate-code) layout.
- pad_offset_to_match_vaddr removed.
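The per-section constraint sh_offset - sh_addr == p_offset - p_vaddr from the list above can be checked mechanically. The sketch below is a hypothetical helper, with assumed dictionary keys and sample numbers, that reports sections inside a LOAD whose file bias differs from the segment's.

```python
def offset_invariant_violations(load, sections):
    """Names of sections inside `load` whose (offset - addr) differs from the LOAD's."""
    bias = load["offset"] - load["vaddr"]
    lo, hi = load["vaddr"], load["vaddr"] + load["memsz"]
    return [s["name"] for s in sections
            if lo <= s["addr"] < hi and s["offset"] - s["addr"] != bias]


# Hypothetical LOAD RE after the clamp: starts at 0x2A1000, ends at 0x400000.
load_re = {"offset": 0x2A1000, "vaddr": 0x2A1000, "memsz": 0x15F000}
sections = [
    {"name": ".text", "addr": 0x2A2000, "offset": 0x2A2000},
    {"name": ".fini", "addr": 0x3FF000, "offset": 0x3FE000},  # deliberately wrong
    {"name": ".data", "addr": 0x401000, "offset": 0x401000},  # outside LOAD RE
]
violations = offset_invariant_violations(load_re, sections)
```

A real check would likely skip SHT_NOBITS sections, which have no file bytes; that detail is left out here to keep the sketch short.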

For modern PIE (4-LOAD with metadata-only seg0) and 2-LOAD R+E first
("combined") the new code is a no-op: seg0_end_after_shift = 0 makes the
clamp degenerate to the existing round_down.

Tests:
- check_segment_alignment unchanged for the modern path.
- New check_exec_load_end_aligned: every variant must have LOAD RE's end
  on a 2MB boundary.
- check_re_chunk_isolation relaxed to require only that fully-covered
  2MB chunks be exclusive code (partial chunks at the start/end of LOAD
  RE can legitimately share their range with adjacent LOAD R / LOAD RW).
- All three checks (offset/vaddr-mod, end-aligned, chunk-isolation) wired
  into every test_load_layouts variant including lld.
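A minimal sketch of the relaxed rule, which only looks at 2MB chunks lying entirely inside LOAD RE, might look like this. The helper names and both layouts are assumptions for illustration; the actual check_re_chunk_isolation in tests/test.py may be structured differently.

```python
HUGE = 0x200000  # 2 MB


def full_chunks(vaddr, memsz, huge=HUGE):
    """Indices of 2MB chunks lying entirely inside [vaddr, vaddr + memsz)."""
    first = -(-vaddr // huge)        # round the start up to a chunk boundary
    last = (vaddr + memsz) // huge   # exclusive upper chunk index
    return range(first, last)


def overlaps_chunk(load, chunk, huge=HUGE):
    lo, hi = chunk * huge, (chunk + 1) * huge
    return load["memsz"] > 0 and load["vaddr"] < hi and load["vaddr"] + load["memsz"] > lo


def relaxed_isolation_violations(loads, huge=HUGE):
    """Return (chunk, vaddr) pairs where a non-exec LOAD intrudes on a full RE chunk."""
    exec_load = next(l for l in loads if "E" in l["flags"])
    return [(c, l["vaddr"])
            for c in full_chunks(exec_load["vaddr"], exec_load["memsz"], huge)
            for l in loads
            if l is not exec_load and overlaps_chunk(l, c, huge)]


# Hypothetical lld-style output with the clamp applied: RE starts where seg0 ends.
clamped = [
    {"vaddr": 0x0, "memsz": 0x4A1000, "flags": "R"},
    {"vaddr": 0x4A1000, "memsz": 0x35F000, "flags": "RE"},
    {"vaddr": 0x801000, "memsz": 0x2000, "flags": "RW"},
]
# Same seg0, but RE start only rounded down: seg0 now overlaps RE's first full chunk.
unclamped = [
    {"vaddr": 0x0, "memsz": 0x4A1000, "flags": "R"},
    {"vaddr": 0x400000, "memsz": 0x400000, "flags": "RE"},
    {"vaddr": 0x801000, "memsz": 0x2000, "flags": "RW"},
]
```

In the clamped layout the partial chunk at the start of LOAD RE is shared with seg0 and is simply not considered, while the unclamped layout is flagged because seg0 reaches into a chunk that should be pure code.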

Verification:
- test_basic + test_load_layouts (default, combined, lld) + TLS +
  TLS-relocs all pass.
- Real-world smoke test on lld-style PIE: sublime_text (Build 4180),
  Discord (0.0.135), slack (4.42.117) all run identically; LOAD RE ends
  on a 2MB boundary, full chunks isolated.
- libjvm.so (Oracle JDK 21.0.11, 2-LOAD R+E first / 20MB code) runs the
  full Java workload (JIT, GC, Streams, ConcurrentHashMap, Executors,
  recursion) bit-identical to the original.
- Booted under /boot/vmlinuz-6.14.11rothp (READ_ONLY_THP_FOR_FS=y) with
  the hugified libjvm.so: THPeligible=1 (was 0 on host), khugepaged
  collapsed 16384 kB into 8 file-PMD-mapped 2MB pages on the libjvm.so
  r-xp mapping after running the workload.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Copilot AI left a comment

Pull request overview

Adds lld-style PIE support to hugifyr’s huge-page transformation by shifting RIP-referenced seg0 content alongside the exec LOAD (instead of shifting .text alone), and strengthens regression coverage for multiple PT_LOAD layout shapes.

Changes:

  • Extend address-adjustment logic to also shift “movable” seg0 section ranges (lld-style layouts) and clamp exec LOAD start to avoid vaddr overlap.
  • Recompute offset deltas against the clamped exec p_vaddr, update PHDR/SHDR adjustments accordingly, and refine exec-segment padding behavior.
  • Expand test harness with readelf-based validations for offset/vaddr modulo matching, exec LOAD end alignment, and 2MB chunk isolation across layout variants.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.

File | Description
src/hugifyr.c | Implements lld-style PIE handling via movable seg0 ranges, exec p_vaddr clamping, and updated PHDR/SHDR/offset adjustment logic.
tests/test.py | Adds regression checks for exec LOAD modulo constraint, end alignment, and full-chunk isolation; adds a layout-variant test matrix (default/combined/lld).

Comment thread src/hugifyr.c
Comment on lines +1489 to +1493
phdr->p_vaddr = new_pvaddr;
phdr->p_paddr = new_pvaddr;
if (shrink > 0 && i == (size_t)info->exec_index) {
    if (phdr->p_memsz > shrink) phdr->p_memsz -= shrink;
    if (phdr->p_filesz > shrink) phdr->p_filesz -= shrink;

Copilot AI Apr 29, 2026

In the exec LOAD clamping path, when shrink > 0 you only subtract shrink from p_memsz/p_filesz if the size is greater than shrink. If shrink >= p_memsz this leaves the segment end shifted forward (breaking the intended end-alignment and potentially mapping unintended file bytes). This should be handled explicitly (e.g., validate shrink <= p_memsz and fail, or clamp sizes to 0 / recompute sizes consistently for the new p_vaddr).

Suggested change
- phdr->p_vaddr = new_pvaddr;
- phdr->p_paddr = new_pvaddr;
- if (shrink > 0 && i == (size_t)info->exec_index) {
-     if (phdr->p_memsz > shrink) phdr->p_memsz -= shrink;
-     if (phdr->p_filesz > shrink) phdr->p_filesz -= shrink;
+ if (shrink > 0 && i == (size_t)info->exec_index) {
+     if (shrink > phdr->p_memsz || shrink > phdr->p_filesz) {
+         pr_error("Invalid executable LOAD clamp: shrink=%ju memsz=%ju filesz=%ju\n",
+                  (uintmax_t)shrink, (uintmax_t)phdr->p_memsz,
+                  (uintmax_t)phdr->p_filesz);
+         abort();
+     }
+ }
+ phdr->p_vaddr = new_pvaddr;
+ phdr->p_paddr = new_pvaddr;
+ if (shrink > 0 && i == (size_t)info->exec_index) {
+     phdr->p_memsz -= shrink;
+     phdr->p_filesz -= shrink;

Comment thread tests/test.py
Comment on lines +102 to +118
def check_exec_load_end_aligned(filename, huge=0x200000):
    """Verify that for the executable LOAD, p_vaddr + p_memsz lands on a
    2MB boundary AND p_align is 2MB. The END being aligned is what makes
    the last code huge page eligible — it's required regardless of
    whether the START is also 2MB-aligned. (lld-style transformed
    binaries have a non-aligned p_vaddr clamped to seg0_end_after_shift,
    but the end is still extended to a 2MB boundary.)"""
    loads = parse_load_segments(filename)
    exec_load = next((l for l in loads if 'E' in l['flags']), None)
    if not exec_load:
        raise RuntimeError(f"{filename}: no executable LOAD")
    end = exec_load['vaddr'] + exec_load['memsz']
    if end % huge != 0:
        raise RuntimeError(
            f"{filename}: exec LOAD end 0x{end:x} (vaddr=0x{exec_load['vaddr']:x} + "
            f"memsz=0x{exec_load['memsz']:x}) is not 2MB-aligned")
    print(f"exec LOAD end 2MB-aligned OK in {filename} "

Copilot AI Apr 29, 2026

check_exec_load_end_aligned()'s docstring says it verifies both end alignment and that p_align is 2MB, but the implementation never parses or asserts p_align. Either update the docstring to match what’s actually checked, or extend parse_load_segments() to capture p_align and assert it here (so test_load_layouts() also enforces the alignment requirement).
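The second option in this comment could look roughly like the sketch below. It assumes the usual column layout of binutils `readelf -lW` program-header lines (Type, Offset, VirtAddr, PhysAddr, FileSiz, MemSiz, Flg, Align); the names `parse_load_line` and `exec_align_ok` are hypothetical and are not the repository's parse_load_segments.

```python
import re

# One LOAD row of `readelf -lW`; the flags column may contain spaces (e.g. "R E").
LOAD_RE = re.compile(
    r"^\s*LOAD\s+(0x[0-9a-fA-F]+)\s+(0x[0-9a-fA-F]+)\s+(0x[0-9a-fA-F]+)"
    r"\s+(0x[0-9a-fA-F]+)\s+(0x[0-9a-fA-F]+)\s+([RWE ]+?)\s+(0x[0-9a-fA-F]+)\s*$"
)


def parse_load_line(line):
    """Parse one LOAD row into a dict that also carries p_align; None for other rows."""
    m = LOAD_RE.match(line)
    if not m:
        return None
    offset, vaddr, _paddr, filesz, memsz = (int(m.group(i), 16) for i in range(1, 6))
    return {
        "offset": offset, "vaddr": vaddr, "filesz": filesz, "memsz": memsz,
        "flags": m.group(6).replace(" ", ""),
        "align": int(m.group(7), 16),
    }


def exec_align_ok(loads, huge=0x200000):
    """True if the executable LOAD declares p_align == huge."""
    exec_load = next((l for l in loads if "E" in l["flags"]), None)
    if exec_load is None:
        raise RuntimeError("no executable LOAD")
    return exec_load["align"] == huge


# Illustrative readelf rows (not taken from a real binary).
sample = [
    "  LOAD           0x000000 0x0000000000000000 0x0000000000000000 0x2a1000 0x2a1000 R   0x1000",
    "  LOAD           0x2a1000 0x00000000002a1000 0x00000000002a1000 0x15f000 0x15f000 R E 0x200000",
    "  GNU_STACK      0x000000 0x0000000000000000 0x0000000000000000 0x000000 0x000000 RW  0x0",
]
loads = [p for p in map(parse_load_line, sample) if p]
```

With the alignment captured at parse time, the end-alignment check could assert both properties its docstring promises.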

Comment thread tests/test.py
Comment on lines +329 to +332
main shifting path would crash this binary. Routes through the new
padding-only path that establishes
p_offset%2MB == p_vaddr%2MB for the exec LOAD without changing any
vaddrs. Output runs identically to the original."""

Copilot AI Apr 29, 2026

The test_load_layouts() docstring still describes the lld variant as going through a “padding-only path” that keeps vaddrs unchanged. In the current implementation (hugifyr.c) the lld-style case shifts seg0 movable sections and clamps the exec LOAD start, so vaddrs do change. Please update this docstring to reflect the current approach (section-aware shift / end-aligned clamp) so future readers don’t infer the wrong safety properties.

Suggested change
- main shifting path would crash this binary. Routes through the new
- padding-only path that establishes
- p_offset%2MB == p_vaddr%2MB for the exec LOAD without changing any
- vaddrs. Output runs identically to the original."""
+ generic whole-segment shift would crash this binary. Instead, the
+ lld-specific path shifts only movable seg0 sections, then clamps the
+ exec LOAD start to seg0_end_after_shift and pads the file as needed
+ to establish p_offset%2MB == p_vaddr%2MB. Output runs identically to
+ the original."""

@anadav anadav merged commit 46aea45 into main Apr 29, 2026
1 check passed
@anadav anadav deleted the feat/uniform-segment-shift branch April 29, 2026 18:57
anadav added a commit that referenced this pull request May 15, 2026
…ibjvm) (#5)

* Add chunk-isolation regression test for the exec LOAD's last 2MB

* Padding-only path for lld-style PIE; align p_offset%2MB to p_vaddr%2MB

* lld-style PIE: end-aligned section-aware shift (replaces padding-only path)

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>