Proposal: Go Like linktime DCE #1587

@luoliwoshang

Goals

  • Reduce final binary size while preserving behavior compatibility.
  • Borrow the Go linker's "reachability analysis + method table sentinel" model to handle Go semantic dependencies that default ELF/COFF linkers cannot see.
  • Lay the foundation for future whole-program passes on bitcode (get the marking right first; pruning/backfilling can iterate later).

Current State and Pain Points

  • Build pipeline: each .go/.c file produces a separate .o (or .bc), followed by a single final link. Ordinary public symbols (functions/globals) can already be pruned through normal relocations; the problem is the implicit dependencies created by Go semantics.
  • Go features that create "implicit references":
    • itab / type metadata Ifn/Tfn pointers: once considered reachable, methods cannot be pruned even if never called.
    • Generics/interfaces/reflect: method/type reachability is not always inferable from direct instruction references.
    • Method tables hardcode real addresses, so the linker cannot see "optional" relationships and cannot apply sentinels.
  • Result: many unused methods/type metadata are retained and the binary grows.

Reference relationships between abi.Type and abi.Method

Example Go code (struct with methods):

type Game struct{}
func (g *Game) Load() {}
func (g *Game) initGame() {} // assume unexported

Generated metadata (simplified):

@_llgo_github.com/.../Game = {
  StructType { ... },
  UncommonType {
    PkgPath, Mcount=2, Xcount=1, Moff=... -> points to [2]Method
  },
  [2]Method [
    { Name="Load", Mtyp=&func(...), Ifn=@Game.Load (IFn), Tfn=@Game.Load (Tfn) },
    { Name="initGame", Mtyp=&func(...), Ifn=@Game.initGame, Tfn=@Game.initGame }
  ]
}

Symbol reference relationships:

  • abi.Type (the type descriptor for Game) contains UncommonType, whose Moff points to the method array [2]Method.
  • Each abi.Method contains:
    • Name (method name symbol, possibly full or package-qualified)
    • Mtyp (method function type symbol type.*func)
    • Ifn (wrapper symbol for interface calls)
    • Tfn (symbol for direct method calls)
  • In the default layout, Ifn/Tfn are real function pointers. The linker treats them as strong dependencies and cannot determine what can be pruned.

Current problem (from LLD's perspective):

  • Even if initGame is never called, the method table contains real Ifn/Tfn pointers. If main references Game, LLD treats all type metadata as reachable and thus keeps Tfn/Ifn (even if unused), so it cannot prune unused methods.

How Go does it

Phase 1: per-package marking

  • The compiler records semantic markers on each function/symbol (relocations or attributes). Each package emits its own .o:
    • R_USEIFACE: on type -> interface conversion, mark the current function symbol; target is the concrete type type.*.
      Example: in main, converting B to interface A writes R_USEIFACE in main.main pointing at type.B.
    • R_USEIFACEMETHOD: on interface method call, mark current function; target is the interface type.*, with method offset attached.
    • R_USENAMEDMETHOD: conservative mark when only the method name is known (generic interface call or MethodByName("Foo") constant name), written on current function symbol; target is the method name string.
    • R_METHODOFF in method tables: one each for Mtyp/Ifn/Tfn, embedded in type metadata.

Conceptual flow (per-package mark -> link-time reachability)
(Relation: main.main has R_USEIFACE(type.B) reaching type.B; some function has R_USENAMEDMETHOD("Foo") reaching method name Foo; type.B's method table has three R_METHODOFF, which can be real offsets or sentinels based on reachability.)

Concrete example: source -> markers (inside one .o)

// src/main.go
package main

import "reflect"

type B struct{}
func (B) Load() {}
func (B) hidden() {}

type A interface { Load() }

func main() {
    var i A = B{}                 // write R_USEIFACE(type.B) on main.main
    i.Load()                      // write R_USEIFACEMETHOD (iface type A + method offset) on main.main
    _, _ = reflect.TypeOf(i).MethodByName("Load") // if name is constant, write R_USENAMEDMETHOD("Load")
}

Where the marks are placed:

  • main.main relocation table:
    • R_USEIFACE -> Sym = type.B
    • R_USEIFACEMETHOD -> Sym = type.A, Add = offset of the Load entry within the interface type descriptor
    • R_USENAMEDMETHOD("Load") -> Sym = string "Load" (if name is constant)
  • type.B method table (uncommon.Methods):
    • Each method carries three R_METHODOFF relocations (three for Load, three for hidden); reachability decides whether each one resolves to a real offset or a sentinel.
  • Link-time flood: once main.main is reachable, these marks reach type.B, interface method needs, and method table Ifn/Tfn, then decide whether to write real offsets or sentinels.
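
The mark placement above can be pictured as a flat table of records. The following is a minimal sketch, not Go's or LLGo's actual representation: the `Mark` struct, the `marksForExample` helper, the placeholder target names for the method table slots, and the zero addends are all assumptions made for illustration. Only the kind names come from the text.

```go
package main

import "fmt"

// MarkKind mirrors the Go-semantic relocation kinds named in this proposal.
// The numeric values are arbitrary and do not match Go's objabi constants.
type MarkKind int

const (
	UseIface       MarkKind = iota // R_USEIFACE
	UseIfaceMethod                 // R_USEIFACEMETHOD
	UseNamedMethod                 // R_USENAMEDMETHOD
	MethodOff                      // R_METHODOFF
)

func (k MarkKind) String() string {
	return [...]string{"R_USEIFACE", "R_USEIFACEMETHOD", "R_USENAMEDMETHOD", "R_METHODOFF"}[k]
}

// Mark is one hypothetical record: Owner carries the mark, Target is what it points at.
type Mark struct {
	Kind   MarkKind
	Owner  string
	Target string
	Add    int64 // addend, e.g. a method offset; left at 0 here as a placeholder
}

// marksForExample lists the marks the B/A example would produce.
func marksForExample() []Mark {
	marks := []Mark{
		{Kind: UseIface, Owner: "main.main", Target: "type.B"},
		{Kind: UseIfaceMethod, Owner: "main.main", Target: "type.A"},
		{Kind: UseNamedMethod, Owner: "main.main", Target: "Load"},
	}
	// Each method of B contributes one R_METHODOFF per slot (Mtyp, Ifn, Tfn).
	for _, m := range []string{"Load", "hidden"} {
		for _, slot := range []string{"Mtyp", "Ifn", "Tfn"} {
			marks = append(marks, Mark{Kind: MethodOff, Owner: "type.B", Target: "B." + m + "." + slot})
		}
	}
	return marks
}

func main() {
	for _, m := range marksForExample() {
		fmt.Printf("%-18s %s -> %s\n", m.Kind, m.Owner, m.Target)
	}
}
```

Keeping the marks in one flat table, separate from the method table itself, matches the idea of leaving the metadata layout untouched until the backfill step.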

(Compiler sources that write marks: cmd/compile/internal/reflectdata/reflect.go in MarkTypeUsedInInterface/MarkUsedIfaceMethod; cmd/compile/internal/walk/expr.go in usemethod, etc.)

Phase 2: global link-time reachability (deadcode)

  • With compile-time marks, the linker does a global flood:
    • Root set: main/main..inittask, runtime base symbols, plugin/export entries, runtime.unreachableMethod, etc. In shared library mode, conservatively mark all defined symbols in the library.
    • Traverse relocations: normal references mark their targets reachable; interface/reflect marks drive method/type retention; each method table R_METHODOFF resolves to a real offset if its target is reachable and to a sentinel if not.
    • Reflection/generic marks (including AttrReflectMethod, R_USENAMEDMETHOD, etc.) decide conservative retention of exported methods.
  • Result: preserve truly needed interface/reflect paths and write sentinels for unused methods/metadata, allowing code to be dropped.
    • Dynamic export symbols are roots (dynexp list): all externally exported symbols are marked reachable during deadcode init to avoid incorrect pruning.

Simplest flood example (BFS queue):

// root: main
package main

func main() {
    foo() // main's reloc references foo
}

func foo() { // foo calls bar
    bar()
}

func bar() {}

Reachability traversal: root set contains main; pop main -> see reference to foo, mark(foo) and enqueue; pop foo -> see reference to bar, mark(bar) and enqueue; pop bar -> no new refs; queue empty. Reachable: main, foo, bar.
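
The traversal just described can be sketched as a plain worklist algorithm. This is a simplified illustration, not the code in cmd/link: the `flood` function, the string-keyed edge map, and the extra `main.unused` symbol are assumptions introduced for the example.

```go
package main

import "fmt"

// flood returns the set of symbols reachable from roots by following
// ordinary reference edges (refs maps a symbol to the symbols it references).
func flood(roots []string, refs map[string][]string) map[string]bool {
	reachable := make(map[string]bool)
	var queue []string

	// mark records a symbol as reachable and enqueues it exactly once.
	mark := func(s string) {
		if !reachable[s] {
			reachable[s] = true
			queue = append(queue, s)
		}
	}

	for _, r := range roots {
		mark(r)
	}
	for len(queue) > 0 {
		s := queue[0]
		queue = queue[1:]
		for _, t := range refs[s] {
			mark(t)
		}
	}
	return reachable
}

func main() {
	refs := map[string][]string{
		"main.main":   {"main.foo"},
		"main.foo":    {"main.bar"},
		"main.unused": {"main.bar"}, // never reached from the root
	}
	r := flood([]string{"main.main"}, refs)
	fmt.Println(r["main.foo"], r["main.bar"], r["main.unused"]) // true true false
}
```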

(Link-time reachability/backfill sources: cmd/link/internal/ld/deadcode.go handles root set and R_USEIFACE/R_USEIFACEMETHOD/R_USENAMEDMETHOD/R_METHODOFF; cmd/link/internal/ld/data.go writes real offsets for reachable R_METHODOFF and -1 sentinel for unreachable.)

Marks LLGo needs (align with Go)

  • Interface/reflect/generic:
    • R_USEIFACE, R_USEIFACEMETHOD, R_USENAMEDMETHOD, AttrReflectMethod.
  • Method table:
    • R_METHODOFF (Mtyp/Ifn/Tfn triplet, write real offset or sentinel).
  • Type metadata retention:
    • R_USETYPE (type descriptors for reflection/debugging).

Design approach (LLGo)

  1. Collection (SSA/IR generation)

    • While emitting LLVM IR, append a custom mark table (suggested global @__llgo_relocs) recording these marks, without changing method table layout.
    • For method tables, record R_METHODOFF for Mtyp/Ifn/Tfn; for interface conversions/calls and reflection, write the other marks.
  2. Merge (bitcode)

    • In GenBC mode, merge all modules, read @__llgo_relocs, and build a "symbol dependency + mark" graph.
  3. Reachability and backfill (later iteration)

    • Follow Go deadcode: root set (main/main..inittask, etc.) -> flood.
    • Types UsedInIface -> method table candidates; reachable R_METHODOFF write real offsets; unreachable write sentinel (-1/0).
    • Reflection/generic marks decide which exported methods or same-name methods must be conservatively retained.
  4. Output

    • Only emit @__llgo_relocs when marks exist and the option is enabled, to avoid impacting old workflows.
    • After sentinel backfill, pass to the regular linker/LLD.
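
Step 3 can be sketched as a flood that also tracks which types were converted to interfaces and which method names are wanted. This is a rough model under stated assumptions: the `program` struct and `deadcode` function are hypothetical, R_USEIFACEMETHOD and R_USENAMEDMETHOD are collapsed into a single "wanted method name" set, and methods are matched by name only (Go's linker also compares signatures).

```go
package main

import "fmt"

// method is one method table entry: its name and the symbol of its body.
type method struct {
	Name string
	Fn   string
}

// program is a hypothetical merged view built from ordinary references plus the mark table.
type program struct {
	refs      map[string][]string // ordinary reference edges
	useIface  map[string][]string // function -> concrete types converted to an interface
	useMethod map[string][]string // function -> interface/named methods it may call
	methods   map[string][]method // type -> method table
}

// deadcode floods from roots, then repeatedly revives methods of
// interface-converted types whose names are wanted, until nothing changes.
func deadcode(p program, roots []string) map[string]bool {
	reachable := map[string]bool{}
	usedInIface := map[string]bool{}
	wanted := map[string]bool{}
	var queue []string
	mark := func(s string) {
		if !reachable[s] {
			reachable[s] = true
			queue = append(queue, s)
		}
	}
	for _, r := range roots {
		mark(r)
	}
	for {
		for len(queue) > 0 {
			s := queue[0]
			queue = queue[1:]
			for _, t := range p.refs[s] {
				mark(t)
			}
			for _, t := range p.useIface[s] {
				usedInIface[t] = true
			}
			for _, m := range p.useMethod[s] {
				wanted[m] = true
			}
		}
		for typ := range usedInIface {
			for _, m := range p.methods[typ] {
				if wanted[m.Name] {
					mark(m.Fn)
				}
			}
		}
		if len(queue) == 0 {
			return reachable
		}
	}
}

func main() {
	p := program{
		refs:      map[string][]string{"B.Load": {"main.helper"}},
		useIface:  map[string][]string{"main.main": {"type.B"}},
		useMethod: map[string][]string{"main.main": {"Load"}},
		methods:   map[string][]method{"type.B": {{"Load", "B.Load"}, {"hidden", "B.hidden"}}},
	}
	r := deadcode(p, []string{"main.main"})
	fmt.Println(r["B.Load"], r["main.helper"], r["B.hidden"]) // true true false
}
```

The outer loop matters because a revived method body can itself convert new types or call new interface methods, so the sets must be re-checked until the queue stays empty.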

Extra roots for LLGo adaptation (exports/special packages)

  • The Go linker treats dynamically exported symbols (dynexp) as roots. LLGo should likewise mark externally exported and special entry symbols as roots so they are never pruned.
  • LLGo-specific C package and Python module init paths:
    • Package name C triggers ctx.initFiles(pkgPath, files, pkgName == "C") and generates C-layer exports.
    • Python modules initialize via ctx.initPyModule(), whose export entry should also be a root.
  • Suggested adaptation: at root-set creation, mark exported symbols, C package export entries, and Python module init symbols as reachable, then flood with the mark table.
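
The suggested root-set construction might look like the following sketch. The `symInfo` flags and the `rootSet` helper are hypothetical; how LLGo would actually identify C package export entries or Python module init symbols is not specified here, and the symbol names in `main` are invented placeholders.

```go
package main

import "fmt"

// symInfo is a hypothetical per-symbol summary available at root-set creation.
type symInfo struct {
	Name     string
	Exported bool // externally exported symbol
	CEntry   bool // export entry generated for a package named C
	PyInit   bool // Python module init entry
}

// rootSet returns the entry symbols plus every symbol that must never be pruned,
// in a stable order and without duplicates.
func rootSet(entry []string, syms []symInfo) []string {
	seen := map[string]bool{}
	var roots []string
	add := func(s string) {
		if !seen[s] {
			seen[s] = true
			roots = append(roots, s)
		}
	}
	for _, e := range entry {
		add(e)
	}
	for _, s := range syms {
		if s.Exported || s.CEntry || s.PyInit {
			add(s.Name)
		}
	}
	return roots
}

func main() {
	syms := []symInfo{
		{Name: "main.helper"},
		{Name: "exported_fn", Exported: true},
		{Name: "c_export_entry", CEntry: true},
		{Name: "py_module_init", PyInit: true},
	}
	fmt.Println(rootSet([]string{"main.main", "main..inittask"}, syms))
}
```

The resulting slice would then be passed as the root set to the flood, together with the mark table.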

Example (why a normal linker is not enough)

  • Struct T has an unused method Hidden, but its Ifn/Tfn are written into type metadata. A normal linker sees real function pointers and treats them as strong dependencies, so it cannot delete Hidden.
  • With R_METHODOFF: if flood deems Hidden unreachable, write -1 in the method table. Interface/reflect will not call it, and the body can be dropped.
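
The sentinel write can be sketched as a small resolution step over the method table. This is an illustrative model only: the `methodSyms`/`methodOffs` types, the `backfill` function, the symbol names, and the offsets are all assumptions, and real code would patch constant initializers or relocations rather than build a Go slice.

```go
package main

import "fmt"

// sentinel marks a method slot whose target was not reached by the flood.
const sentinel int32 = -1

// methodSyms names the three symbols one method table entry refers to.
type methodSyms struct {
	Name           string
	Mtyp, Ifn, Tfn string
}

// methodOffs is the same entry after each R_METHODOFF has been resolved.
type methodOffs struct {
	Name           string
	Mtyp, Ifn, Tfn int32
}

// backfill writes a real offset for every reachable target and the sentinel otherwise.
func backfill(table []methodSyms, reachable map[string]bool, offsets map[string]int32) []methodOffs {
	resolve := func(sym string) int32 {
		if reachable[sym] {
			return offsets[sym]
		}
		return sentinel
	}
	out := make([]methodOffs, 0, len(table))
	for _, m := range table {
		out = append(out, methodOffs{m.Name, resolve(m.Mtyp), resolve(m.Ifn), resolve(m.Tfn)})
	}
	return out
}

func main() {
	table := []methodSyms{
		{"Load", "type.func()", "T.Load$ifn", "T.Load"},
		{"Hidden", "type.func(int)", "T.Hidden$ifn", "T.Hidden"},
	}
	reachable := map[string]bool{"type.func()": true, "T.Load$ifn": true, "T.Load": true}
	offsets := map[string]int32{
		"type.func()": 64, "T.Load$ifn": 256, "T.Load": 288,
		"type.func(int)": 96, "T.Hidden$ifn": 320, "T.Hidden": 352,
	}
	for _, m := range backfill(table, reachable, offsets) {
		fmt.Printf("%s: Mtyp=%d Ifn=%d Tfn=%d\n", m.Name, m.Mtyp, m.Ifn, m.Tfn)
	}
}
```

Once Hidden's slots hold the sentinel instead of a function address, nothing references its body any more, so an ordinary linker is free to discard it.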

Current progress

  • Added optional switch GenRelocLL to collect method table R_METHODOFF and emit @__llgo_relocs (only when records exist and the switch is on).
  • Default is still to not emit marks, to avoid differences in existing use cases.
  • Next work: complete interface/reflect/generic mark collection and flood/backfill logic.

Next steps (suggested)

  • Extend SSA collection: interface conversions/calls, reflect MethodByName, generic interface calls should write corresponding marks.
  • Implement a conservative flood: first handle UsedInIface + method table sentinels, verify size wins.
  • Then refine reflection retention strategy and R_USETYPE, gradually align with Go deadcode behavior.

Tradeoff: relocation info (reuse existing .o vs extra IR marks)

  • Normal function/data references: the compiler already emits standard relocations in .o/.bc (e.g., R_ADDR/R_CALL), and the linker can build A->B edges from those; no need to duplicate ordinary references in IR.
  • Go-specific semantics: standard relocations cannot express R_USEIFACE/R_USEIFACEMETHOD/R_USENAMEDMETHOD/R_METHODOFF marks, so IR/BC needs extra records (e.g., @__llgo_relocs) and the merged module uses them for reachability decisions and sentinel backfill.
  • To reuse "all reference info," you must read the .o (or LLVM's equivalent); merging BC alone is not a substitute for reading standard relocations. The value of BC merge is to unify custom marks and method tables in a global view.
  • If an IR/BC pass is chosen as the only entry point (for both source builds and package reuse), avoid also parsing Go object relocations:
    • It creates "dual systems" (Go obj reloc + IR marks) and increases inconsistency/maintenance cost.
    • LLVM IR already expresses ordinary call/reference edges, and @__llgo_relocs covers Go semantic edges for a unified analysis.
  • For LLGo package reuse: compiled .o may be reused by other projects/links, so Go-semantic marks should be stored in the .o as well (R_USEIFACE/R_USEIFACEMETHOD/R_USENAMEDMETHOD/R_METHODOFF), so link-time pruning/backfill still works when not doing a local IR merge.
  • Current issue: LLGo's final output uses LLVM lld, which does not understand Go-specific marks/sentinels. If not handled beforehand, reachability pruning and method table backfill will not happen. A pre-link step (IR merge/custom pass) or custom link step must handle Go semantic pruning and sentinel writing before lld does normal layout.
  • Open issue: if lld does the final write, how do we modify metadata based on reachability (e.g., write -1 for unreachable R_METHODOFF)? A custom pass or pre-link step must modify constant initializers/relocations; otherwise lld only writes real addresses and cannot write sentinels.
  • Further concern: global metadata is spread across package .o files and may include third-party .o/.a. If you only prune at local IR level, you cannot delete existing metadata in .o; cross-package dependencies (A depends on B, C; A prunes C but B actually references C) can break. To prune reliably, you must handle all input .o marks and rewrite their metadata at final link time; otherwise external .o are unaffected.

Two possible implementation paths (current conclusion)

  1. Pre-link processing (feasible but limited): merge bitcode or read .o, run a custom tool/LLVM pass for reachability and metadata rewriting, then output a new .o for lld. Limitation: existing external .o/.a metadata cannot be rewritten at this stage (unless all inputs are parsed and rewritten), so reuse scenarios are still limited.
  2. Custom lld (high cost): modify/extend lld to recognize Go semantic marks and write sentinels/prune metadata at write-out time. Cost is high and outside normal usage.

Supplement: ship bitcode with .o, merge and run pass, then generate final .o

  • For self-produced .o, embed or sidecar bitcode (similar to "fat" object/ThinLTO). Before final link, extract all available bitcode, merge into one Module, run a custom pass (reachability + sentinel backfill), then emit a new .o for lld.
  • External .o/.a without bitcode are treated as black boxes and not pruned; such inputs should not carry Go-specific marks anyway when reused by the C ecosystem.
  • Value: enable Go semantic pruning for "our" code at global scope without requiring lld to understand the marks; limitation: third-party pure .o still cannot be rewritten.
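
One way to picture the split between bitcode-carrying inputs and black boxes is the planning step below. It is a sketch under assumptions, not a settled design: the `input` struct and `plan` function are hypothetical, and treating symbols that opaque inputs reference as extra roots is one conservative way to avoid the cross-package breakage described earlier, not something this proposal has decided.

```go
package main

import "fmt"

// input describes one link input. HasBitcode is true for objects that ship
// embedded or sidecar bitcode and can therefore be merged and rewritten.
type input struct {
	Name       string
	HasBitcode bool
	Defines    []string // symbols defined by this input
	Undefined  []string // symbols referenced but not defined by this input
}

// plan splits inputs into mergeable and opaque sets and returns extra roots:
// symbols defined in mergeable inputs that an opaque input references.
func plan(inputs []input) (merge, opaque, extraRoots []string) {
	definedInBitcode := map[string]bool{}
	for _, in := range inputs {
		if in.HasBitcode {
			merge = append(merge, in.Name)
			for _, s := range in.Defines {
				definedInBitcode[s] = true
			}
		} else {
			opaque = append(opaque, in.Name)
		}
	}
	seen := map[string]bool{}
	for _, in := range inputs {
		if in.HasBitcode {
			continue
		}
		for _, s := range in.Undefined {
			if definedInBitcode[s] && !seen[s] {
				seen[s] = true
				extraRoots = append(extraRoots, s)
			}
		}
	}
	return merge, opaque, extraRoots
}

func main() {
	inputs := []input{
		{Name: "a.o", HasBitcode: true, Defines: []string{"a.Run", "a.unused"}},
		{Name: "c.o", HasBitcode: true, Defines: []string{"c.Helper"}},
		{Name: "b.a", HasBitcode: false, Undefined: []string{"c.Helper", "malloc"}},
	}
	merge, opaque, roots := plan(inputs)
	fmt.Println(merge, opaque, roots) // [a.o c.o] [b.a] [c.Helper]
}
```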
