Goals
- Reduce final binary size while preserving behavior compatibility.
- Learn from Go linker's "reachability analysis + method table sentinel" model to handle Go semantic dependencies that default ELF/COFF linkers cannot see.
- Lay the foundation for future whole-program passes on bitcode (get the marking right first; pruning/backfilling can iterate later).
Current State and Pain Points
- Build pipeline: each .go/.c produces a separate .o (or .bc), then a single final link. Normal public symbols (functions/globals) can be pruned via ordinary relocations. The problems are Go semantic implicit dependencies.
- Go features that create "implicit references":
  - itab / type metadata Ifn/Tfn pointers: once considered reachable, methods cannot be pruned even if never called.
  - Generics/interfaces/reflect: method/type reachability is not always inferable from direct instruction references.
  - Method tables hardcode real addresses, so the linker cannot see "optional" relationships and cannot apply sentinels.
- Result: many unused methods/type metadata are retained and the binary grows.
Reference relationships between abi.Type and abi.Method
Example Go code (struct with methods):
type Game struct{}
func (g *Game) Load() {}
func (g *Game) initGame() {} // assume unexported
Generated metadata (simplified):
@_llgo_github.com/.../Game = {
StructType { ... },
UncommonType {
PkgPath, Mcount=2, Xcount=1, Moff=... -> points to [2]Method
},
[2]Method [
{ Name="Load", Mtyp=&func(...), Ifn=@Game.Load (IFn), Tfn=@Game.Load (Tfn) },
{ Name="initGame", Mtyp=&func(...), Ifn=@Game.initGame, Tfn=@Game.initGame }
]
}
Symbol reference relationships:
- abi.Type (the type descriptor for Game) contains UncommonType, whose Moff points to the method array [2]Method.
- Each abi.Method contains:
  - Name (method name symbol, possibly full or package-qualified)
  - Mtyp (method function type symbol type.*func)
  - Ifn (wrapper symbol for interface calls)
  - Tfn (symbol for direct method calls)
- In the default layout, Ifn/Tfn are real function pointers. The linker treats them as strong dependencies and cannot determine what can be pruned.
Current problem (from LLD's perspective):
- Even if initGame is never called, the method table contains real Ifn/Tfn pointers. If main references Game, LLD treats all type metadata as reachable and thus keeps Tfn/Ifn (even if unused), so it cannot prune unused methods.
How Go does it
Phase 1: per-package marking
- The compiler records semantic markers on each function/symbol (relocations or attributes). Each package emits its own .o:
  - R_USEIFACE: on type -> interface conversion, mark the current function symbol; target is the concrete type type.*.
    Example: in main, converting B to interface A writes R_USEIFACE in main.main pointing at type.B.
  - R_USEIFACEMETHOD: on interface method call, mark current function; target is the interface type.*, with method offset attached.
  - R_USENAMEDMETHOD: conservative mark when only the method name is known (generic interface call or MethodByName("Foo") constant name), written on current function symbol; target is the method name string.
  - R_METHODOFF in method tables: one each for Mtyp/Ifn/Tfn, embedded in type metadata.
Conceptual flow (per-package mark -> link-time reachability)
(Relation: main.main has R_USEIFACE(type.B) reaching type.B; some function has R_USENAMEDMETHOD("Foo") reaching method name Foo; type.B's method table has three R_METHODOFF, which can be real offsets or sentinels based on reachability.)
Concrete example: source -> markers (inside one .o)
// src/main.go
package main

import "reflect"

type B struct{}

func (B) Load() {}
func (B) hidden() {}

type A interface{ Load() }

func main() {
	var i A = B{} // write R_USEIFACE(type.B) on main.main
	i.Load() // write R_USEIFACEMETHOD (iface type A + method offset) on main.main
	_, _ = reflect.TypeOf(i).MethodByName("Load") // if name is constant, write R_USENAMEDMETHOD("Load")
}
Where the marks are placed:
- main.main relocation table:
  - R_USEIFACE -> Sym = type.B
  - R_USEIFACEMETHOD -> Sym = type.A, Add = offset of the Load method entry within the type.A descriptor
  - R_USENAMEDMETHOD("Load") -> Sym = string "Load" (if name is constant)
- type.B method table (uncommon.Methods):
  - Each method carries three R_METHODOFF relocations (so Load and hidden have three each); reachability decides whether each one resolves to a real offset or a sentinel.
- Link-time flood: once main.main is reachable, these marks reach type.B, the interface method requirements, and the method table Ifn/Tfn entries; the linker then decides whether to write real offsets or sentinels.
(Compiler sources that write marks: cmd/compile/internal/reflectdata/reflect.go in MarkTypeUsedInInterface/MarkUsedIfaceMethod; cmd/compile/internal/walk/expr.go in usemethod, etc.)
Phase 2: global link-time reachability (deadcode)
- With compile-time marks, the linker does a global flood:
  - Root set: main/main..inittask, runtime base symbols, plugin/export entries, runtime.unreachableMethod, etc. In shared library mode, conservatively mark all defined symbols in the library.
  - Traverse relocations: normal references mark their targets reachable; interface/reflect marks drive method/type retention; each method table R_METHODOFF becomes a real offset if its target is reachable and a sentinel if not.
  - Reflection/generic marks (including AttrReflectMethod, R_USENAMEDMETHOD, etc.) decide conservative retention of exported methods.
- Result: preserve truly needed interface/reflect paths and write sentinels for unused methods/metadata, allowing code to be dropped.
  - Dynamic export symbols are roots (dynexplist): all externally exported symbols are marked reachable during deadcode init to avoid incorrect pruning.
Simplest flood example (BFS queue):
// root: main
package main

func main() {
	foo() // main's reloc references foo
}

func foo() { // foo calls bar
	bar()
}

func bar() {}
Reachability traversal: root set contains main; pop main -> see reference to foo, mark(foo) and enqueue; pop foo -> see reference to bar, mark(bar) and enqueue; pop bar -> no new refs; queue empty. Reachable: main, foo, bar.
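The traversal can be written out as a small, self-contained program. This is a minimal sketch, not the Go linker's code: the flood function and the refs map are hypothetical names, and the extra symbol baz is added only to show something that stays unreachable.

```go
package main

import (
	"fmt"
	"sort"
)

// flood returns every symbol reachable from roots by following refs,
// using a FIFO work queue (breadth-first order).
func flood(refs map[string][]string, roots []string) map[string]bool {
	reachable := make(map[string]bool)
	queue := make([]string, 0, len(roots))
	mark := func(sym string) {
		if !reachable[sym] {
			reachable[sym] = true
			queue = append(queue, sym)
		}
	}
	for _, r := range roots {
		mark(r)
	}
	for len(queue) > 0 {
		sym := queue[0]
		queue = queue[1:]
		for _, target := range refs[sym] {
			mark(target)
		}
	}
	return reachable
}

func main() {
	// Reference edges: main -> foo -> bar. baz is hypothetical and never referenced.
	refs := map[string][]string{
		"main": {"foo"},
		"foo":  {"bar"},
		"bar":  nil,
		"baz":  nil,
	}
	live := flood(refs, []string{"main"})
	names := make([]string, 0, len(live))
	for name := range live {
		names = append(names, name)
	}
	sort.Strings(names)
	fmt.Println(names) // [bar foo main]
}
```

Anything absent from the returned set (here baz) is a pruning candidate.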
(Link-time reachability/backfill sources: cmd/link/internal/ld/deadcode.go handles root set and R_USEIFACE/R_USEIFACEMETHOD/R_USENAMEDMETHOD/R_METHODOFF; cmd/link/internal/ld/data.go writes real offsets for reachable R_METHODOFF and -1 sentinel for unreachable.)
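How the interface marks change the flood can be sketched in the same style. This is a simplified model under stated assumptions, not the logic in deadcode.go: Kind, Mark, Method, Program and deadcode are hypothetical names, R_USEIFACEMETHOD is reduced to a method name (the real relocation targets the interface type plus an offset), and methods are matched by name only rather than by name and signature.

```go
package main

import "fmt"

// Kind identifies a mark. The comments name the Go relocation each value
// stands in for; the enum itself is hypothetical.
type Kind int

const (
	Ref            Kind = iota // ordinary call/address reference
	UseIface                   // R_USEIFACE: Target is a concrete type symbol
	UseIfaceMethod             // R_USEIFACEMETHOD: Target is a method name (simplified)
	UseNamedMethod             // R_USENAMEDMETHOD: Target is a method name
)

type Mark struct {
	Kind   Kind
	Target string
}

type Method struct{ Name, Fn string }

type Program struct {
	Marks   map[string][]Mark   // symbol -> marks it carries
	Methods map[string][]Method // type symbol -> method table
}

// deadcode floods from roots. A method body becomes live only when its type
// was converted to an interface AND some live code asked for that method name.
func deadcode(p Program, roots []string) map[string]bool {
	live := map[string]bool{}
	usedInIface := map[string]bool{}
	wanted := map[string]bool{}
	var queue []string
	mark := func(s string) {
		if !live[s] {
			live[s] = true
			queue = append(queue, s)
		}
	}
	for _, r := range roots {
		mark(r)
	}
	for {
		for len(queue) > 0 {
			s := queue[0]
			queue = queue[1:]
			for _, m := range p.Marks[s] {
				switch m.Kind {
				case Ref:
					mark(m.Target)
				case UseIface:
					usedInIface[m.Target] = true
					mark(m.Target)
				case UseIfaceMethod, UseNamedMethod:
					wanted[m.Target] = true
				}
			}
		}
		// Match method-table candidates against the wanted names; newly
		// marked methods may carry more marks, so repeat until nothing changes.
		for typ := range usedInIface {
			for _, m := range p.Methods[typ] {
				if wanted[m.Name] {
					mark(m.Fn)
				}
			}
		}
		if len(queue) == 0 {
			break
		}
	}
	return live
}

func main() {
	p := Program{
		Marks: map[string][]Mark{
			"main.main": {{UseIface, "type.B"}, {UseIfaceMethod, "Load"}},
		},
		Methods: map[string][]Method{
			"type.B": {{"Load", "main.B.Load"}, {"hidden", "main.B.hidden"}},
		},
	}
	live := deadcode(p, []string{"main.main"})
	fmt.Println(live["main.B.Load"], live["main.B.hidden"]) // true false
}
```

The key difference from a plain flood is that the method table is not followed as ordinary edges: type.B being live does not by itself keep main.B.hidden.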
Marks LLGo needs (align with Go)
- Interface/reflect/generic: R_USEIFACE, R_USEIFACEMETHOD, R_USENAMEDMETHOD, AttrReflectMethod.
- Method table: R_METHODOFF (Mtyp/Ifn/Tfn triplet, write real offset or sentinel).
- Type metadata retention: R_USETYPE (type descriptors for reflection/debugging).
Design approach (LLGo)
- Collection (SSA/IR generation)
  - While emitting LLVM IR, append a custom mark table (suggested global @__llgo_relocs) recording these marks, without changing method table layout.
  - For method tables, record R_METHODOFF for Mtyp/Ifn/Tfn; for interface conversions/calls and reflection, write the other marks.
- Merge (bitcode)
  - In GenBC mode, merge all modules, read @__llgo_relocs, and build a "symbol dependency + mark" graph.
- Reachability and backfill (later iteration)
  - Follow Go deadcode: root set (main/main..inittask, etc.) -> flood.
  - Types UsedInIface -> method table candidates; reachable R_METHODOFF write real offsets; unreachable write sentinel (-1/0).
  - Reflection/generic marks decide which exported methods or same-name methods must be conservatively retained.
- Output
  - Only emit @__llgo_relocs when marks exist and the option is enabled, to avoid impacting old workflows.
  - After sentinel backfill, pass to the regular linker/LLD.
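The merge step can be pictured as combining per-module mark tables into one index keyed by the symbol that carries each mark. The sketch below is illustrative only: the record layout of @__llgo_relocs is not specified here, so the Reloc struct, its field names, and the merge function are assumptions, as is the idea that the same record may appear in more than one module.

```go
package main

import "fmt"

// Reloc is a hypothetical in-memory form of one @__llgo_relocs record.
type Reloc struct {
	Kind   string // e.g. "R_USEIFACE", "R_METHODOFF"
	Owner  string // symbol that carries the mark
	Target string // symbol (or method name) the mark points at
	Add    int64  // addend, e.g. a method offset
}

// Graph indexes marks by owning symbol, ready for a flood.
type Graph map[string][]Reloc

// merge combines the mark tables of several modules and drops exact
// duplicates (assumed to occur when two modules emit the same record).
func merge(modules ...[]Reloc) Graph {
	g := Graph{}
	seen := map[Reloc]bool{}
	for _, mod := range modules {
		for _, r := range mod {
			if seen[r] {
				continue
			}
			seen[r] = true
			g[r.Owner] = append(g[r.Owner], r)
		}
	}
	return g
}

func main() {
	pkgA := []Reloc{
		{"R_USEIFACE", "main.main", "type.B", 0},
		{"R_METHODOFF", "type.B", "main.B.Load", 0},
	}
	pkgB := []Reloc{
		{"R_METHODOFF", "type.B", "main.B.Load", 0}, // duplicate of the pkgA record
		{"R_METHODOFF", "type.B", "main.B.hidden", 0},
	}
	g := merge(pkgA, pkgB)
	fmt.Println(len(g["main.main"]), len(g["type.B"])) // 1 2
}
```

Keeping the marks in a side table like this leaves the method table layout itself untouched, which matches the collection step above.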
Extra roots for LLGo adaptation (exports/special packages)
- Go linker treats dynamically exported symbols (dynexp) as roots. LLGo should also mark externally exported/special entry symbols as roots to avoid pruning.
- LLGo-specific C package and Python module init paths:
  - Package name C triggers ctx.initFiles(pkgPath, files, pkgName == "C") and generates C-layer exports.
  - Python modules initialize via ctx.initPyModule(), whose export entry should also be a root.
- Suggested adaptation: at root-set creation, mark exported symbols, C package export entries, and Python module init symbols as reachable, then flood with the mark table.
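A root set built along these lines might look like the following sketch. The Symbol struct, its flags, and the symbol names in main are all hypothetical; they stand in for whatever LLGo actually records about exported symbols, C package export entries, and Python module init entries.

```go
package main

import (
	"fmt"
	"sort"
)

// Symbol is a hypothetical summary of a defined symbol. The flag names are
// illustrative and do not correspond to existing LLGo fields.
type Symbol struct {
	Name     string
	Exported bool // externally exported symbol
	CExport  bool // export entry generated for a package named C
	PyInit   bool // Python module initialization entry
}

// rootSet returns the program entry points plus every symbol that must
// survive pruning regardless of whether anything references it.
func rootSet(entries []string, syms []Symbol) []string {
	set := map[string]bool{}
	for _, e := range entries {
		set[e] = true
	}
	for _, s := range syms {
		if s.Exported || s.CExport || s.PyInit {
			set[s.Name] = true
		}
	}
	roots := make([]string, 0, len(set))
	for name := range set {
		roots = append(roots, name)
	}
	sort.Strings(roots) // deterministic order for reproducible builds
	return roots
}

func main() {
	syms := []Symbol{
		{Name: "main.helper"},
		{Name: "exported_fn", Exported: true},
		{Name: "c_export_entry", CExport: true},
		{Name: "py_module_init", PyInit: true},
	}
	fmt.Println(rootSet([]string{"main", "main..inittask"}, syms))
	// [c_export_entry exported_fn main main..inittask py_module_init]
}
```

main.helper is not a root; it survives only if the flood reaches it from one of the roots.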
Example (why a normal linker is not enough)
- Struct T has an unused method Hidden, but its Ifn/Tfn are written into type metadata. A normal linker sees real function pointers and treats them as strong dependencies, so it cannot delete Hidden.
- With R_METHODOFF: if flood deems Hidden unreachable, write -1 in the method table. Interface/reflect will not call it, and the body can be dropped.
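The backfill step for this example can be sketched as follows. It assumes a reachability set has already been computed; MethodEntry, backfill, and the numeric slot values are hypothetical, and real method table rows hold offsets or pointers rather than plain integers.

```go
package main

import "fmt"

// Sentinel is the value written into the slots of a pruned method (-1).
const Sentinel int64 = -1

// MethodEntry is a simplified method table row.
type MethodEntry struct {
	Name           string
	Fn             string // symbol of the method body
	Mtyp, Ifn, Tfn int64  // stand-ins for the three R_METHODOFF slots
}

// backfill overwrites the three slots of every method whose body is not
// live and returns the symbols that can now be dropped.
func backfill(table []MethodEntry, live map[string]bool) (pruned []string) {
	for i := range table {
		m := &table[i]
		if live[m.Fn] {
			continue // reachable: keep the real values
		}
		m.Mtyp, m.Ifn, m.Tfn = Sentinel, Sentinel, Sentinel
		pruned = append(pruned, m.Fn)
	}
	return pruned
}

func main() {
	table := []MethodEntry{
		{Name: "Load", Fn: "T.Load", Mtyp: 0x10, Ifn: 0x100, Tfn: 0x140},
		{Name: "Hidden", Fn: "T.Hidden", Mtyp: 0x18, Ifn: 0x180, Tfn: 0x1c0},
	}
	pruned := backfill(table, map[string]bool{"T.Load": true})
	fmt.Println(table[1].Ifn, table[1].Tfn, pruned) // -1 -1 [T.Hidden]
}
```

Once the slots no longer hold the address of T.Hidden, nothing references the body, so an ordinary linker is free to discard it.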
Current progress
- Added optional switch GenRelocLL to collect method table R_METHODOFF and emit @__llgo_relocs (only when records exist and the switch is on).
- Default is still to not emit marks, to avoid differences in existing use cases.
- Next work: complete interface/reflect/generic mark collection and flood/backfill logic.
Next steps (suggested)
- Extend SSA collection: interface conversions/calls, reflect MethodByName, generic interface calls should write corresponding marks.
- Implement a conservative flood: first handle UsedInIface + method table sentinels, verify size wins.
- Then refine reflection retention strategy and R_USETYPE, gradually align with Go deadcode behavior.
Tradeoff: relocation info (reuse existing .o vs extra IR marks)
- Normal function/data references: the compiler already emits standard relocations in .o/.bc (e.g., R_ADDR/R_CALL), and the linker can build A->B edges from those; no need to duplicate ordinary references in IR.
- Go-specific semantics: standard relocations cannot express R_USEIFACE/R_USEIFACEMETHOD/R_USENAMEDMETHOD/R_METHODOFF marks, so IR/BC needs extra records (e.g., @__llgo_relocs) and the merged module uses them for reachability decisions and sentinel backfill.
- To reuse "all reference info," you must read the .o (or LLVM's equivalent); merging BC alone is not a substitute for reading standard relocations. The value of BC merge is to unify custom marks and method tables in a global view.
- If you decide to use IR/BC pass as the only entry (for both source and package reuse), avoid parsing Go obj relocation:
  - It creates "dual systems" (Go obj reloc + IR marks) and increases inconsistency/maintenance cost.
  - LLVM IR already expresses ordinary call/reference edges, and @__llgo_relocs covers Go semantic edges for a unified analysis.
- For LLGo package reuse: compiled .o may be reused by other projects/links, so Go-semantic marks should be stored in the .o as well (R_USEIFACE/R_USEIFACEMETHOD/R_USENAMEDMETHOD/R_METHODOFF), so link-time pruning/backfill still works when not doing a local IR merge.
- Current issue: LLGo's final output uses LLVM lld, which does not understand Go-specific marks/sentinels. If not handled beforehand, reachability pruning and method table backfill will not happen. A pre-link step (IR merge/custom pass) or custom link step must handle Go semantic pruning and sentinel writing before lld does normal layout.
- Open issue: if lld does the final write, how do we modify metadata based on reachability (e.g., write -1 for unreachable R_METHODOFF)? A custom pass or pre-link step must modify constant initializers/relocations; otherwise lld only writes real addresses and cannot write sentinels.
- Further concern: global metadata is spread across package .o files and may include third-party .o/.a. If you only prune at local IR level, you cannot delete existing metadata in .o; cross-package dependencies (A depends on B, C; A prunes C but B actually references C) can break. To prune reliably, you must handle all input .o marks and rewrite their metadata at final link time; otherwise external .o are unaffected.
Two possible implementation paths (current conclusion)
- Pre-link processing (feasible but limited): merge bitcode or read .o, run a custom tool/LLVM pass for reachability and metadata rewriting, then output a new .o for lld. Limitation: existing external .o/.a metadata cannot be rewritten at this stage (unless all inputs are parsed and rewritten), so reuse scenarios are still limited.
- Custom lld (high cost): modify/extend lld to recognize Go semantic marks and write sentinels/prune metadata at write-out time. Cost is high and outside normal usage.
Supplement: ship bitcode with .o, merge and run pass, then generate final .o
- For self-produced .o, embed or sidecar bitcode (similar to "fat" object/ThinLTO). Before final link, extract all available bitcode, merge into one Module, run a custom pass (reachability + sentinel backfill), then emit a new .o for lld.
- External .o/.a without bitcode are treated as black boxes and not pruned; such inputs should not carry Go-specific marks anyway when reused by the C ecosystem.
- Value: enable Go semantic pruning for "our" code at global scope without requiring lld to understand the marks; limitation: third-party pure .o still cannot be rewritten.
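The first step of this flow, separating inputs that carry bitcode from black-box inputs, can be sketched as below. Input and partition are hypothetical names, and the sketch does not show how bitcode would be extracted from an object file; it only illustrates the split between what gets merged and re-emitted and what goes to lld unchanged.

```go
package main

import "fmt"

// Input is a hypothetical description of one link input.
type Input struct {
	Path    string
	Bitcode []byte // embedded or sidecar bitcode; empty when absent
}

// partition splits inputs into those that can be merged, pruned and
// re-emitted, and opaque ones that are passed to the linker untouched.
func partition(inputs []Input) (mergeable, opaque []Input) {
	for _, in := range inputs {
		if len(in.Bitcode) > 0 {
			mergeable = append(mergeable, in)
		} else {
			opaque = append(opaque, in)
		}
	}
	return mergeable, opaque
}

func main() {
	inputs := []Input{
		{Path: "main.o", Bitcode: []byte("BC")},
		{Path: "pkg.o", Bitcode: []byte("BC")},
		{Path: "third_party.a"}, // no bitcode: treated as a black box
	}
	mergeable, opaque := partition(inputs)
	fmt.Println(len(mergeable), len(opaque)) // 2 1
}
```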