Skip to content

Commit 785b68d

Browse files
pyricauclaude
andcommitted
Add LinkEvalDominatorTree: exact Lengauer-Tarjan dominator tree
## Background DominatorTree builds an approximate dominator tree incrementally during BFS, using Lowest Common Ancestor (LCA) to update immediate dominators when a node is reached via multiple paths. This approximation breaks when a cross-edge (to an already-visited node) is processed while the parent's dominator is still stale — the child's dominator is left too specific, causing over-attribution of retained sizes. A convergence loop (re-running LCA over all cross-edges until stable) was explored as a fix, but benchmarks showed it is unsuitable for production: on a 25 MB heap dump it required 781 iterations and added ~62 seconds on top of a 1.5-second analysis (~40× overhead). The cost scales with the depth of the longest stale-dominator propagation chain and the number of cross-edges, making it unpredictable in general. ## Solution New class LinkEvalDominatorTree implements the exact Lengauer-Tarjan algorithm with link-eval (union-find path compression), adapted from Android Studio's perflib/.../LinkEvalDominators.kt (Apache 2.0, http://adambuchsbaum.com/papers/dom-toplas.pdf). The result is always exact regardless of graph topology. The algorithm runs in 4 phases: 1. Iterative DFS — assigns depth-first numbers (DFNs), records each node's DFS-tree parent, and accumulates all edges (tree + cross) in a flat buffer. All GC roots are children of a virtual root at DFN 0. Children are pushed to the stack unconditionally; duplicates are detected at pop time (DFN already assigned), deferring the objectIndex lookup to avoid an extra binary search per edge. 2. CSR predecessor list — two-pass conversion of the flat edge buffer into a Compressed Sparse Row structure (offsets + packed DFNs). Only this structure scales with edge count; all other arrays are O(nodes). 3. Lengauer-Tarjan Steps 2+3+4 — reverse-DFS-order computation of semi-dominators (Step 2) and tentative immediate dominators (Step 3) via intrusive singly-linked bucket lists; forward pass (Step 4) resolves deferred assignments. The link-eval compress function is iterative (not recursive) to avoid stack overflow on deep heap graphs. 4. Populate retained sizes — maps DFN results back to object IDs, populates a DominatorTree instance, and delegates to DominatorTree.buildFullDominatorTree. ## Memory design All large arrays use VarIntArray / VarLongArray (ByteArray-backed, variable bytes per entry) instead of IntArray / LongArray. For heaps with up to ~16 M objects bytesPerDfn = 3, saving 25% vs plain IntArray. The INVALID sentinel is the all-0xFF bit pattern, always > any valid DFN. HeapObject.objectIndex (a dense 0-based Int) is used as the node key in Phase 1, replacing the IdentityHashMap from Android Studio's generic implementation and avoiding the ~20 MB overhead of a LongLongScatterMap. Memory on a 193 MB heap dump (~983 K reachable objects, 2.5 M edges): ~48 MB peak during CSR construction (edge buffer + CSR arrays alive) ~43 MB during the L-T algorithm ~13 MB during retained-size computation (doms[] + id maps remain alongside DominatorTree.dominated; freed after buildFullDominatorTree) For a 50 MB dump (~350 K objects): ~16 MB peak. ## Changes - Add LinkEvalDominatorTree.kt with full L-T implementation - Add LinkEvalDominatorTreeTest: diamond-graph test proving correctness where DominatorTree gives the wrong answer (a.retainedSize=10 not 20) - DominatorTree: strip the convergence loop API (collectCrossEdges param, crossEdges field, runConvergenceLoop, pruneSettledCrossEdges) and update KDoc to document the approximation bug and point here - DominatorTreeTest: remove convergence loop tests; keep the regression test documenting the stale-dominator bug, and refine the test DSL - HprofInMemoryIndex.objectAtIndex: fix off-by-one (index > 0 → index >= 0) so the class object at index 0 resolves correctly - Update API dump Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
1 parent 02d0d8b commit 785b68d

File tree

6 files changed

+785
-96
lines changed

6 files changed

+785
-96
lines changed

shark/shark-graph/src/main/java/shark/internal/HprofInMemoryIndex.kt

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -210,7 +210,7 @@ internal class HprofInMemoryIndex private constructor(
210210
}
211211

212212
fun objectAtIndex(index: Int): LongObjectPair<IndexedObject> {
213-
require(index > 0)
213+
require(index >= 0)
214214
if (index < classIndex.size) {
215215
val objectId = classIndex.keyAt(index)
216216
val array = classIndex.getAtIndex(index)

shark/shark/api/shark.api

Lines changed: 5 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -583,6 +583,11 @@ public final class shark/LibraryLeakReferenceMatcher : shark/ReferenceMatcher {
583583
public fun toString ()Ljava/lang/String;
584584
}
585585

586+
public final class shark/LinkEvalDominatorTree {
587+
public fun <init> (Lshark/HeapGraph;Lshark/ReferenceReader$Factory;Lshark/GcRootProvider;)V
588+
public final fun compute (Lshark/DominatorTree$ObjectSizeCalculator;)Ljava/util/Map;
589+
}
590+
586591
public final class shark/MatchingGcRootProvider : shark/GcRootProvider {
587592
public fun <init> (Ljava/util/List;)V
588593
public fun provideGcRoots (Lshark/HeapGraph;)Lkotlin/sequences/Sequence;

shark/shark/src/main/java/shark/DominatorTree.kt

Lines changed: 37 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -14,7 +14,25 @@ import shark.internal.packedWith
1414
import shark.internal.unpackAsFirstInt
1515
import shark.internal.unpackAsSecondInt
1616

17-
class DominatorTree(expectedElements: Int = 4) {
17+
/**
18+
* Builds a dominator tree incrementally during a BFS traversal of the heap graph, using a
19+
* Lowest Common Ancestor (LCA) approach to update immediate dominators as new parents are
20+
* discovered.
21+
*
22+
* **Known limitation**: this is an approximation, not an exact dominator algorithm. It can
23+
* produce incorrect immediate dominators when a cross-edge (to an already-visited node) is
24+
* processed using a stale parent dominator, and that parent's dominator is later raised by
25+
* another cross-edge after the first cross-edge's children have already been settled. The child's
26+
* dominator is then left too specific (too far from root), so its retained size is
27+
* under-attributed — some objects that it truly retains exclusively are instead attributed to a
28+
* higher ancestor. See [updateDominated] for a concrete example and [DominatorTreeTest] for a
29+
* test documenting this behavior.
30+
*
31+
* For an exact dominator tree, use [LinkEvalDominatorTree] instead.
32+
*/
33+
class DominatorTree(
34+
expectedElements: Int = 4
35+
) {
1836

1937
fun interface ObjectSizeCalculator {
2038
fun computeSize(objectId: Long): Int
@@ -56,6 +74,24 @@ class DominatorTree(expectedElements: Int = 4) {
5674
* first search, so when objectId already has a known dominator then its dominator path is
5775
* shorter than the dominator path of [parentObjectId].
5876
*
77+
* **Known limitation**: the BFS assumption breaks down when a cross-edge (to an already-visited
78+
* node) is processed with a stale parent dominator. Consider:
79+
* ```
80+
* root → A → B → C (tree edges; B discovered first via A)
81+
* root → A → C (C also directly reachable from A)
82+
* root → D → E → B (E→B is a cross-edge at same BFS level as B)
83+
* ```
84+
* BFS dequeues B before E (both at level 2). When B is dequeued, B→C is processed as a
85+
* cross-edge with dom(B) = A (stale — E→B hasn't been processed yet). LCA(dom(C)=A, B) with
86+
* dom(B)=A returns A, so dom(C) stays A. Later, when E is dequeued, E→B is processed and
87+
* correctly raises dom(B) to root (root→D→E→B bypasses A). But dom(C) is never revisited:
88+
* it remains A, even though the path root→D→E→B→C bypasses A, making root the true idom(C).
89+
*
90+
* The consequence is that dom(C) is left too specific (too far from root), so C's retained
91+
* size is incorrectly attributed to A instead of root. A's retained size is over-reported.
92+
*
93+
* For an exact dominator tree that avoids this issue entirely, use [LinkEvalDominatorTree].
94+
*
5995
* @return true if [objectId] already had a known dominator, false otherwise.
6096
*/
6197
fun updateDominated(

0 commit comments

Comments
 (0)