feat: rank impact radius by weighted best-path score #606

Open: SHudici wants to merge 1 commit into tirth8205:main from SHudici:feat/weighted-impact-radius

SHudici commented Jul 4, 2026

What

get_impact_radius treats every edge and every depth equally: a depth-2 IMPORTS_FROM hop counts the same as a depth-1 CALLS hop, and when the reachable set exceeds max_nodes, LIMIT truncates in arbitrary scan order. On a real ~4.8k-node / 39k-edge graph, a single-module change returned 500 flat nodes whose top "key entities" were alphabetically-first shell scripts — the most relevant nodes had no priority over the least.

This PR ranks the blast radius by a weighted best-path score so truncation keeps the highest-signal nodes and consumers can see what is most at risk, not just everything within N hops.

How

Scoring. Each reached node gets score = max over all paths from a seed of Π over hops of (edge_weight × decay), with decay = 0.6 per hop. Edge-kind weights live in constants.IMPACT_EDGE_WEIGHTS (CALLS 1.0, INHERITS/OVERRIDES/IMPLEMENTS 0.9, TESTED_BY 0.7, REFERENCES/DEPENDS_ON 0.6, IMPORTS_FROM 0.5, CONTAINS 0.3; unknown kinds 0.5). Paths whose score falls below IMPACT_SCORE_FLOOR (0.05) stop expanding. Decay and floor are env-overridable (CRG_IMPACT_DEPTH_DECAY, CRG_IMPACT_SCORE_FLOOR). These weights follow the precedent of communities.EDGE_WEIGHTS but model review-risk propagation rather than clustering affinity, so the values intentionally differ.
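To make the arithmetic concrete, here is a minimal sketch of the score of a single path, using the weights, decay, and floor quoted above. The helper name path_score and the constant names other than IMPACT_EDGE_WEIGHTS are hypothetical and are not taken from the diff; returning None for a path that drops below the floor is likewise an illustrative choice.

```python
# Values quoted in the PR description; names other than
# IMPACT_EDGE_WEIGHTS are assumptions made for this sketch.
IMPACT_EDGE_WEIGHTS = {
    "CALLS": 1.0,
    "INHERITS": 0.9, "OVERRIDES": 0.9, "IMPLEMENTS": 0.9,
    "TESTED_BY": 0.7,
    "REFERENCES": 0.6, "DEPENDS_ON": 0.6,
    "IMPORTS_FROM": 0.5,
    "CONTAINS": 0.3,
}
UNKNOWN_KIND_WEIGHT = 0.5
DEPTH_DECAY = 0.6
SCORE_FLOOR = 0.05


def path_score(kinds):
    """Score one path given the edge kinds along it (hypothetical helper).

    Each hop multiplies the running score by weight * decay. Returns None
    as soon as the score drops below the floor, i.e. the path stops
    expanding at that hop.
    """
    score = 1.0
    for kind in kinds:
        weight = IMPACT_EDGE_WEIGHTS.get(kind, UNKNOWN_KIND_WEIGHT)
        score *= weight * DEPTH_DECAY
        if score < SCORE_FLOOR:
            return None
    return score


one_hop_calls = path_score(["CALLS"])               # 1.0 * 0.6
two_hop_calls = path_score(["CALLS", "CALLS"])      # about 0.36
one_hop_contains = path_score(["CONTAINS"])         # about 0.18
three_imports = path_score(["IMPORTS_FROM"] * 3)    # 0.3, 0.09, then below floor
```

Under these numbers a two-hop CALLS chain (about 0.36) outranks a one-hop CONTAINS edge (about 0.18), which is why the score has to be a maximum over paths rather than the score of the first path found.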

SQL engine. The recursive CTE carries score, joins a small _impact_weights temp table per hop, and applies the floor in the recursion guard. SQLite cannot aggregate inside the recursive term, so a node may be revisited once per distinct path score; the depth guard + floor bound the expansion and the outer GROUP BY keeps MAX(score). The final select is ORDER BY impact_score DESC before LIMIT, so truncation is best-first. Two related fixes fell out of measuring this on the real graph:

  • Ghost endpoints no longer eat LIMIT slots. Edge endpoints with no nodes row stay in the recursion as traversal bridges but are excluded from the final selection. Before, they consumed up to ~8% of the LIMIT and were then silently dropped by _batch_get_nodes.
  • truncated is now honest under LIMIT saturation. Previously the flag compared post-LIMIT counts and could never fire on the SQL path; it now reports saturation (>= max_nodes, i.e. "there may be more beyond the cutoff").
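The shape of such a query can be sketched with Python's sqlite3 module. This is an illustration under stated assumptions, not the query from the diff: the nodes(qualified_name) and edges(source, target, kind) schema, the column names of the temp tables, the source-to-target traversal direction, and the function name impact_radius are all hypothetical. Only the ideas (score carried through the recursion, weights joined per hop, floor in the guard, MAX(score) in the outer GROUP BY, ghost endpoints filtered at the end, saturation-style truncated flag) come from the description above.

```python
import sqlite3

# Subset of the weights from the PR description; unknown kinds fall back to 0.5.
WEIGHTS = {"CALLS": 1.0, "IMPORTS_FROM": 0.5, "CONTAINS": 0.3}

IMPACT_SQL = """
WITH RECURSIVE reach(node, depth, score) AS (
    SELECT name, 0, 1.0 FROM _impact_seeds
    UNION ALL
    -- No aggregation is possible here, so a node can appear once per path.
    SELECT e.target, r.depth + 1,
           r.score * COALESCE(w.weight, 0.5) * :decay
    FROM reach AS r
    JOIN edges AS e ON e.source = r.node
    LEFT JOIN _impact_weights AS w ON w.kind = e.kind
    WHERE r.depth < :max_depth
      AND r.score * COALESCE(w.weight, 0.5) * :decay >= :floor
)
SELECT r.node, MAX(r.score) AS impact_score
FROM reach AS r
JOIN nodes AS n ON n.qualified_name = r.node  -- drops ghost endpoints
WHERE r.node NOT IN (SELECT name FROM _impact_seeds)
GROUP BY r.node
ORDER BY impact_score DESC, r.node
LIMIT :max_nodes
"""


def impact_radius(conn, seeds, max_depth=2, max_nodes=500, decay=0.6, floor=0.05):
    """Return ([(node, score), ...] best-first, truncated) for the given seeds."""
    conn.execute("CREATE TEMP TABLE IF NOT EXISTS _impact_seeds (name TEXT PRIMARY KEY)")
    conn.execute("DELETE FROM _impact_seeds")
    conn.executemany("INSERT INTO _impact_seeds VALUES (?)", [(s,) for s in seeds])
    conn.execute(
        "CREATE TEMP TABLE IF NOT EXISTS _impact_weights (kind TEXT PRIMARY KEY, weight REAL)"
    )
    conn.execute("DELETE FROM _impact_weights")
    conn.executemany("INSERT INTO _impact_weights VALUES (?, ?)", list(WEIGHTS.items()))
    params = {"decay": decay, "floor": floor, "max_depth": max_depth, "max_nodes": max_nodes}
    rows = conn.execute(IMPACT_SQL, params).fetchall()
    # Saturation semantics: a full result means more may exist beyond the cutoff.
    return rows, len(rows) >= max_nodes


def demo_conn():
    """Tiny in-memory graph; 'ghost' is an edge endpoint with no nodes row."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE nodes (qualified_name TEXT PRIMARY KEY)")
    conn.execute("CREATE TABLE edges (source TEXT, target TEXT, kind TEXT)")
    conn.executemany("INSERT INTO nodes VALUES (?)", [("a",), ("b",), ("c",), ("d",)])
    conn.executemany(
        "INSERT INTO edges VALUES (?, ?, ?)",
        [("a", "b", "CALLS"), ("b", "c", "CALLS"), ("a", "c", "CONTAINS"),
         ("a", "ghost", "CALLS"), ("ghost", "d", "CALLS")],
    )
    return conn


rows, truncated = impact_radius(demo_conn(), ["a"])
```

In the demo graph, ghost still bridges the traversal from a to d but never appears in the result, and c keeps the score of its stronger two-hop CALLS path rather than the weaker direct CONTAINS edge.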

NetworkX engine (legacy, CRG_BFS_ENGINE=networkx). Implements the same scoring with better-path revisits — a node re-enters the frontier when a better-scoring path reaches it, because a deep CALLS chain can outscore a shallow CONTAINS hop. _build_networkx_graph now keeps the strongest kind when collapsing parallel edges between the same pair (DiGraph holds one edge per pair; previously "last row wins" arbitrarily). The only other consumer of that graph, betweenness centrality in analysis.py, ignores kind entirely. On the production graph both engines return identical node sets and identical scores (max diff 0.0).
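The two ideas in this paragraph, keeping the strongest kind when parallel edges collapse and letting a node re-enter the frontier on a better path, can be sketched without NetworkX. The function names, the (source, target, kind) tuple layout, and the source-to-target traversal direction are assumptions for illustration; the actual engine operates on a DiGraph built by _build_networkx_graph.

```python
# Subset of the weights from the PR description; unknown kinds use 0.5.
WEIGHTS = {"CALLS": 1.0, "IMPORTS_FROM": 0.5, "CONTAINS": 0.3}
UNKNOWN_KIND_WEIGHT = 0.5


def collapse_parallel_edges(edges):
    """Keep one (kind, weight) per (source, target): the strongest, not the last."""
    strongest = {}
    for src, dst, kind in edges:
        weight = WEIGHTS.get(kind, UNKNOWN_KIND_WEIGHT)
        if weight > strongest.get((src, dst), (None, -1.0))[1]:
            strongest[(src, dst)] = (kind, weight)
    return strongest


def best_path_scores(edges, seeds, max_depth=2, decay=0.6, floor=0.05):
    """Level-by-level traversal where a node re-enters the frontier on a better path."""
    adjacency = {}
    for (src, dst), (_kind, weight) in collapse_parallel_edges(edges).items():
        adjacency.setdefault(src, {})[dst] = weight

    best = {seed: 1.0 for seed in seeds}
    frontier = dict(best)
    for _ in range(max_depth):
        next_frontier = {}
        for node, score in frontier.items():
            for target, weight in adjacency.get(node, {}).items():
                candidate = score * weight * decay
                if candidate < floor:
                    continue  # path stops expanding below the floor
                if candidate > best.get(target, 0.0):
                    best[target] = candidate          # better path found
                    next_frontier[target] = candidate  # so revisit the node
        frontier = next_frontier
    return {node: score for node, score in best.items() if node not in seeds}


demo_edges = [
    ("a", "b", "CONTAINS"), ("a", "b", "CALLS"),  # parallel edges, CALLS should win
    ("a", "c", "CONTAINS"), ("b", "c", "CALLS"),  # deeper CALLS path beats shallow CONTAINS
]
demo_scores = best_path_scores(demo_edges, {"a"})
```

Because a lower-scoring arrival at a greater depth can never lead to a better continuation than a higher-scoring arrival at a shallower depth, only improved nodes need to be re-expanded in this sketch.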

Surface (additive). The store result gains impact_scores (qualified_name → best-path score); impacted_nodes come back best-first; the get_impact_radius tool attaches impact_score to each node dict, and minimal-detail key_entities become the top-scored nodes instead of the first five in scan order. No existing keys change shape.
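A rough sketch of how a consumer might see the additive surface, assuming node dicts carry a qualified_name key and that key_entities holds five names; the function name attach_scores and both of those details are hypothetical and not confirmed by the diff.

```python
def attach_scores(impacted_nodes, impact_scores, top_n=5):
    """Attach impact_score to each node dict and pick the top-scored key entities."""
    ranked = sorted(
        (
            {**node, "impact_score": impact_scores.get(node["qualified_name"], 0.0)}
            for node in impacted_nodes
        ),
        key=lambda node: node["impact_score"],
        reverse=True,
    )
    key_entities = [node["qualified_name"] for node in ranked[:top_n]]
    return ranked, key_entities


nodes = [
    {"qualified_name": "hooks/a.sh", "kind": "File"},
    {"qualified_name": "pkg.mod.run", "kind": "Function"},
]
scores = {"pkg.mod.run": 0.6, "hooks/a.sh": 0.18}
ranked, key_entities = attach_scores(nodes, scores)
```

Existing keys on each node dict are copied unchanged; only impact_score is added, which matches the "no existing keys change shape" claim.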

Measured on a real graph (4,757 nodes / 39,386 edges)

| | before | after |
|---|---|---|
| top of results | alphabetical .claude/hooks/*.sh | direct CALLS neighbors (score 0.6) |
| shell scripts in top 20 | 5 of 5 key_entities | 0 |
| SQL query time (depth 2, 500 nodes) | | ~50 ms |
| NetworkX time | | ~140 ms |
| engine parity | sets only | sets and scores identical |

Testing

  • New TestWeightedImpactScoring (9 tests): kind weighting, per-hop decay, best-first truncation, deeper-strong-path-beats-shallow-weak-path, floor cutoff, SQL↔NetworkX score parity, parallel-edge-kind collapse regression, unknown-kind default, empty-input shape.
  • New TestImpactRadiusScoring (tool layer, 2 tests): impact_score attached and sorted best-first; minimal key_entities lead with the top-scored node.
  • Existing test_sql_matches_networkx, truncation, and empty-input tests pass unchanged.
  • Full suite on Windows: 1,339 passed; the 6 failures + 228 teardown errors pre-exist on upstream/main (verified on a clean checkout) and are addressed by "fix: escape backslashes, quotes, and control chars in daemon TOML serialization" (#595) and "fix: request SYNCHRONIZE access in the Windows PID liveness check" (#597).

Notes / non-goals

  • detect_changes and get_review_context could consume impact_score for better review priorities — deliberately left for a follow-up so this PR stays reviewable.
  • Weight values are initial calibrations pinned by tests; happy to tune if you have preferences.
  • Known trade-off: because SQLite cannot aggregate inside the recursive term, a node can be revisited once per distinct path score before the outer GROUP BY collapses it. The depth guard + score floor bound this; on the 39k-edge measurement graph the full query is ~50 ms. A pathological dense cyclic subgraph with a raised CRG_MAX_IMPACT_DEPTH would pay more — the floor is the safety valve.
  • Both engines now report truncated with the same saturation semantics (>= max_nodes, meaning "the result is full; more may exist"). At the exact boundary this can report True with nothing actually dropped — chosen over the previous behavior where the SQL flag could never fire at all.
  • The _impact_seeds/_impact_weights TEMP tables are per-connection state; the tool layer opens a fresh store per call, matching the pre-existing _impact_seeds pattern.

get_impact_radius treated every edge and depth equally: a depth-2
IMPORTS_FROM hop counted the same as a depth-1 CALLS hop, and when the
result exceeded max_nodes, truncation kept arbitrary scan-order nodes.
On a ~4.8k-node production graph a single-module change returned 500
flat nodes whose "key entities" were alphabetical shell scripts.

Each reached node now gets a score: the best path from any seed, where
every hop multiplies by an edge-kind weight (IMPACT_EDGE_WEIGHTS: CALLS
1.0 down to CONTAINS 0.3, unknown kinds 0.5) and a per-hop decay (0.6).
Paths whose score falls below IMPACT_SCORE_FLOOR (0.05) stop expanding.
impacted_nodes come back ordered best-first, truncation keeps the
highest-signal nodes, and a new additive impact_scores map (plus
impact_score on each node dict at the tool layer) exposes the ranking.

SQLite cannot aggregate inside a recursive CTE, so the recursion may
revisit a node once per distinct path score; the depth guard plus the
score floor bound the expansion and the outer GROUP BY keeps MAX(score).
The legacy NetworkX engine implements the same scoring (with
better-path revisits, since a deep CALLS chain can outscore a shallow
CONTAINS hop) and stays set- and score-aligned with the SQL engine.
Weights follow the precedent of communities.EDGE_WEIGHTS but model
review-risk propagation rather than clustering affinity, so the values
intentionally differ.