Fix #563: prevent rationale-node leakage and preserve calls direction by opnsrcntrbtr · Pull Request #576 · safishamsi/graphify

opnsrcntrbtr · 2026-04-27T18:20:23Z

Fix #563: prevent rationale-node leakage and preserve calls direction

Fixes #563.

Explain it like I'm 6 (and I just learned to code)

Imagine your code is a city, and graphify draws a map of it. Every function is a house, and an arrow from house A to house B means "A talks to B."

The map had two problems.

Problem 1 — The sticky-note houses.

Whenever you write a comment in your code (like """this function checks phone numbers"""), graphify also draws a little sticky-note house on the map for that comment. That part is fine.

But there was a bug: the program that draws arrows treated those sticky-notes like real houses. So if your file had a sticky-note saying "phone numbers must start with +1" and your file also said from queue import SQLiteQueue at the top, the map drew an arrow:

📝 sticky-note about phone numbers ➡️ 🏠 SQLiteQueue

That arrow is silly — a sticky-note can't "use" a house. But the bug drew thousands of these arrows. And then graphify would say "look how popular SQLiteQueue is, lots of things point to it!" — but most of the things pointing to it were sticky-notes, not real houses.

Fix: When picking which houses to draw arrows from, just check is this a sticky-note? if yes, skip it. Three lines of code, three places.

Problem 2 — The arrows pointed the wrong way.

When function A calls function B, the arrow should go A ➡️ B. graphify's drawing code knew this and got it right.

But then graphify uses a library called NetworkX to store the map, and the version of NetworkX storage being used is "two-way streets" — it doesn't remember which end of the arrow is the head. So when graphify saved the map to a file, the arrows came out pointing in random directions. Sometimes A ➡️ B, sometimes B ➡️ A.

The code already knew this would happen — it secretly wrote down the real direction on each arrow as little notes called _src (real start) and _tgt (real end). But when it was time to save the map, it forgot to read its own notes.

Fix: Before saving the map, read the secret notes and put the arrows back the right way. Then erase the notes (they were only for graphify's own use, no need to ship them in the file).

The result:

In a tiny test project I built (3 files, one class, a few functions calling each other):

The map went from 31 arrows to 24 arrows — the 7 silly sticky-note arrows are gone.
SQLiteQueue went from "12 arrows touching it" to "5 arrows touching it" — the rest were lies.
Every "A calls B" arrow now actually points from A to B.

And I added 5 new tests that will fail loudly if anyone accidentally re-introduces either bug later.

Summary

Two AST-extractor bugs were inflating god-node centrality and silently corrupting graph.json. Both are observable in the AST-only path (graphify update, no LLM required), so the regression test runs in <1s with no API key.

	Before	After
Edges in 3-file repro	31	24
Edges incident on `SQLiteQueue`	12	5
Phantom `rationale --uses--> X` edges	7	0
Inverted `calls` edges in repro	1	0
`_src` / `_tgt` keys leaking into `graph.json`	yes	no
Existing test suite	441 passed	446 passed (+5 new)

What was broken

Bug 1 — rationale leakage into cross-file resolvers

graphify/extract.py::_resolve_cross_file_imports indexed any node whose label didn't end in ) or .py as an "importable entity". Rationale nodes have IDs like <stem>_rationale_<line> and labels that are free-form docstring text, so they sailed through both filters and ended up in:

stem_to_entities — making them resolvable as "imported names"
local_classes — making them act as "importing classes" in the destination file

The same leak existed in extract()'s global_label_to_nid for cross-file call resolution, where a rationale label could collide with a callee name and produce a spurious calls edge.

Bug 2 — `calls` and `rationale_for` direction lost at export

graphify/export.py::to_json calls nx.json_graph.node_link_data() on a NetworkX undirected Graph. NetworkX may store edge endpoints in either canonical order, which silently flipped directional relations (calls, rationale_for) on serialization.

graphify/build.py::build_from_json already stashes the true endpoints in _src / _tgt precisely for this case — there's even a comment at line 100 explaining why — but the exporter never consulted them.

The reporter on issue #563 attributed this to the AST visitor, but the AST visitor is correct (every add_edge(func_nid, target_nid, "calls", ...) site at lines 1733, 2042, 2219, 2390, 2554, 2984, 3158 emits caller→callee). The corruption happens later, at JSON export.

What changed

`graphify/extract.py`

Three additions of node.get("file_type") != "rationale" to the indexing filters:

_resolve_cross_file_imports Pass 1 (line ~2616) — keep rationale nodes out of stem_to_entities
_resolve_cross_file_imports Pass 2 (line ~2628) — keep rationale nodes out of local_classes
extract() cross-file call resolution (line ~3354) — skip rationale nodes when building global_label_to_nid

`graphify/export.py`

In to_json (after node_link_data), restore each link's source / target from the preserved _src / _tgt and pop those internal fields so they don't leak into the published graph.json.

`tests/test_issue_563.py` (new, 240 lines)

Five regression tests:

test_no_rationale_nodes_in_uses_or_calls_edges — rationale nodes must only participate in rationale_for edges.
test_rationale_for_direction_is_rationale_to_parent — pins the direction the AST extractor actually emits, which the export bug was silently reversing.
test_calls_direction_is_caller_to_callee — pins the three exact directions reported as inverted in Two extractor bugs systematically inflate god-node centrality (rationale fragments + calls direction inversion) #563.
test_calls_direction_on_existing_sample_calls_fixture — adds a direction assertion on the in-tree tests/fixtures/sample_calls.py so the existing AST coverage gains direction safety.
test_src_tgt_metadata_not_leaked_into_graph_json — guards the export-side cleanup.

Reproduction (without this patch)

mkdir repro && cd repro

cat > jobq.py <<'PY'
"""Job-queue module."""
class SQLiteQueue:
    """Lightweight SQLite-backed queue."""
    def check_idempotency(self, key):
        """Return existing lead for key, or None."""
        return None
    def enqueue(self, payload):
        """Persist payload to the queue."""
        return True
PY

cat > contact_form.py <<'PY'
"""HTTP router for the contact form endpoint."""
from jobq import SQLiteQueue
q = SQLiteQueue()
def contact_form(payload):
    """Handle a contact form submission."""
    if q.check_idempotency(payload):
        return None
    q.enqueue(payload)
PY

cat > test_dedup.py <<'PY'
"""Tests for dedup."""
from jobq import SQLiteQueue
from contact_form import contact_form
def test_us_phone():
    """US phone normalization."""
def test_uk_phone():
    """UK phone with +44 prefix."""
def test_odoo_tag():
    """Tag write failure logged but doesn't stop chatter note."""
    SQLiteQueue().enqueue({})
    contact_form(None)
PY

graphify update .
python3 -c "
import json
g = json.load(open('graphify-out/graph.json'))
print('rationale-source uses edges:',
      sum(1 for e in g['links'] if 'rationale' in e['source'] and e['relation'] == 'uses'))
print('SQLiteQueue degree:',
      sum(1 for e in g['links']
          if 'sqlitequeue' in e['source'] or 'sqlitequeue' in e['target']))
"

On v0.5.0 (default branch) this reports 7 phantom uses edges and 12 incident on SQLiteQueue. After this patch, both numbers drop to 0 and 5 respectively.

Note on the issue's framing

The reporter in #563 described Bug 2 as an AST visitor inversion (e.g., "the source should be the function/method containing the call site"). The AST visitor is in fact already correct in main — the inversion is at the JSON export layer. I've kept the reporter's example assertions (e.g., contact_form() --calls--> queue.check_idempotency()) as the assertions in the regression test, since those are the user-visible directions they expected to hold and which now do.

The reporter also described Bug 1 as SQLiteQueue --uses--> rationale_node (rationale as target). On reproducing, the actual direction was rationale_node --uses--> SQLiteQueue. The fix collapses the same root cause either way: rationale nodes were participating in cross-file resolution at all.

Out of scope

The skill prompt files (graphify/skill*.md) instruct LLM subagents to use rationale_for already; subagent compliance is a separate, harder problem (the reporter's audit shows 0/350 rationale-targeted edges using rationale_for in their LLM-extracted graph). This PR focuses on the deterministic AST path, where the bugs are unambiguous and the fix is small. A follow-up PR can add a defensive sanitizer over LLM-emitted edges that rewrites uses → rationale_for and drops anything pointing at a node whose file_type == "rationale".

… direction Two AST-extractor bugs were inflating god-node centrality: 1. Rationale leakage in cross-file resolvers _resolve_cross_file_imports indexed any node whose label didn't end in ')' or '.py' as an importable entity, so rationale nodes (whose labels are docstring text) ended up in both stem_to_entities and local_classes. Result: phantom '<file>_rationale_N --uses--> ImportedThing' edges for every imported entity in every file with docstrings. The same leak existed in cross-file call resolution via global_label_to_nid, where rationale labels could collide with callee names. Fix: skip nodes with file_type == 'rationale' in all three index loops. 2. Calls / rationale_for direction lost at export to_json() called nx.json_graph.node_link_data() on a NetworkX undirected Graph, which canonicalizes endpoint order. The build path already stashes _src and _tgt on every edge for exactly this case (with a comment to that effect at build.py:100), but the exporter never consulted them, so 'calls' and 'rationale_for' edges were emitted with arbitrary direction. Fix: in to_json(), restore link['source'] / link['target'] from _src / _tgt, then pop those internal fields before writing graph.json. Repro on a minimal 3-file project (queue.py + contact_form.py + a test file with multi-paragraph docstrings) showed: Before fix: 31 edges, 12 incident on SQLiteQueue, 7 phantom uses-edges from rationale nodes, 1 inverted calls edge. After fix: 24 edges, 5 incident on SQLiteQueue, 0 phantom edges, all calls edges go caller -> callee. Adds tests/test_issue_563.py with 5 regression tests covering both bugs plus a direction-pin on the existing sample_calls.py fixture. Full suite passes (446/446).

Copilot

Pull request overview

Fixes issue #563 by preventing rationale nodes (docstring/comment fragments) from being treated as import/call participants, and by restoring directional edge endpoints during JSON export so calls/rationale_for directions are preserved.

Changes:

Exclude file_type == "rationale" nodes from cross-file import indexing and cross-file call label resolution.
In to_json, restore link["source"]/link["target"] from preserved _src/_tgt and strip those internal fields from graph.json.
Add a dedicated regression test suite covering rationale leakage, calls direction, rationale_for direction, and _src/_tgt non-leakage.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.

File	Description
`tests/test_issue_563.py`	Adds regression tests validating rationale-node isolation, edge direction correctness, and `_src/_tgt` cleanup.
`graphify/extract.py`	Filters out rationale nodes from cross-file resolvers to prevent phantom `uses`/`calls` edges.
`graphify/export.py`	Restores directional endpoints from `_src/_tgt` during JSON serialization and removes internal metadata keys.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copilot · 2026-04-27T18:27:11Z

+        e for e in repro_graph["links"]
+        if ("rationale" in e["source"] or "rationale" in e["target"])
+        and e["relation"] != "rationale_for"
+    ]


The tests detect rationale nodes by checking whether the edge endpoint IDs contain the substring "rationale". This couples the regression tests to the current ID-generation scheme and can produce false negatives (if rationale IDs ever change) or false positives (if a legitimate entity ID contains "rationale"). Consider building an id→file_type map from repro_graph["nodes"] and using that to identify rationale endpoints instead.

@copilot apply changes based on this feedback

Copilot · 2026-04-27T18:27:12Z

+        # Restore original edge direction. Undirected NetworkX storage may
+        # canonicalize endpoint order, flipping `calls` and other directional
+        # edges in graph.json. The build path stashes the true endpoints in
+        # _src/_tgt for exactly this purpose (#563).
+        true_src = link.pop("_src", None)
+        true_tgt = link.pop("_tgt", None)
+        if true_src is not None and true_tgt is not None:
+            link["source"] = true_src
+            link["target"] = true_tgt


Edge-direction is restored for graph.json here, but graph.html generation still iterates for u, v, data in G.edges(data=True) and uses from=u, to=v (export.py around the to_html edge loop). On an undirected Graph that can still flip directional relations in the HTML visualization (since arrows are enabled). Consider using the preserved _src/_tgt there as well (or disabling arrows for undirected relations) so calls/rationale_for direction is consistent across exports.

@copilot apply changes based on this feedback

opnsrcntrbtr · 2026-04-27T18:30:39Z

@copilot apply changes based on the comments in this thread

Copilot

Pull request overview

Copilot reviewed 3 out of 3 changed files in this pull request and generated 4 comments.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copilot · 2026-04-27T18:37:17Z

+    bad = [
+        e for e in repro_graph["links"]
+        if ("rationale" in e["source"] or "rationale" in e["target"])
+        and e["relation"] != "rationale_for"
+    ]


The test identifies rationale nodes by checking whether endpoint IDs contain the substring "rationale". This couples the regression to the current ID scheme and can miss/false-trigger if IDs change. Prefer deriving an id→file_type map from repro_graph["nodes"] and checking endpoints’ file_type == "rationale" instead.

Copilot · 2026-04-27T18:37:17Z

+    rats = [e for e in repro_graph["links"] if e["relation"] == "rationale_for"]
+    assert rats, "expected some rationale_for edges in the repro"
+    bad = [e for e in rats if "rationale" not in e["source"]]
+    assert bad == [], (


Same issue here: determining rationale endpoints via "rationale" substring in the ID is brittle. Use the node metadata (file_type == "rationale") from repro_graph["nodes"] to assert rationale_for direction instead of relying on naming conventions.

Copilot · 2026-04-27T18:37:17Z

+            # Only index real classes/functions (not file nodes, not method stubs,
+            # and not rationale nodes whose labels are free-form docstring text — #563)
+            if (
+                label
+                and not label.endswith((")", ".py"))
+                and "_" not in label[:1]
+                and node.get("file_type") != "rationale"
+            ):


The comment says this indexes “classes/functions”, but the filter excludes any label ending with ")". In the generic extractor, function labels are formatted as name() (see graphify/extract.py around where nodes are created), so functions will never be indexed here. Either update the comment/docstring to reflect that only class-like nodes are indexed, or normalize labels (e.g., strip trailing "()") so imported functions can be resolved as intended.

Copilot · 2026-04-27T18:37:18Z

+    # contact_form() calls SQLiteQueue.check_idempotency() and SQLiteQueue.enqueue()
+    assert (
+        "contact_form_contact_form",
+        "jobq_sqlitequeue_check_idempotency",
+    ) in calls, f"expected caller->callee edge missing. got: {calls}"
+
+    # The test function calls contact_form()
+    assert (
+        "test_customer_dedup_test_odoo_tag_missing_still_chatter",
+        "contact_form_contact_form",
+    ) in calls, f"test->contact_form edge missing. got: {calls}"
+


This test claims contact_form() calls both check_idempotency() and enqueue(), but it only asserts the presence of the check_idempotency edge. If the enqueue call edge stops being emitted, the test would still pass as long as the inverted edge is also absent. Add an explicit assertion that (contact_form_contact_form -> jobq_sqlitequeue_enqueue) exists to fully pin the reported regression surface from #563.

- tests: identify rationale nodes by file_type metadata, not id substring (decouples regression from current `<stem>_rationale_<line>` naming) - tests: assert contact_form -> SQLiteQueue.enqueue calls edge directly (previously only inversions were guarded; a dropped enqueue edge would have passed silently) - extract.py: clarify cross-file import indexing comment — the filter intentionally indexes only class-level entities (function/method labels end in "()" and are excluded by design)

to_json was patched to consult _src/_tgt before serializing edge endpoints, but to_html still read endpoints directly from G.edges(), so calls and rationale_for arrows in the rendered graph.html still pointed in canonicalized (often inverted) order. Apply the same direction restore in to_html when building vis.js edges. Use data.get (not pop) since G.edges(data=True) yields the live attribute dict and other exporters may run after to_html. Adds test_to_html_preserves_calls_and_rationale_for_direction: parses RAW_EDGES out of the rendered HTML and pins both the caller->callee assertions from safishamsi#563 and the rationale->parent direction. Without the fix, the test reports 11 flipped arrows on the 3-file repro.

Copilot

Pull request overview

Copilot reviewed 3 out of 3 changed files in this pull request and generated 4 comments.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copilot · 2026-04-27T19:17:35Z

+    (root / "jobq.py").write_text(textwrap.dedent('''
+        """Job-queue module — provides SQLiteQueue for idempotent job tracking."""
+


Path.write_text() defaults to the platform’s locale encoding. This fixture string includes a Unicode em dash (—), so on Windows this can write non‑UTF‑8 bytes that tree-sitter (expects UTF‑8) may parse incorrectly. Specify encoding="utf-8" (and ideally newline="\n") for the repro files to keep the test deterministic across platforms.

Copilot · 2026-04-27T19:17:35Z

+    (root / "contact_form.py").write_text(textwrap.dedent('''
+        """HTTP router for the contact form endpoint."""
+        from jobq import SQLiteQueue
+


Same portability issue here: use encoding="utf-8" on this write_text() call (the content includes Unicode characters). Otherwise the file may be written in a locale encoding and break tree-sitter parsing on Windows.

Copilot · 2026-04-27T19:17:36Z

+    (root / "test_customer_dedup.py").write_text(textwrap.dedent('''
+        """Tests for customer dedup — exercises phone normalization and Odoo tagging."""
+        from jobq import SQLiteQueue
+        from contact_form import contact_form


Same issue as above: add encoding="utf-8" to this write_text() call to avoid locale-dependent encodings (especially since the docstring contains an em dash).

Copilot · 2026-04-27T19:17:36Z

+    )
+    graph_path = root / "graphify-out" / "graph.json"
+    assert graph_path.exists(), f"graph.json not produced. stderr: {result.stderr}"
+    return json.loads(graph_path.read_text())


graph_path.read_text() uses the default locale encoding. Since graph.json is written as UTF‑8, this can fail or misdecode on non‑UTF‑8 locales. Read with encoding="utf-8" for cross-platform reliability.

Suggested change

return json.loads(graph_path.read_text())

return json.loads(graph_path.read_text(encoding="utf-8"))

Copilot AI review requested due to automatic review settings April 27, 2026 18:20

Copilot started reviewing on behalf of opnsrcntrbtr April 27, 2026 18:20 View session

Copilot AI reviewed Apr 27, 2026

View reviewed changes

opnsrcntrbtr requested a review from Copilot April 27, 2026 18:32

Copilot started reviewing on behalf of opnsrcntrbtr April 27, 2026 18:32 View session

Copilot AI reviewed Apr 27, 2026

View reviewed changes

opnsrcntrbtr added 2 commits April 28, 2026 00:23

opnsrcntrbtr requested a review from Copilot April 27, 2026 19:14

Copilot started reviewing on behalf of opnsrcntrbtr April 27, 2026 19:14 View session

Copilot AI reviewed Apr 27, 2026

View reviewed changes

		(root / "jobq.py").write_text(textwrap.dedent('''
		"""Job-queue module — provides SQLiteQueue for idempotent job tracking."""

	return json.loads(graph_path.read_text())
	return json.loads(graph_path.read_text(encoding="utf-8"))

Uh oh!

Conversation

opnsrcntrbtr commented Apr 27, 2026

Fix #563: prevent rationale-node leakage and preserve calls direction

Explain it like I'm 6 (and I just learned to code)

Summary

What was broken

Bug 1 — rationale leakage into cross-file resolvers

Bug 2 — calls and rationale_for direction lost at export

What changed

graphify/extract.py

graphify/export.py

tests/test_issue_563.py (new, 240 lines)

Reproduction (without this patch)

Note on the issue's framing

Out of scope

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Copilot AI Apr 27, 2026

Choose a reason for hiding this comment

Uh oh!

opnsrcntrbtr Apr 27, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 27, 2026

Choose a reason for hiding this comment

Uh oh!

opnsrcntrbtr Apr 27, 2026

Choose a reason for hiding this comment

Uh oh!

opnsrcntrbtr commented Apr 27, 2026

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Uh oh!

Copilot AI Apr 27, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 27, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 27, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 27, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Uh oh!

Copilot AI Apr 27, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 27, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 27, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 27, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Bug 2 — `calls` and `rationale_for` direction lost at export

`graphify/extract.py`

`graphify/export.py`

`tests/test_issue_563.py` (new, 240 lines)