Commit e1424f6

fix: merge C# stub nodes with real cross-language definitions
When C# code inherits from a type defined in F# (e.g. SqliteBookStore extends BookStore from Interfaces.fs), the C# extractor creates a stub node with an empty source_file. This stub disconnects the inheritance edge from the real F# definition.

Add a post-extraction pass that merges stub nodes into real definitions by matching labels. Prioritize definition files (Interfaces.fs, Domain.fs, Types.fs) so inherits edges point to abstract types rather than implementation classes.

Made-with: Cursor
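The scenario in the commit message can be reproduced on a toy graph. The sketch below is a standalone restatement of the merge pass, not the code as it sits in `extract()`: the function name `merge_stub_nodes`, the node ids, and the `relation` edge key are assumptions made for illustration; only the `id`, `label`, `source_file`, `source`, and `target` keys appear in the diff.

```python
from pathlib import Path

# Same stems the commit treats as "definition files".
DEFINITION_STEMS = {"interfaces", "domain", "types", "contracts", "abstractions"}


def _norm(label: str) -> str:
    """Normalize a label the way the commit does: drop outer parens, lowercase."""
    return label.strip("()").lower()


def merge_stub_nodes(nodes: list[dict], edges: list[dict]) -> tuple[list[dict], list[dict]]:
    """Hypothetical helper: fold stub nodes (empty source_file) into real ones."""
    # Pass 1: pick one real node id per normalized label.
    real_by_label: dict[str, str] = {}
    for n in nodes:
        sf = n.get("source_file", "")
        if not sf:
            continue
        lbl = _norm(n["label"])
        if lbl not in real_by_label or Path(sf).stem.lower() in DEFINITION_STEMS:
            real_by_label[lbl] = n["id"]

    # Pass 2: map every stub to the real node that shares its label.
    stub_to_real: dict[str, str] = {}
    for n in nodes:
        if not n.get("source_file"):
            real = real_by_label.get(_norm(n["label"]))
            if real and real != n["id"]:
                stub_to_real[n["id"]] = real

    # Pass 3: drop merged stubs and repoint edge endpoints.
    kept = [n for n in nodes if n["id"] not in stub_to_real]
    rewired = [
        {**e,
         "source": stub_to_real.get(e["source"], e["source"]),
         "target": stub_to_real.get(e["target"], e["target"])}
        for e in edges
    ]
    return kept, rewired


# Toy graph: the C# class inherits from a stub that should be the F# type.
nodes = [
    {"id": "fs_bookstore", "label": "BookStore", "source_file": "Interfaces.fs"},
    {"id": "cs_sqlite", "label": "SqliteBookStore", "source_file": "SqliteBookStore.cs"},
    {"id": "cs_stub_bookstore", "label": "BookStore", "source_file": ""},
]
edges = [{"source": "cs_sqlite", "target": "cs_stub_bookstore", "relation": "inherits"}]

merged_nodes, merged_edges = merge_stub_nodes(nodes, edges)
```

Unlike the committed code, this sketch returns new lists instead of mutating edges in place, which makes it easier to compare the graph before and after the merge.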
1 parent 5843ffc commit e1424f6

1 file changed

Lines changed: 39 additions & 0 deletions

graphify/extract.py

@@ -3342,6 +3342,45 @@ def extract(paths: list[Path], cache_root: Path | None = None) -> dict:
         import logging
         logging.getLogger(__name__).warning("Java cross-file import resolution failed, skipping: %s", exc)
 
+    # ── Cross-language node merge ─────────────────────────────────────────────
+    # C# extractors create stub nodes (empty source_file) for base types that
+    # may actually be defined in F# files (or other C# files). Merge stubs
+    # into real definitions so edges point to the canonical node.
+    #
+    # Priority: prefer nodes from definition files (Interfaces.fs, Domain.fs)
+    # over implementation files, so inherits edges point to abstract types.
+    _DEFINITION_FILES = {"interfaces", "domain", "types", "contracts", "abstractions"}
+
+    real_by_label: dict[str, str] = {}
+    for n in all_nodes:
+        sf = n.get("source_file", "")
+        if sf:
+            lbl = n["label"].strip("()").lower()
+            stem_lower = Path(sf).stem.lower()
+            existing = real_by_label.get(lbl)
+            if existing is None:
+                real_by_label[lbl] = n["id"]
+            elif stem_lower in _DEFINITION_FILES:
+                real_by_label[lbl] = n["id"]
+
+    stub_ids: set[str] = set()
+    stub_to_real: dict[str, str] = {}
+    for n in all_nodes:
+        if not n.get("source_file"):
+            lbl = n["label"].strip("()").lower()
+            real_nid = real_by_label.get(lbl)
+            if real_nid and real_nid != n["id"]:
+                stub_to_real[n["id"]] = real_nid
+                stub_ids.add(n["id"])
+
+    if stub_to_real:
+        all_nodes = [n for n in all_nodes if n["id"] not in stub_ids]
+        for e in all_edges:
+            if e["source"] in stub_to_real:
+                e["source"] = stub_to_real[e["source"]]
+            if e["target"] in stub_to_real:
+                e["target"] = stub_to_real[e["target"]]
+
     # Cross-file call resolution for all languages
     # Each extractor saved unresolved calls in raw_calls. Now that we have all
     # nodes from all files, resolve any callee that exists in another file.
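The priority rule in the first loop has a few consequences that are easy to miss when reading the hunk. The sketch below isolates that loop in a hypothetical helper, `pick_canonical` (the name and the sample node ids are assumptions, not part of the commit), and exercises it on label collisions: a definition-file node wins regardless of iteration order, the first node seen wins when no candidate comes from a definition file, and the last one seen wins when several do. Labels are matched case-insensitively after stripping outer parentheses.

```python
from pathlib import Path

DEFINITION_STEMS = {"interfaces", "domain", "types", "contracts", "abstractions"}


def pick_canonical(nodes: list[dict]) -> dict[str, str]:
    """Hypothetical restatement of the first loop: normalized label -> node id."""
    real_by_label: dict[str, str] = {}
    for n in nodes:
        sf = n.get("source_file", "")
        if not sf:
            continue  # stubs never become canonical
        lbl = n["label"].strip("()").lower()
        stem_lower = Path(sf).stem.lower()
        if real_by_label.get(lbl) is None:
            real_by_label[lbl] = n["id"]      # first real node for this label
        elif stem_lower in DEFINITION_STEMS:
            real_by_label[lbl] = n["id"]      # definition file overrides


    return real_by_label


impl = {"id": "impl", "label": "BookStore", "source_file": "src/SqliteBookStore.cs"}
iface = {"id": "iface", "label": "BookStore", "source_file": "src/Interfaces.fs"}
domain = {"id": "domain", "label": "BookStore", "source_file": "src/Domain.fs"}
other = {"id": "other", "label": "bookstore()", "source_file": "src/Helpers.cs"}

# A definition file wins in either iteration order.
impl_then_iface = pick_canonical([impl, iface])
iface_then_impl = pick_canonical([iface, impl])

# No definition file among the candidates: the first node seen is kept.
first_wins = pick_canonical([impl, other])

# Several definition files: the last one seen replaces the earlier ones.
last_definition_wins = pick_canonical([iface, domain])
```

Because matching is by bare label only, two unrelated types that share a name in different namespaces would collapse onto the same key; the diff does not show any namespace-aware disambiguation.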
