[Upstream #145] Duplicate entity nodes in the knowledge graph #2

@ivanzud

Description

Upstream Issue Mirror

Summary

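## Problem description

When building a knowledge graph with MiroFish, Zep identifies the same real-world entity as multiple distinct nodes. For example, after ingesting text that mentions "特朗普" (Trump), the graph contains two separate nodes, "特朗普" and "美国总统特朗普" (US President Trump), each with its own edges and relations.

This causes:

- Information about a single entity is scattered across multiple nodes
- Subsequent simulation runs operate on incomplete entity relations, which reduces accuracy
- Redundant nodes appear in the graph visualization, which reduces readability

## Steps to reproduce

1. Prepare a background text that refers to the same person or organization by different names
2. Build the knowledge graph through the normal frontend flow
3. Inspect the generated graph; the same entity is split into multiple nodes

## Screenshots

<img width="675" height="399" alt="Image" src="https://github.com/user-attachments/assets/593f4188-e766-46b3-9b88-25486…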
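The symptom can be illustrated with a small sketch. The snapshot below is hypothetical: the `uuid`, `name`, `source`, `target`, and `fact` keys and the sample facts are assumptions made for illustration, not the actual Zep payload shape. It only shows how facts about one person end up split across two node UUIDs.

```python
from collections import defaultdict

# Hypothetical graph snapshot: n1 and n2 refer to the same person.
nodes = [
    {"uuid": "n1", "name": "特朗普"},
    {"uuid": "n2", "name": "美国总统特朗普"},
    {"uuid": "n3", "name": "白宫"},
]
edges = [
    {"source": "n1", "target": "n3", "fact": "works at"},
    {"source": "n2", "target": "n3", "fact": "resides in"},
]


def facts_by_node(edge_list):
    """Group outgoing facts by the UUID of their source node."""
    grouped = defaultdict(list)
    for edge in edge_list:
        grouped[edge["source"]].append(edge["fact"])
    return dict(grouped)


# Each duplicate node holds only part of what is known about the entity.
split_facts = facts_by_node(edges)
```

A consumer that looks up only `n1` would miss the fact attached to `n2`, which is the incompleteness the issue describes.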

Local Coverage

  • Status: partial
  • Summary: Repo-native partial mitigations have landed locally across simulation inputs, backend graph/report/search/statistics/detail surfaces, raw graph introspection, node-edge introspection, textual tool output, both shipped graph renderers, and the visible frontend graph counters and logs.
    Backend: ZepEntityReader.filter_defined_entities() collapses obvious same-entity alias variants before simulation/profile generation. ZepEntityReader.get_entity_with_context() merges alias-linked relations and related nodes for the entity-detail API. backend/app/services/graph_builder.py collapses the same conservative alias pairs when serving /api/graph/data/<graph_id> and remaps duplicate edges to the retained node UUID. backend/app/services/zep_tools.py collapses those aliases when building typed entity lists, raw node/edge introspection payloads, Panorama output, InsightForge entity/relationship summaries, QuickSearch/general search results, graph statistics, node-edge lookups, entity summaries (including relations attached only to alias UUIDs), and NodeInfo.to_text() output. It preserves merged alias_names metadata so callers and downstream prompts can still see which labels were folded together.
    Frontend: frontend/src/views/processGraphData.js and the shared frontend/src/components/GraphPanel.vue renderer both collapse the aliases while rendering graph data. The Process and GraphPanel node detail drawers expose the folded non-canonical aliases via frontend/src/components/graphAliasDetails.js. frontend/src/components/graphPanelData.js drives deduplicated Step 1 / Process counters and MainView refresh logs, so title-prefixed duplicates such as 美国总统特朗普 vs 特朗普 no longer appear twice in the graph or its visible counts.
    Remaining: Full graph-level persisted deduplication is still tracked under beads issue mirofish-975, because upstream PR "feat: add entity deduplication after graph building" (666ghj/MiroFish#141) is not safe to cherry-pick wholesale.
  • Local refs: .beads/issues.jsonl, docs/upstream-triage.md, origin/mirror/upstream-pr-141, backend/app/services/zep_entity_reader.py, backend/tests/test_zep_entity_reader.py, backend/app/services/graph_builder.py, backend/tests/test_graph_builder.py, backend/app/services/zep_tools.py, backend/tests/test_zep_tools_dedup.py, backend/tests/test_zep_tools_i18n.py, frontend/src/components/GraphPanel.vue, frontend/src/components/graphAliasDetails.js, frontend/src/components/graphPanelData.js, frontend/src/components/Step1GraphBuild.vue, frontend/tests/graphAliasDetails.test.mjs, frontend/tests/graphPanelData.test.mjs, frontend/src/views/MainView.vue, frontend/src/views/Process.vue, frontend/src/views/processGraphData.js, frontend/tests/processGraphData.test.mjs
  • Notes: Mirrored into fork issue tracking on March 11, 2026 after enabling issues on ivanzud/MiroFish.
    • 2026-03-11: Added a third repo-native partial mitigation in backend report/search tooling so conservative alias pairs are collapsed before Step 4/analysis surfaces render entity summaries and relationship chains.
    • 2026-03-11: Added a fourth repo-native partial mitigation in backend graph-data responses so /api/graph/data/<graph_id> collapses the same obvious alias pairs and remaps duplicate edges without mutating stored Zep nodes.
    • 2026-03-11: Added a fifth repo-native partial mitigation in QuickSearch/general search results so search_graph() and the local-search fallback collapse the same obvious alias pairs, deduplicate summary facts, and remap duplicate edges before report tooling or API callers consume the result payload.
    • 2026-03-11: Added a seventh repo-native partial mitigation in backend graph statistics and entity-summary helpers so alias duplicates no longer inflate stats counts and alias queries resolve to the canonical merged entity payload.
    • 2026-03-11: Added an eighth repo-native partial mitigation in backend entity summaries so alias-linked relations are collected from the full edge set before canonical remapping, which prevents summaries from dropping edges attached only to an alias UUID.
    • 2026-03-11: Added a ninth repo-native partial mitigation in backend entity-detail reads so get_entity_with_context() resolves the requested node against its alias group, includes alias-linked relations, and deduplicates related-node payloads.
    • 2026-03-11: Added a tenth repo-native partial mitigation in zep_tools node-edge lookups so get_node_edges() now resolves the requested node against its alias group and remaps duplicate edges instead of dropping relationships attached only to an alias UUID.
    • 2026-03-11: Added an eleventh repo-native partial mitigation in raw zep_tools graph introspection so get_all_nodes() and get_all_edges() now collapse obvious alias duplicates and remap duplicate edge payloads before report-side callers consume the graph snapshot.
    • 2026-03-11: Added a fourteenth repo-native partial mitigation in textual zep_tools node rendering so NodeInfo.to_text() now includes merged alias names with locale-aware labels, preserving the folded names in downstream report/runtime prompts.
    • 2026-03-11: Added a fifteenth repo-native partial mitigation in the shared frontend graph renderer so frontend/src/components/GraphPanel.vue now reuses the conservative display-only alias-collapse mapper and no longer shows obvious duplicate entities outside the Process view.
    • 2026-03-11: Added a sixteenth repo-native partial mitigation in the Process and shared GraphPanel node detail drawers so frontend/src/components/graphAliasDetails.js filters merged alias_names down to the folded non-canonical labels, and both detail panels render them explicitly instead of hiding which source names collapsed into the canonical node.
    • 2026-03-11: Added a seventeenth repo-native partial mitigation in frontend graph stats so frontend/src/components/graphPanelData.js now drives Step 1 / Process counters and MainView refresh logs through the same alias-collapse mapping as the visible graph renderer.
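The read-time alias collapse described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code in graph_builder.py or zep_tools.py: the `TITLE_PREFIXES` list, the rule that the shorter un-prefixed name is the canonical one, the dict-based node/edge shape, and the helper names `canonical_name`, `collapse_aliases`, and `folded_aliases` are all hypothetical. Only the `alias_names` field, the edge remapping to the retained UUID, and the non-mutating behavior come from the notes.

```python
# Hypothetical title prefixes; the repo's conservative rule set may differ.
TITLE_PREFIXES = ("美国总统", "总统", "President ")


def canonical_name(name, known_names):
    """Strip known title prefixes while the remainder is itself a node name."""
    changed = True
    while changed:
        changed = False
        for prefix in TITLE_PREFIXES:
            bare = name[len(prefix):]
            if name.startswith(prefix) and bare and bare in known_names:
                name, changed = bare, True
                break
    return name


def collapse_aliases(nodes, edges):
    """Return merged copies of nodes and edges; the inputs are not mutated."""
    names = {n["name"] for n in nodes}
    retained = {}  # canonical name -> merged node copy
    uuid_map = {}  # original UUID -> retained UUID
    # Pass 1: keep one node per canonical name.
    for n in nodes:
        if canonical_name(n["name"], names) == n["name"]:
            keeper = retained.setdefault(
                n["name"], {**n, "alias_names": [n["name"]]}
            )
            uuid_map[n["uuid"]] = keeper["uuid"]
    # Pass 2: fold title-prefixed variants into their canonical node.
    for n in nodes:
        canon = canonical_name(n["name"], names)
        if canon != n["name"]:
            keeper = retained[canon]
            if n["name"] not in keeper["alias_names"]:
                keeper["alias_names"].append(n["name"])
            uuid_map[n["uuid"]] = keeper["uuid"]
    # Remap edge endpoints to retained UUIDs and drop exact duplicates.
    merged_edges, seen = [], set()
    for e in edges:
        key = (
            uuid_map.get(e["source"], e["source"]),
            uuid_map.get(e["target"], e["target"]),
            e["fact"],
        )
        if key not in seen:
            seen.add(key)
            merged_edges.append(
                {"source": key[0], "target": key[1], "fact": key[2]}
            )
    return list(retained.values()), merged_edges


def folded_aliases(node):
    """Labels merged into a node, excluding its own canonical name."""
    return [a for a in node.get("alias_names", []) if a != node["name"]]


nodes = [
    {"uuid": "n1", "name": "特朗普"},
    {"uuid": "n2", "name": "美国总统特朗普"},
    {"uuid": "n3", "name": "白宫"},
]
edges = [
    {"source": "n1", "target": "n3", "fact": "works at"},
    {"source": "n2", "target": "n3", "fact": "works at"},
    {"source": "n2", "target": "n3", "fact": "resides in"},
]
merged_nodes, merged_edges = collapse_aliases(nodes, edges)
```

Because the collapse builds copies at read time, the stored graph still contains both nodes; that is why persisted deduplication remains tracked separately.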
