Commit 1989f50

fix(kb): scope generation lock by doc_id/parse_hash and surface embed cleanup failures (PR #202)
Replace source_path ingestion locks with cross-process generation_ingestion_lock keyed by (collection, doc_id, parse_hash, user scope), held through chunk replace and embedding writes. Raise DatabaseOperationError when embeddings_* cascade delete fails. Add regression tests for empty replace_scope, partial cleanup failures, and per-user lock isolation. Refs: #199, #438
1 parent 376dbeb commit 1989f50

7 files changed

Lines changed: 877 additions & 528 deletions
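The commit message says a failed `embeddings_*` cascade delete now raises `DatabaseOperationError` instead of passing silently. A minimal sketch of that behaviour follows, under stated assumptions: the exception class here is a stand-in for the project's own, and `cascade_delete_embeddings` and the `deleters` mapping are hypothetical names used only for illustration. Whether the real code keeps trying the remaining tables after one fails is also an assumption.

```python
from typing import Callable, Dict, List


class DatabaseOperationError(RuntimeError):
    """Stand-in for the project's exception of the same name (assumed shape)."""


def cascade_delete_embeddings(
    doc_id: str,
    deleters: Dict[str, Callable[[str], int]],
) -> int:
    """Delete embeddings for doc_id from every table; raise if any table fails.

    `deleters` maps a table name (e.g. "embeddings_a") to a hypothetical
    callable that deletes rows for doc_id and returns the deleted row count.
    """
    deleted = 0
    failures: List[str] = []
    for table, delete in deleters.items():
        try:
            deleted += delete(doc_id)
        except Exception as exc:  # collect the failure, keep cleaning other tables
            failures.append(f"{table}: {exc}")
    if failures:
        # Surface partial cleanup instead of silently leaving orphaned rows.
        raise DatabaseOperationError(
            f"embedding cleanup failed for {doc_id}: " + "; ".join(failures)
        )
    return deleted
```

Collecting failures before raising means one bad table does not stop cleanup of the others, while the caller still learns that cleanup was only partial.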

src/xagent/core/tools/core/RAG_tools/chunk/chunk_document.py

Lines changed: 4 additions & 4 deletions
@@ -88,10 +88,10 @@ def chunk_document(
 Chunk parsed paragraphs and write to chunks table.
 
 Concurrency note:
-This function uses **last-write-wins** semantics. The ingestion
-pipeline serialises calls per (collection, source_path) via
-``_get_ingestion_lock``, so concurrent chunk replacement races are
-prevented at the pipeline level.
+Chunk replacement is serialised per
+``(collection, doc_id, parse_hash, user scope)`` via
+``generation_ingestion_lock`` (in ``replace_chunks`` and the ingestion
+pipeline through embedding writes).
 
 Args:
 collection: Collection name for data isolation
