Open
Labels
bug (Something isn't working), triage (Issue needs to be triaged/prioritized)
Description
Bug Description
When legacy node JSON is loaded via legacy_json_to_doc, the doc_id stored in the legacy payload is not preserved; a new UUID is generated instead. This breaks backward compatibility for users restoring or migrating persisted stores that depend on stable node IDs (e.g., docstore lookups and relationship resolution).
Version
llama-index-core 0.14.15
Steps to Reproduce
from llama_index.core.constants import DATA_KEY, TYPE_KEY
from llama_index.core.schema import Document
from llama_index.core.storage.docstore.utils import legacy_json_to_doc
doc_dict = {
    TYPE_KEY: Document.get_type(),
    DATA_KEY: {
        "text": "hello",
        "extra_info": {},
        "doc_id": "doc-123",
        "relationships": {},
    },
}
loaded = legacy_json_to_doc(doc_dict)
print("expected:", "doc-123")
print("actual: ", loaded.id_)
assert loaded.id_ == "doc-123", "BUG: legacy loader lost persisted doc_id"
Relevant Logs/Tracebacks
expected: doc-123
actual: b980379e-b733-49dd-8cfc-4e5bfed7fca5
---------------------------------------------------------------------------
AssertionError Traceback (most recent call last)
/tmp/ipython-input-767892816.py in <cell line: 0>()
18 print("actual: ", loaded.id_)
19
---> 20 assert loaded.id_ == "doc-123", "BUG: legacy loader lost persisted doc_id"
AssertionError: BUG: legacy loader lost persisted doc_id
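To make the expected behavior concrete without depending on llama-index internals, here is a minimal, self-contained sketch. Everything in it is hypothetical: StubDoc, load_legacy_lossy, and load_legacy_preserving are illustrative stand-ins, not the library's classes or its actual loader code. It assumes only what the report shows: that the legacy payload carries "text", "extra_info", and "doc_id", and that a document gets a fresh UUID when no ID is supplied. The root cause inside legacy_json_to_doc is not confirmed by this report.

```python
import uuid
from dataclasses import dataclass, field
from typing import Any, Dict


# Hypothetical stand-in for a document class; NOT llama-index's Document.
@dataclass
class StubDoc:
    text: str = ""
    metadata: Dict[str, Any] = field(default_factory=dict)
    # A fresh UUID is generated whenever no ID is supplied.
    id_: str = field(default_factory=lambda: str(uuid.uuid4()))


def load_legacy_lossy(data: Dict[str, Any]) -> StubDoc:
    """Mimics the reported symptom: doc_id is never forwarded."""
    return StubDoc(text=data.get("text", ""), metadata=data.get("extra_info") or {})


def load_legacy_preserving(data: Dict[str, Any]) -> StubDoc:
    """Forwards the persisted doc_id when present, else keeps the new UUID."""
    doc = StubDoc(text=data.get("text", ""), metadata=data.get("extra_info") or {})
    legacy_id = data.get("doc_id")
    if legacy_id is not None:
        doc.id_ = str(legacy_id)
    return doc


# Same payload shape as the DATA_KEY entry in the reproduction above.
legacy_payload = {
    "text": "hello",
    "extra_info": {},
    "doc_id": "doc-123",
    "relationships": {},
}

lossy = load_legacy_lossy(legacy_payload)
kept = load_legacy_preserving(legacy_payload)
print("lossy loader kept doc_id:", lossy.id_ == "doc-123")
print("preserving loader kept doc_id:", kept.id_ == "doc-123")
```

The preserving variant falls back to a generated UUID only when the payload has no doc_id, which is the behavior the assertion in the reproduction expects.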