Commit 9fa183e

[CAI-718] Fix doc id when adding missing folders to vector index (#1963)
* Fix getting reference folder dirnames that belong to static documents
* Update schedule time to 22:00 for the workflow
* Add changeset
1 parent 16a13c2 commit 9fa183e

3 files changed: 15 additions and 4 deletions

.changeset/floppy-ears-stare.md

Lines changed: 5 additions & 0 deletions
@@ -0,0 +1,5 @@
+---
+"chatbot-index": patch
+---
+
+Fix to consider only docIDs that belong to static docs when adding missing folders md files to vector index
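
The changeset describes restricting folder extraction to doc IDs that belong to static docs. A minimal sketch of that filter follows. The value of `DOCS_PARENT_FOLDER`, the helper name `static_folders`, and the sample IDs are all hypothetical, since the commit imports the constant without showing its value or the real ID layout.

```python
# Hypothetical value: the commit imports DOCS_PARENT_FOLDER but does not show it.
DOCS_PARENT_FOLDER = "devportal-docs/docs"


def static_folders(ref_doc_ids, parent_folder=DOCS_PARENT_FOLDER):
    """Return the sorted, de-duplicated third path segment of the IDs
    that contain parent_folder. IDs without it are skipped entirely."""
    folders = {
        doc_id.split("/")[2]
        for doc_id in ref_doc_ids
        if parent_folder in doc_id  # evaluated before the split is indexed
    }
    return sorted(folders)


# Made-up IDs for illustration only.
sample_ids = [
    "devportal-docs/docs/folder-a/README.md",
    "devportal-docs/docs/folder-a/guide/intro.md",
    "devportal-docs/docs/folder-b/README.md",
    "/api/reference",  # unfiltered, split("/")[2] would yield "reference"
    "/home",           # unfiltered, split("/")[2] would raise IndexError
]

print(static_folders(sample_ids))  # ['folder-a', 'folder-b']
```

Without the `if` clause, the fourth ID would add a bogus folder name and the fifth would abort the run, which is the kind of failure the filter is meant to avoid.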

.github/workflows/chatbot_refresh_vector_index.yaml

Lines changed: 1 addition & 1 deletion
@@ -25,7 +25,7 @@ on:
           - add missing static

   schedule:
-    - cron: '0 1 * * *' # Run daily at 01:00 AM UTC (03:00 AM CEST) to refresh the index in prod
+    - cron: '0 22 * * *' # Run daily at 10:00 PM UTC (11:00 PM CEST) to refresh the index in prod

 permissions:
   id-token: write
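
The local times quoted in the cron comments can be checked with fixed UTC offsets. This sketch assumes only that CET is UTC+1 and CEST is UTC+2; the helper name `local_times` and the sample date are illustrative and not part of the workflow.

```python
from datetime import datetime, timedelta, timezone

# Fixed offsets: CET is UTC+1, CEST is UTC+2.
ZONES = [
    ("CET", timezone(timedelta(hours=1))),
    ("CEST", timezone(timedelta(hours=2))),
]


def local_times(hour_utc, minute_utc=0):
    """Map a UTC wall-clock time to HH:MM strings in CET and CEST."""
    # The date is arbitrary; only the time of day matters with fixed offsets.
    run = datetime(2024, 7, 1, hour_utc, minute_utc, tzinfo=timezone.utc)
    return {name: run.astimezone(tz).strftime("%H:%M") for name, tz in ZONES}


print(local_times(1))   # {'CET': '02:00', 'CEST': '03:00'}
print(local_times(22))  # {'CET': '23:00', 'CEST': '00:00'}
```

Under these offsets the old schedule matches its comment (03:00 CEST), while 22:00 UTC corresponds to 23:00 in CET and to midnight in CEST, so the "11:00 PM" in the new comment holds for CET rather than CEST.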

apps/chatbot-index/src/modules/add_missing_static_docs.py

Lines changed: 9 additions & 3 deletions
@@ -3,6 +3,7 @@
     get_one_metadata_from_s3,
     get_folders_list,
     StaticMetadata,
+    DOCS_PARENT_FOLDER,
 )
 from src.modules.vector_index import DiscoveryVectorIndex
 from src.modules.settings import SETTINGS
@@ -24,7 +25,9 @@
     index = VECTOR_INDEX.get_index()
     ref_doc_info = index.storage_context.docstore.get_all_ref_doc_info()
     ref_doc_ids = list(ref_doc_info.keys())
-    ref_folders = [doc_id.split("/")[2] for doc_id in ref_doc_ids]
+    ref_folders = [
+        doc_id.split("/")[2] for doc_id in ref_doc_ids if DOCS_PARENT_FOLDER in doc_id
+    ]
     ref_folders = list(set(ref_folders))

     static_docs_to_add = []
@@ -57,6 +60,9 @@
             folders_to_remove.append(ref_folder)

     if index:
-        VECTOR_INDEX.refresh_index_static_docs(index, static_docs_to_add, [])
-        VECTOR_INDEX.remove_docs_in_folder(index, folders_to_remove)
+        if static_docs_to_add:
+            VECTOR_INDEX.refresh_index_static_docs(index, static_docs_to_add, [])
+        if folders_to_remove:
+            for folder_to_remove in folders_to_remove:
+                VECTOR_INDEX.remove_docs_in_folder(index, folder_to_remove)
     LOGGER.info("Static docs refresh process completed.")
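
The last hunk skips the refresh call when there is nothing to add and passes one folder per removal call instead of the whole list. The sketch below shows that control flow against a recording stub. `RecordingIndex` and `apply_changes` are hypothetical names, and the stub only mirrors the two method names and argument orders visible in the diff, not the real `DiscoveryVectorIndex` behavior.

```python
class RecordingIndex:
    """Hypothetical stand-in for the vector index wrapper; it only records calls."""

    def __init__(self):
        self.calls = []

    def refresh_index_static_docs(self, index, docs_to_add, docs_to_remove):
        self.calls.append(("refresh", list(docs_to_add)))

    def remove_docs_in_folder(self, index, folder):
        self.calls.append(("remove", folder))


def apply_changes(vector_index, index, static_docs_to_add, folders_to_remove):
    """Call the index only when there is work to do, one folder per removal."""
    if not index:
        return
    if static_docs_to_add:
        vector_index.refresh_index_static_docs(index, static_docs_to_add, [])
    # Iterating an empty list is a no-op, so no separate emptiness check is needed.
    for folder in folders_to_remove:
        vector_index.remove_docs_in_folder(index, folder)


recorder = RecordingIndex()
apply_changes(recorder, object(), [], ["folder-a", "folder-b"])
print(recorder.calls)  # [('remove', 'folder-a'), ('remove', 'folder-b')]
```

With an empty `static_docs_to_add`, no refresh is recorded, and each folder name reaches `remove_docs_in_folder` as a single string.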
