RHAI-ENG-1550 - FAISS vector store integration docs #1046
Walkthrough
The changes introduce FAISS as an alternative vector database backend in the Llama Stack documentation. A new overview module explains Inline FAISS functionality, and six existing documentation modules are updated to include FAISS deployment, ingestion, and querying examples alongside the Milvus options. Package references are also updated to `llama_stack_client`.
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~20 minutes
Pre-merge checks: ✅ Passed checks (3 passed)
Actionable comments posted: 1
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (7)
- assemblies/deploying-a-rag-stack-in-a-data-science-project.adoc (1 hunks)
- modules/deploying-a-llamastackdistribution-instance.adoc (2 hunks)
- modules/ingesting-content-into-a-llama-model.adoc (2 hunks)
- modules/overview-of-faiss-vector-databases.adoc (1 hunks)
- modules/overview-of-vector-databases.adoc (1 hunks)
- modules/preparing-documents-with-docling-for-llama-stack-retrieval.adoc (3 hunks)
- modules/querying-ingested-content-in-a-llama-model.adoc (6 hunks)
🔇 Additional comments (12)
modules/ingesting-content-into-a-llama-model.adoc (1)
112-126: Well-structured Option 3 FAISS inline addition.
The new Option 3 FAISS block is properly formatted, with a consistent `provider_id`, environment variable naming, and confirmation output. It aligns well with the existing Milvus options and maintains the established pattern for vector store registration examples.
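For readers without the diff open, the registration pattern this comment describes can be sketched roughly as follows. This is an illustration, not the PR's exact code: it assumes the client exposes `vector_dbs.register` with these keyword arguments (check this against the installed `llama_stack_client` version), and the helper names `build_faiss_registration` and `register_vector_db` are hypothetical.

```python
def build_faiss_registration(vector_db_id, embedding_model, embedding_dimension):
    """Return keyword arguments for registering an inline FAISS vector database."""
    return {
        "vector_db_id": vector_db_id,
        "embedding_model": embedding_model,
        "embedding_dimension": embedding_dimension,
        "provider_id": "faiss",  # provider ID named in the review comment
    }


def register_vector_db(client, **kwargs):
    """Register the vector database and print a confirmation line."""
    # Assumption: client.vector_dbs.register accepts these keyword arguments.
    client.vector_dbs.register(**kwargs)
    print(
        f"Registered vector database {kwargs['vector_db_id']!r} "
        f"with provider {kwargs['provider_id']!r}"
    )
    return kwargs["vector_db_id"]
```

Keeping the argument construction separate from the call makes it easy to compare the FAISS option with the Milvus options, which per the review differ only in the provider ID.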
modules/overview-of-faiss-vector-databases.adoc (1)
1-21: Comprehensive and well-positioned overview document.
The new FAISS overview clearly explains use cases, benefits, and production considerations. The NOTE block appropriately emphasizes that Inline FAISS is for experimental and testing use only, steering production workloads to Milvus. Structure and messaging align well with related documentation updates across the PR.
assemblies/deploying-a-rag-stack-in-a-data-science-project.adoc (1)
17-17: Clean assembly updates for FAISS integration.
The generalized phrasing ("Ingest domain data into a vector database") decouples the assembly from specific database choices, and the offset adjustments (+1 → +2) establish a consistent hierarchy, with the Milvus and FAISS overview modules as sibling sections. Include order and structure are sound.
Also applies to: 22-23
modules/overview-of-vector-databases.adoc (1)
8-33: Clear and balanced vector database decision guidance.
The three-option overview (Inline Milvus Lite, Inline FAISS, Remote Milvus) with explicit use-case guidance (lines 29–33) gives users clear decision points. The FAISS description (lines 20–21) accurately highlights SQLite persistence and zero network dependency, and the decision criteria appropriately guide users toward production-grade Remote Milvus for scaling while presenting FAISS as a valid testing path.
modules/preparing-documents-with-docling-for-llama-stack-retrieval.adoc (3)
46-46: Consistent package updates throughout file.
References to `llama_stack_client` (lines 46, 50, 53) and classes like `Agent`, `AgentEventLogger`, and `LlamaStackClient` are updated consistently and correctly.
Also applies to: 50-50, 53-53
118-137: Well-structured Option 3 FAISS addition.
The new Option 3 block (lines 118–137) maintains consistency with Options 1 and 2, correctly uses `provider_id="faiss"`, and includes the confirmation print statement. Placement and formatting align with the established pattern.
174-217: Clear separation of RAG query methods in Verification section.
The split into a low-level RAG tool query (lines 174–206) and a direct vector database query (lines 208–217) improves clarity and gives users both integration patterns. Both code examples are correct and well-commented.
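As a rough illustration of the two verification patterns named above, the sketch below wraps each in a small function. It is not the code from the PR: it assumes the client exposes `tool_runtime.rag_tool.query` and `vector_io.query` with the keyword arguments shown, which should be checked against the installed `llama_stack_client` version, and the function names are hypothetical.

```python
def query_rag_tool(client, vector_db_id, question):
    """Low-level RAG tool query: retrieval through the RAG tool runtime."""
    # Assumption: rag_tool.query takes content= and a list of vector_db_ids=.
    return client.tool_runtime.rag_tool.query(
        content=question,
        vector_db_ids=[vector_db_id],
    )


def query_vector_db(client, vector_db_id, question):
    """Direct vector database query that bypasses the RAG tool."""
    # Assumption: vector_io.query takes vector_db_id= and query=.
    return client.vector_io.query(
        vector_db_id=vector_db_id,
        query=question,
    )
```

Both functions take the same inputs, so a verification step can run them side by side against one vector database and compare what each path returns.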
modules/deploying-a-llamastackdistribution-instance.adoc (2)
6-10: Clear abstract revision with three deployment examples.
The updated abstract (lines 6–10) clearly describes the three LlamaStackDistribution examples (Inline Milvus, Remote Milvus, Inline FAISS), setting expectations for the reader.
235-302: Comprehensive Example C for Inline FAISS deployment.
Example C is well-structured with proper YAML formatting, clear environment variable setup, and comprehensive documentation. The `FAISS_KVSTORE_DB_PATH` configuration (lines 284–285) and supporting environment variables (lines 287–291) are clearly documented. The NOTE (lines 298–301) appropriately warns about experimental use and SQLite accessibility.
However, verify the CPU resource limit formatting:
- Example A (line 111) and Example B (line 188) use unquoted integers: `cpu: 4`
- Example C (line 256) uses a quoted string: `cpu: "8"`
Confirm whether the `"8"` (quoted) vs `4` (unquoted) and the doubling of CPU limits for FAISS are intentional or should be aligned with Examples A and B for consistency.
modules/querying-ingested-content-in-a-llama-model.adoc (3)
33-33: Consistent llama_stack_client references throughout.
Package installation (line 33) and imports (lines 40, 44) are all correctly updated to `llama_stack_client`, maintaining consistency with PR objectives.
Also applies to: 40-40, 44-44
126-140: Well-formatted Option 3 Inline FAISS addition.
The new Option 3 FAISS block correctly uses `provider_id="faiss"` and maintains consistency with Options 1 and 2. The vector database registration pattern is identical across all options.
151-195: Effective separation of low-level and high-level query patterns.
Two complementary query approaches are now documented: the low-level RAG tool (lines 151–162) for one-off lookups and the high-level Agent API (lines 164–195) for conversational workflows. Both examples are correct, well-commented, and demonstrate practical usage patterns with proper session and streaming management.
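On the CPU limit question raised for Example C in modules/deploying-a-llamastackdistribution-instance.adoc: Kubernetes accepts a resource quantity written either as a number or as a string, so `cpu: 4` and `cpu: "8"` differ in style and in value, not in validity. The sketch below makes that comparison concrete. It is a simplified, hypothetical helper that handles only plain numbers and the `m` (millicore) suffix, not the full quantity grammar.

```python
def cpu_to_millicores(value):
    """Convert a CPU quantity (int, float, or str) to millicores.

    Simplified: supports plain numbers ("8", 4, "0.5") and the "m" suffix ("500m").
    """
    text = str(value).strip()
    if text.endswith("m"):
        return int(text[:-1])
    return round(float(text) * 1000)


# Examples A and B use an unquoted integer; Example C uses a quoted string.
print(cpu_to_millicores(4))    # 4000
print(cpu_to_millicores("8"))  # 8000
```

Both forms parse to a comparable quantity, so the quoting is a consistency matter, while the jump from 4 to 8 CPUs is the part that needs a deliberate decision.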
====
[NOTE]
Use inline Milvus Lite for development and small datasets. Persistence and scale are limited compared to remote Milvus.
====
====
Fix malformed AsciiDoc structure in Option 1.
Line 89 contains an extra ==== marker that breaks the AsciiDoc syntax. Compare Option 1 (lines 85–89) with Option 2 (lines 92–110): Option 2's NOTE block uses consistent [NOTE] / ==== / content / ==== formatting, but Option 1 has a dangling ==== on line 89 that will likely render incorrectly.
Apply this diff to fix the structure:
----
====
-[NOTE]
+[NOTE]
Use inline Milvus Lite for development and small datasets. Persistence and scale are limited compared to remote Milvus.
-====
-====
+====
🤖 Prompt for AI Agents
In modules/ingesting-content-into-a-llama-model.adoc around lines 85 to 89, the
NOTE block for Option 1 has a dangling closing "====" on line 89 that breaks
AsciiDoc structure; remove the extra "====" so the block uses the correct [NOTE]
/ ==== / content / ==== pairing (i.e., keep the initial "====", the note
content, and a single matching closing "====") to match Option 2's format.
Description
Documentation on FAISS vector store integration on ODH with Llama Stack.