docs: add RAG Failure Mode Checklist #20721

Open
ManasVardhan wants to merge 3 commits into run-llama:main from ManasVardhan:rag-failure-mode-checklist

@ManasVardhan

Summary

Adds a comprehensive RAG Failure Mode Checklist documentation page to help users diagnose and fix common RAG pipeline issues.

Failure Modes Covered

  1. Retrieval Hallucination — retriever returns superficially relevant but wrong chunks
  2. Wrong Chunk Selection (Poor Chunking) — critical context split across chunks
  3. Index Fragmentation — duplicate/outdated/conflicting documents in index
  4. Config Drift — embedding model mismatch between index and query time
  5. Embedding Model Mismatch — wrong model for the domain
  6. Context Window Overflow — too many chunks stuffed into LLM prompt
  7. Missing Metadata Filtering — retrieval not scoped to relevant subset
  8. Poor Query Understanding — ambiguous or short queries
  9. LLM Synthesis Failures — right chunks retrieved but bad answer generated

Each section includes symptoms and minimal fixes referencing LlamaIndex components. Also includes a quick diagnostic flowchart.
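To make item 4 (Config Drift) concrete, here is a minimal sketch that is not taken from the PR's doc page. It assumes the index directory can hold a small JSON manifest recording which embedding model built it, so a query-time mismatch fails loudly instead of silently degrading retrieval. The names `save_index_manifest`, `check_query_config`, `manifest.json`, and the model labels are hypothetical and are not LlamaIndex APIs.

```python
import json
import tempfile
from pathlib import Path


class EmbeddingConfigMismatch(RuntimeError):
    """Raised when the query-time embedding config differs from the index-time one."""


def save_index_manifest(index_dir: Path, embed_model_name: str, dimensions: int) -> None:
    # Record the embedding settings next to the index at build time (hypothetical layout).
    manifest = {"embed_model_name": embed_model_name, "dimensions": dimensions}
    (index_dir / "manifest.json").write_text(json.dumps(manifest))


def check_query_config(index_dir: Path, embed_model_name: str, dimensions: int) -> None:
    # Compare the settings about to be used for queries with the recorded ones.
    manifest = json.loads((index_dir / "manifest.json").read_text())
    current = {"embed_model_name": embed_model_name, "dimensions": dimensions}
    if manifest != current:
        raise EmbeddingConfigMismatch(
            f"index built with {manifest}, but query uses {current}"
        )


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        index_dir = Path(tmp)
        save_index_manifest(index_dir, "model-a", 384)  # placeholder model label
        check_query_config(index_dir, "model-a", 384)   # same config: passes
        try:
            check_query_config(index_dir, "model-b", 384)
        except EmbeddingConfigMismatch as exc:
            print(f"drift detected: {exc}")
```

Running the check before every query session is cheap, which is why a manifest comparison is used here rather than re-embedding a probe document.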

Closes #20702

@dosubot (bot) added the size:L label ("This PR changes 100-499 lines, ignoring generated files.") on Feb 17, 2026

@dgenio left a comment

Nice doc! The failure mode structure (What happens / Symptoms / Fixes) is clear and practical. Two things to fix before merge:

  • All 3 "Further Reading" links use the wrong prefix — docs instead of python (the Astro site's base). Every other doc in optimizing/ uses /python/.... These will 404 on the live site.
  • The Evaluation link also points to a directory without an index file.

See inline comments for the specific fix. Once the links are corrected, this is ready to go.

Comment on lines 161 to 163
- [Building Performant RAG Applications for Production](/docs/framework/optimizing/production_rag/)
- [Building RAG from Scratch](/docs/framework/optimizing/building_rag_from_scratch/)
- [Evaluation Guide](/docs/framework/optimizing/evaluation/)

Issue: All three links use docs prefix but the site base is python. Additionally, the Evaluation link targets a directory with no index file.

Impact: Every link in this section will 404 on the live docs site.

Suggestion: Replace with:

  • (/python/framework/optimizing/production_rag/)
  • (/python/framework/optimizing/building_rag_from_scratch/)
  • (/python/framework/optimizing/evaluation/evaluation/)

You can confirm the convention by checking any sibling doc (e.g., production_rag.md, basic_strategies.md) — all use /python/....
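The prefix problem described above can be caught mechanically. The following is a minimal sketch, assuming site-internal links are written as Markdown inline links whose target starts with `/`; the function name `find_bad_internal_links` is hypothetical, and the check does not detect the separate "directory without an index file" issue.

```python
import re

# Captures the target of a Markdown inline link: [text](target)
LINK_TARGET = re.compile(r"\[[^\]]*\]\(([^)\s]+)\)")


def find_bad_internal_links(markdown: str, base: str = "/python/") -> list[str]:
    """Return site-internal link targets that do not start with the expected base."""
    bad = []
    for target in LINK_TARGET.findall(markdown):
        # Only absolute site paths are checked; external URLs are left alone.
        if target.startswith("/") and not target.startswith(base):
            bad.append(target)
    return bad


if __name__ == "__main__":
    sample = (
        "- [Building RAG from Scratch](/docs/framework/optimizing/building_rag_from_scratch/)\n"
        "- [Evaluation Guide](/python/framework/optimizing/evaluation/evaluation/)\n"
    )
    for target in find_bad_internal_links(sample):
        print(f"wrong prefix: {target}")
```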

Non-blocking: The doc references ~15 LlamaIndex classes/APIs by name (e.g., CohereRerank, SentenceTransformerRerank, HyDEQueryTransform, MetadataFilters) without linking to their module guide pages. Adding hyperlinks for a few key ones would make this more actionable for users — but fine to skip or tackle in a follow-up.

Signed-off-by: Manas Vardhan <manasvardhan@gmail.com>

@dgenio left a comment

Looks good! The blocker (broken internal links) from the earlier pass is fixed. Content is clean and well-structured — nice work.

One non-blocking nit below re: legacy API URLs. Approve as-is.

Comment on lines 167 to 175
- [`SentenceSplitter`](https://docs.llamaindex.ai/en/stable/api_reference/node_parsers/sentence_splitter/)
- [`HierarchicalNodeParser`](https://docs.llamaindex.ai/en/stable/api_reference/node_parsers/hierarchical/)
- [`SentenceWindowNodeParser`](https://docs.llamaindex.ai/en/stable/api_reference/node_parsers/sentence_window/)
- [`CohereRerank`](https://docs.llamaindex.ai/en/stable/api_reference/postprocessor/cohere_rerank/)
- [`SentenceTransformerRerank`](https://docs.llamaindex.ai/en/stable/api_reference/postprocessor/sentence_transformer_rerank/)
- [`HyDEQueryTransform`](https://docs.llamaindex.ai/en/stable/api_reference/query/query_transform/hyde/)
- [`SubQuestionQueryEngine`](https://docs.llamaindex.ai/en/stable/api_reference/query/sub_question/)
- [`IngestionPipeline`](https://docs.llamaindex.ai/en/stable/api_reference/ingestion/pipeline/)
- [`MetadataFilters`](https://docs.llamaindex.ai/en/stable/api_reference/retrievers/vector_store/)

Non-blocking: The 9 class reference links use the old domain (docs.llamaindex.ai/en/stable/api_reference/...). These currently redirect to developers.llamaindex.ai/python/framework-api-reference/..., so they work, but they'd break if the redirect is ever dropped.

Consider updating the base URL to https://developers.llamaindex.ai/python/framework-api-reference/... for consistency and durability. E.g.:

  • https://developers.llamaindex.ai/python/framework-api-reference/node_parsers/sentence_splitter/
  • https://developers.llamaindex.ai/python/framework-api-reference/postprocessor/cohere_rerank/
  • etc.

Fine to handle in a follow-up.
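The base-URL swap suggested above could be scripted roughly as follows. This is a sketch only: it assumes the path after the old base carries over unchanged to the new one, which the two example URLs in this comment suggest but which is not confirmed here for all nine links, so each rewritten URL should still be opened once. The helper name `rewrite_api_link` is hypothetical.

```python
OLD_BASE = "https://docs.llamaindex.ai/en/stable/api_reference/"
NEW_BASE = "https://developers.llamaindex.ai/python/framework-api-reference/"


def rewrite_api_link(url: str) -> str:
    """Swap the legacy API-reference base for the new one; leave other URLs untouched."""
    if url.startswith(OLD_BASE):
        # Assumption: the path suffix is identical on the new domain.
        return NEW_BASE + url[len(OLD_BASE):]
    return url


if __name__ == "__main__":
    legacy = "https://docs.llamaindex.ai/en/stable/api_reference/node_parsers/sentence_splitter/"
    print(rewrite_api_link(legacy))
```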

Member

I would say this is indeed blocking: I would prefer that all the links point to developers.llamaindex.ai.

Member

@AstraBert left a comment

Thanks for putting this together! Just some minor comments!


**Fixes:**
- Try a **domain-adapted embedding model** (e.g., fine-tuned models for legal, medical, or code)
- Use LlamaIndex's `SentenceTransformersFinetuneEngine` to fine-tune embeddings on your own data
Member

Fine-tuning is definitely under-maintained in our framework, so I would avoid mentioning it.

- Answers are overly generic despite specific context being available

**Fixes:**
- Use a stronger LLM for synthesis (e.g., GPT-4 over GPT-3.5) or increase temperature slightly
Member

I would probably use slightly more recent models than GPT-4 and GPT-3.5 in this example.

…etuning mention

- Update all API reference links from docs.llamaindex.ai to developers.llamaindex.ai
- Update model examples from GPT-4/GPT-3.5 to GPT-4o/GPT-4o-mini
- Remove SentenceTransformersFinetuneEngine mention per maintainer feedback

Signed-off-by: Manas Vardhan <manasvardhan@gmail.com>
@ManasVardhan
Author

Addressed all feedback:

  • Updated API reference links to developers.llamaindex.ai
  • Removed SentenceTransformersFinetuneEngine mention
  • Updated model examples to GPT-4o/GPT-4o-mini

Thanks for the thorough reviews @dgenio @AstraBert!

Development

Successfully merging this pull request may close these issues.

[Feat]: Add a RAG failure mode checklist doc (symptoms to minimal fixes)
