forked from red-hat-data-services/agentic-starter-kits
sample_knowledge.txt
LangChain Framework
===================
LangChain is a framework for developing applications powered by large language models (LLMs). It provides a standard interface for chains, integrations with many external tools, and end-to-end chains for common applications.
Key Features:
- Modular components for working with LLMs
- Pre-built chains for common use cases
- Memory systems for maintaining context
- Integration with various data sources
- Support for agents and tools
Common Use Cases:
- Chatbots and conversational AI
- Question answering over documents
- Text summarization and generation
- Code analysis and generation
- Data extraction and transformation
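The idea of a standard interface for chains can be illustrated without the library. The sketch below is plain Python and is not LangChain's API: the Step class, its invoke method, and the fake_llm function are hypothetical names chosen for illustration, and fake_llm only stands in for a real model call.

```python
from typing import Any, Callable


class Step:
    """A unit of work with a uniform invoke() interface (hypothetical)."""

    def __init__(self, fn: Callable[[Any], Any]):
        self.fn = fn

    def invoke(self, value: Any) -> Any:
        return self.fn(value)

    def __or__(self, other: "Step") -> "Step":
        # Composing two steps yields a new step that runs them in order.
        return Step(lambda value: other.invoke(self.invoke(value)))


def fake_llm(prompt: str) -> str:
    # Stand-in for a model call; a real application would call an LLM here.
    return f"[model reply to: {prompt}]"


build_prompt = Step(lambda topic: f"Summarize {topic} in one sentence.")
call_model = Step(fake_llm)
clean_up = Step(str.strip)

# Because every step exposes the same interface, steps compose freely.
chain = build_prompt | call_model | clean_up
result = chain.invoke("vector databases")
print(result)
```

Because each step accepts one value and returns one value, any step can be swapped out (for example, a different prompt builder) without touching the rest of the chain.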
LangGraph Library
=================
LangGraph is a library for building stateful, multi-actor applications with LLMs. It extends LangChain Expression Language with the ability to coordinate multiple chains (or actors) across multiple steps of computation in a cyclic manner.
Key Concepts:
- State management for complex workflows
- Graph-based execution model
- Support for conditional edges and cycles
- Built-in persistence and checkpointing
- Streaming support for real-time updates
Use Cases:
- Multi-step reasoning workflows
- Agent-based systems
- Complex decision-making processes
- Interactive applications with branching logic
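A graph-based execution model with a conditional edge and a cycle can be sketched in a few lines. This is plain Python under stated assumptions, not LangGraph's API: the node names, the state keys (attempts, reviews, quality), and the quality threshold of 10 are all made up for illustration.

```python
from typing import Callable, Dict

State = Dict[str, int]
END = "END"


def draft(state: State) -> State:
    # Each node returns an updated copy of the shared state.
    return {**state, "attempts": state["attempts"] + 1, "quality": state["quality"] + 4}


def review(state: State) -> State:
    return {**state, "reviews": state["reviews"] + 1}


def route_after_review(state: State) -> str:
    # Conditional edge: loop back to "draft" until quality is high enough.
    return END if state["quality"] >= 10 else "draft"


NODES: Dict[str, Callable[[State], State]] = {"draft": draft, "review": review}
# Each edge is a function of the state that returns the next node name.
EDGES: Dict[str, Callable[[State], str]] = {
    "draft": lambda state: "review",
    "review": route_after_review,
}


def run_graph(start: str, state: State, max_steps: int = 20) -> State:
    node = start
    for _ in range(max_steps):  # guard against runaway cycles
        if node == END:
            return state
        state = NODES[node](state)
        node = EDGES[node](state)
    raise RuntimeError("graph did not finish within max_steps")


final_state = run_graph("draft", {"attempts": 0, "reviews": 0, "quality": 0})
print(final_state)
```

The draft and review nodes form a cycle, and the routing function decides after each review whether to repeat it or stop. The max_steps guard is a design choice to keep a faulty routing function from looping forever.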
Retrieval-Augmented Generation (RAG)
====================================
RAG is a technique that combines information retrieval with text generation. It retrieves relevant documents from a knowledge base and uses them to provide context for generating informed, accurate responses.
How RAG Works:
1. User submits a query
2. System retrieves relevant documents from a vector database
3. Retrieved documents provide context to the LLM
4. LLM generates a response based on the context and query
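The four steps above can be sketched end to end. This is a minimal illustration under simplifying assumptions: retrieval ranks documents by word overlap instead of querying a vector database, generate is a placeholder rather than a real LLM call, and every function name and the sample knowledge base are hypothetical.

```python
from typing import List

KNOWLEDGE_BASE = [
    "Milvus Lite runs in-process and needs no server.",
    "LangGraph supports conditional edges and cycles.",
    "Text splitters break documents into chunks.",
]


def tokenize(text: str) -> set:
    return {word.strip(".,?!").lower() for word in text.split()}


def retrieve(query: str, k: int = 2) -> List[str]:
    # Step 2: rank documents by word overlap with the query.
    # A real system would compare embeddings in a vector database instead.
    query_words = tokenize(query)
    ranked = sorted(
        KNOWLEDGE_BASE,
        key=lambda doc: len(query_words & tokenize(doc)),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(query: str, documents: List[str]) -> str:
    # Step 3: retrieved documents become context for the model.
    context = "\n".join(f"- {doc}" for doc in documents)
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"


def generate(prompt: str) -> str:
    # Step 4: placeholder for an LLM call (hypothetical).
    return f"(model output for a prompt of {len(prompt)} characters)"


query = "Does Milvus Lite need a server?"  # Step 1: the user query
documents = retrieve(query, k=1)
prompt = build_prompt(query, documents)
answer = generate(prompt)
print(prompt)
```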
Benefits:
- Reduces hallucinations by grounding responses in retrieved source material
- Enables LLMs to access up-to-date information
- Often more cost-effective than fine-tuning
- Makes the knowledge base easier to maintain and update
Components:
- Document loaders for ingesting data
- Text splitters for chunking documents
- Embedding models for creating vector representations
- Vector stores for efficient similarity search
- Retrieval algorithms for finding relevant documents
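Of the components above, the text splitter is the easiest to show in isolation. The sketch below splits on whitespace and counts words, which is an assumption made for simplicity; production splitters usually count tokens and respect sentence or paragraph boundaries. The split_text function and its parameters are hypothetical.

```python
from typing import List


def split_text(text: str, chunk_size: int = 5, overlap: int = 2) -> List[str]:
    """Split text into chunks of chunk_size words that share overlap words."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the last chunk already reaches the end of the text
    return chunks


chunks = split_text(
    "one two three four five six seven eight nine ten",
    chunk_size=5,
    overlap=2,
)
for chunk in chunks:
    print(chunk)
```

The overlap repeats the tail of each chunk at the head of the next, so a sentence that straddles a chunk boundary still appears whole in at least one chunk.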
Vector Databases
================
Vector databases are specialized databases optimized for storing and searching high-dimensional vectors (embeddings). They enable efficient similarity search, which is crucial for RAG applications.
Popular Vector Databases:
- Milvus: Open-source vector database with horizontal scaling
- Pinecone: Managed vector database service
- Weaviate: Open-source vector search engine
- Chroma: Embeddings database for LLM applications
- Qdrant: Vector similarity search engine
Key Features:
- Fast similarity search using approximate nearest neighbor (ANN) algorithms
- Support for different distance metrics (cosine, Euclidean, dot product)
- Filtering and metadata support
- Horizontal scalability
- Real-time indexing and updates
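The three distance metrics named above are short enough to write out directly. The sketch below also includes an exact brute-force top-k search for comparison; ANN indexes return approximately the same neighbors without scoring every stored vector. The function names and sample vectors are illustrative, not taken from any particular database.

```python
import math
from typing import List, Sequence


def dot_product(a: Sequence[float], b: Sequence[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


def euclidean_distance(a: Sequence[float], b: Sequence[float]) -> float:
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    # Dot product of the vectors divided by the product of their lengths.
    return dot_product(a, b) / (
        math.sqrt(dot_product(a, a)) * math.sqrt(dot_product(b, b))
    )


def top_k(query: Sequence[float], vectors: List[Sequence[float]], k: int) -> List[int]:
    # Exact search: score every vector, then keep the k most similar.
    scored = [(cosine_similarity(query, vector), index)
              for index, vector in enumerate(vectors)]
    scored.sort(reverse=True)
    return [index for _, index in scored[:k]]


vectors = [[1.0, 0.0], [0.0, 1.0], [3.0, 4.0]]
nearest = top_k([1.0, 0.0], vectors, k=2)
print(nearest)
```

Cosine similarity ignores vector length, so [3.0, 4.0] is scored only by its direction; Euclidean distance and dot product are both sensitive to magnitude.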
Milvus Vector Database
======================
Milvus is a high-performance, cloud-native vector database built for AI applications. Milvus Lite is a lightweight version designed for local development and testing.
Features:
- Supports multiple similarity metrics
- Hybrid search combining vector and scalar filtering
- Dynamic schema for flexible data modeling
- High availability and fault tolerance
- Integration with popular ML frameworks
Milvus Lite Benefits:
- No server required, runs in-process
- Well suited to development and testing
- Same API as full Milvus
- Easy migration to production Milvus
- Lightweight and fast
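Hybrid search, as listed under Features, combines a scalar filter with vector ranking. The sketch below shows the idea in plain Python and does not use the Milvus client: the records, the field names (lang, year), and the filtered_search function are hypothetical, and a real database would apply the filter inside its index rather than scanning a list.

```python
from typing import Callable, Dict, List, Sequence

records = [
    {"id": 1, "vector": [1.0, 0.0], "lang": "en", "year": 2023},
    {"id": 2, "vector": [0.9, 0.1], "lang": "de", "year": 2024},
    {"id": 3, "vector": [0.0, 1.0], "lang": "en", "year": 2024},
    {"id": 4, "vector": [0.8, 0.2], "lang": "en", "year": 2024},
]


def squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
    # Squared Euclidean distance; the square root is not needed for ranking.
    return sum((x - y) ** 2 for x, y in zip(a, b))


def filtered_search(
    query: Sequence[float],
    predicate: Callable[[Dict], bool],
    limit: int,
) -> List[int]:
    # 1. Scalar filtering: keep only records whose metadata matches.
    candidates = [record for record in records if predicate(record)]
    # 2. Vector ranking: order the survivors by distance to the query.
    candidates.sort(key=lambda record: squared_distance(query, record["vector"]))
    return [record["id"] for record in candidates[:limit]]


hits = filtered_search(
    [1.0, 0.0],
    lambda record: record["lang"] == "en" and record["year"] == 2024,
    limit=2,
)
print(hits)
```

Records 1 and 2 are closest to the query by vector alone, but the metadata filter excludes them, which is the point of combining the two kinds of condition.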
Embeddings and Semantic Search
==============================
Embeddings are dense vector representations of text that capture semantic meaning. Similar texts have similar embeddings, enabling semantic search.
Embedding Models:
- OpenAI text-embedding-3-small: Fast and efficient
- OpenAI text-embedding-3-large: Higher quality embeddings
- Sentence Transformers: Open-source sentence embeddings
- Cohere embeddings: Multilingual support
- Google Vertex AI embeddings: Enterprise-grade embeddings
Applications:
- Semantic search and information retrieval
- Document clustering and classification
- Duplicate detection
- Recommendation systems
- Anomaly detection
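Duplicate detection, one of the applications above, reduces to comparing embeddings against a similarity threshold. The sketch below uses word counts as a toy stand-in for an embedding model, which is a deliberate simplification: real embeddings are dense vectors that capture meaning, while word counts only capture shared vocabulary. The function names and the 0.8 threshold are assumptions.

```python
import math
from collections import Counter
from typing import List, Tuple


def embed(text: str) -> Counter:
    # Toy "embedding": a sparse vector of word counts.
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    shared = set(a) & set(b)
    numerator = sum(a[word] * b[word] for word in shared)
    denominator = (
        math.sqrt(sum(count * count for count in a.values()))
        * math.sqrt(sum(count * count for count in b.values()))
    )
    return numerator / denominator if denominator else 0.0


def find_duplicates(texts: List[str], threshold: float = 0.8) -> List[Tuple[int, int]]:
    # Compare every pair once and keep those at or above the threshold.
    vectors = [embed(text) for text in texts]
    pairs = []
    for i in range(len(texts)):
        for j in range(i + 1, len(texts)):
            if cosine(vectors[i], vectors[j]) >= threshold:
                pairs.append((i, j))
    return pairs


texts = [
    "reset your password from the login page",
    "reset your password from the login screen",
    "vector databases store embeddings",
]
duplicates = find_duplicates(texts)
print(duplicates)
```

Swapping embed for a call to a real embedding model would leave find_duplicates unchanged, apart from using a dense-vector cosine function.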
Agent Architectures
===================
Agents are LLM-powered systems that can use tools, make decisions, and take actions to accomplish tasks.
Common Agent Patterns:
- ReAct: Reasoning and Acting in an interleaved manner
- Plan-and-Execute: Planning before execution
- Reflexion: Learning from mistakes through self-reflection
- Tree of Thoughts: Exploring multiple reasoning paths
Agent Components:
- LLM as the reasoning engine
- Tools for interacting with external systems
- Memory for maintaining state
- Orchestration layer for managing execution flow
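The ReAct pattern and the four components above fit into a short loop. In this sketch the model is scripted so the example runs without any LLM; the "Action:", "Observation:", and "Final Answer:" line format, the multiply tool, and all function names are assumptions made for illustration.

```python
from typing import Callable, Dict, List


def multiply(arguments: str) -> str:
    left, right = (int(part) for part in arguments.split())
    return str(left * right)


# Tools: functions the agent may call by name.
TOOLS: Dict[str, Callable[[str], str]] = {"multiply": multiply}


def scripted_llm(history: List[str]) -> str:
    # Stand-in for the reasoning engine: act first, then answer once
    # an observation is available. A real agent would call an LLM here.
    observations = [line for line in history if line.startswith("Observation:")]
    if not observations:
        return "Action: multiply 6 7"
    result = observations[-1].split(": ", 1)[1]
    return f"Final Answer: {result}"


def run_agent(question: str, max_steps: int = 5) -> str:
    # Memory: the running transcript of actions and observations.
    history = [f"Question: {question}"]
    # Orchestration: alternate between model output and tool execution.
    for _ in range(max_steps):
        reply = scripted_llm(history)
        history.append(reply)
        if reply.startswith("Final Answer:"):
            return reply.split(": ", 1)[1]
        # Parse "Action: <tool> <arguments>" and run the named tool.
        _, tool_name, arguments = reply.split(" ", 2)
        observation = TOOLS[tool_name](arguments)
        history.append(f"Observation: {observation}")
    raise RuntimeError("agent did not finish within max_steps")


answer = run_agent("What is 6 times 7?")
print(answer)
```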
Best Practices for RAG Systems
==============================
1. Document Preparation:
- Clean and preprocess text
- Use appropriate chunk sizes (typically 500-1000 tokens)
- Include metadata for filtering
2. Embedding Strategy:
- Choose an appropriate embedding model for your use case
- Consider multilingual requirements
- Batch embeddings for efficiency
3. Retrieval Configuration:
- Set an appropriate number of retrieved documents (for example, k = 3 to 5)
- Use hybrid search when possible
- Implement re-ranking for better results
4. Prompt Engineering:
- Give clear instructions for using the retrieved context
- Handle cases where no relevant context is found
- Ask for citations and source attribution
5. Monitoring and Evaluation:
- Track retrieval quality metrics
- Monitor generation quality
- Log and analyze user feedback
- A/B test different configurations