A modular, memory-enhanced Retrieval-Augmented Generation (RAG) chatbot built using LangChain, Groq’s blazing-fast LLaMA3 API, Chroma vector store, and HuggingFace sentence embeddings.
It supports multi-turn conversations with context-aware question reformulation and session-based history tracking, making it well suited to educational assistants, AI agents, and document-based Q&A.
| Feature | Description |
|---|---|
| `ChatPromptTemplate` | Modular prompt templates for structured messaging |
| `StrOutputParser` | Clean extraction of string responses from raw LLM outputs |
| `RunnableWithMessageHistory` | Maintains chat state across multiple LLM calls |
| `Chroma.from_documents` | Creates a vector store from document chunks with embeddings |
| `as_retriever()` | Converts the vector store into a retriever for use in RAG |
| LCEL (pipe operator) | Expressive pipeline chaining of LangChain components |
| `create_history_aware_retriever` | Wraps a retriever so questions are reformulated using chat history |
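The LCEL pipe operator composes components left to right, with each step feeding its output to the next. A minimal pure-Python sketch of that composition pattern (the `Step` class is illustrative, not a LangChain API):

```python
class Step:
    """Minimal stand-in for a LangChain Runnable: wraps a function
    and overloads | so steps chain left-to-right, LCEL-style."""
    def __init__(self, fn):
        self.fn = fn

    def __or__(self, other):
        # (a | b) yields a new Step that runs a, then b.
        return Step(lambda x: other.fn(self.fn(x)))

    def invoke(self, x):
        return self.fn(x)

# Toy pipeline mirroring prompt | llm | StrOutputParser()
prompt = Step(lambda q: f"Question: {q}")
llm = Step(lambda p: {"content": p.upper()})      # fake "model" call
parser = Step(lambda out: out["content"])         # extract the string

chain = prompt | llm | parser
print(chain.invoke("what is RAG?"))  # QUESTION: WHAT IS RAG?
```

In the real project the pieces are `ChatPromptTemplate`, the Groq chat model, and `StrOutputParser`, but the chaining semantics are the same.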
- How to build a RAG chatbot with:
  - Custom vector store retrieval
  - Prompt engineering for Q&A
  - Chain- and agent-based memory integration
- How to maintain conversational context using `RunnableWithMessageHistory`
- How to scrape and chunk real web content for vector-based retrieval
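Under the hood, session-based memory boils down to a per-session message store that the chain reads before each call and appends to afterward. A stdlib-only sketch of that pattern (the `InMemoryHistory` class is illustrative; `get_session_history` mirrors the shape of the callable `RunnableWithMessageHistory` expects):

```python
from collections import defaultdict

class InMemoryHistory:
    """Holds the (role, text) messages for one session."""
    def __init__(self):
        self.messages = []

    def add(self, role, text):
        self.messages.append((role, text))

# One history object per session_id, created lazily on first use.
_store = defaultdict(InMemoryHistory)

def get_session_history(session_id: str) -> InMemoryHistory:
    return _store[session_id]

# Calls with the same session_id share context...
get_session_history("abc456").add("human", "What is Chain of Thought?")
get_session_history("abc456").add("ai", "A prompting technique...")
# ...while a different session starts empty.
print(len(get_session_history("abc456").messages))  # 2
print(len(get_session_history("other").messages))   # 0
```

This is why passing the same `session_id` in `config` lets follow-up questions resolve pronouns like "it" against earlier turns.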
- 🦜 LangChain: Core framework
- 🧠 Groq LLaMA3: High-performance LLM
- 🔍 ChromaDB: In-memory vector store
- 🧬 HuggingFace MiniLM Embeddings: Lightweight sentence embeddings for semantic retrieval
- 🌐 WebBaseLoader: Web scraping from blog/article sources
- 🧱 RecursiveCharacterTextSplitter: Smart document chunking
- 💬 Session-based Chat History: Rehydration of past queries
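Recursive chunking splits on the coarsest separator first (paragraphs, then lines, then words) and only falls back to finer ones when a piece is still too long, which is the core idea behind `RecursiveCharacterTextSplitter`. A simplified stdlib-only sketch (no chunk merging or overlap, unlike the real splitter):

```python
def recursive_split(text, max_len=100, seps=("\n\n", "\n", " ")):
    """Split text on the coarsest separator, recursing with finer
    separators on any piece still longer than max_len."""
    if len(text) <= max_len:
        return [text]
    if not seps:
        # No separators left: hard-cut into fixed-size slices.
        return [text[i:i + max_len] for i in range(0, len(text), max_len)]
    chunks = []
    for piece in text.split(seps[0]):
        if len(piece) <= max_len:
            chunks.append(piece)
        else:
            chunks.extend(recursive_split(piece, max_len, seps[1:]))
    return [c for c in chunks if c]

doc = "Paragraph one is short.\n\n" + "word " * 40
chunks = recursive_split(doc, max_len=80)
print(all(len(c) <= 80 for c in chunks))  # True
```

The real splitter additionally merges small pieces back up toward `chunk_size` and adds `chunk_overlap` so context isn't lost at chunk boundaries.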
```bash
git clone https://github.com/your-username/Conversational-QnA-Chatbot.git
cd Conversational-QnA-Chatbot
pip install -r requirements.txt
```

Create a `.env` file in the root directory with the following:

```
GROQ_API_KEY=your_groq_key
HF_TOKEN=your_huggingface_token
```

```bash
# Step 1: Run once to initialize & ingest docs
python rag_chatbot.py
```
Step 2: Start chatting with conversational memory! Invoke the chain with any session ID:

```python
conversational_rag_chain.invoke(
    {"input": "What is Chain of Thought?"},
    config={"configurable": {"session_id": "your-session-id"}}
)
```

A follow-up question in the same session reuses the stored history:

```python
# First question
conversational_rag_chain.invoke(
    {"input": "What is Chain of Thought?"},
    config={"configurable": {"session_id": "abc456"}}
)

# Follow-up question with context memory
conversational_rag_chain.invoke(
    {"input": "What are common ways of doing it?"},
    config={"configurable": {"session_id": "abc456"}}
)
```

Want to plug in your own documents or data sources?
Just change the source URL here:
```python
loader = WebBaseLoader(
    web_paths=("https://your-site.com/article",),
    ...
)
```

Or replace it with a PDF/text loader for local files.
Memory options:

- `RunnableWithMessageHistory` (current)
- `ConversationBufferMemory`
- `ZepMemory`, `ChromaMemory` (advanced)
MIT License. Feel free to fork, extend, or deploy in your projects.
