darunnatarajan/rag_chat_bot

📚 RAG Chat Bot

A simple Retrieval-Augmented Generation (RAG) chatbot built with LangChain, Hugging Face embeddings, and an LLM served by Groq. It indexes local .txt files and answers questions from their content, with an optional fallback to the LLM's own knowledge when nothing relevant is retrieved.


📦 Features

  • Loads .txt documents from a folder
  • Splits content into chunks
  • Generates vector embeddings using HuggingFace
  • Stores embeddings in memory
  • Uses Groq LLM for answering questions
  • Returns document-based answers (with optional fallback to LLM if needed)

🏗️ Project Structure

rag_chat_bot/
│
├── rag_demo.py           # Main script to run the RAG chatbot
├── documents/            # Folder containing your .txt files
│   ├── python_basics.txt
│   ├── machine_learning.txt
│   └── rag_technology.txt
├── .venv/                # Optional: Your virtual environment
└── README.md             # This file

⚙️ Setup Instructions

1. Clone the repository

git clone https://github.com/darunnatarajan/rag_chat_bot.git
cd rag_chat_bot

2. Create a virtual environment and activate it

python -m venv .venv
# Windows
.venv\Scripts\activate
# macOS/Linux
source .venv/bin/activate

3. Install dependencies

pip install -r requirements.txt

If you don't have a requirements.txt, you can use:

pip install langchain langchain-huggingface huggingface-hub groq

🔑 Environment Variables

Put your Groq API key in a .env file, or set it as an environment variable. On macOS/Linux:

export GROQ_API_KEY="your-groq-api-key"

Or on Windows (PowerShell):

$env:GROQ_API_KEY = "your-groq-api-key"
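
Inside a script, the key can be read from the environment and checked before any request is made. The following is a minimal sketch, assuming the variable name `GROQ_API_KEY` shown above; the helper `get_groq_api_key` is hypothetical and is not taken from `rag_demo.py`.

```python
import os


def get_groq_api_key(env=None):
    """Return the Groq API key, or raise a clear error if it is missing.

    `env` defaults to os.environ; passing a plain dict makes the helper
    easy to test. (Hypothetical helper, not part of rag_demo.py.)
    """
    env = os.environ if env is None else env
    key = env.get("GROQ_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "GROQ_API_KEY is not set. Export it in your shell or load it from a .env file."
        )
    return key
```

Note that Python does not read a `.env` file on its own; a loader such as python-dotenv is one common option, and whether this repository uses one is an assumption.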

🚀 Run the Chat Bot

python rag_demo.py

You will see:

==================================================
RAG System Ready! Ask your questions:
Available topics: Python, Machine Learning, RAG
==================================================

Example prompts:

  • What is Python?
  • Who created Python?
  • What does RAG stand for?

Type 'quit' to exit the chat.
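
The question loop described above could look roughly like the sketch below. It is an illustration only: `chat_loop`, `answer_fn`, `input_fn`, and `output_fn` are hypothetical names, and the actual loop in `rag_demo.py` may differ.

```python
def chat_loop(answer_fn, input_fn=input, output_fn=print):
    """Read questions until the user types 'quit', answering each one.

    answer_fn is any callable mapping a question string to an answer string.
    input_fn and output_fn are injectable so the loop can be tested without
    a terminal. Returns the number of questions answered.
    """
    answered = 0
    while True:
        question = input_fn("Question: ").strip()
        if question.lower() == "quit":
            break
        if not question:
            continue  # ignore empty input and prompt again
        output_fn(f"Answer: {answer_fn(question)}")
        answered += 1
    return answered
```

Calling `chat_loop(my_answer_function)` starts an interactive session in the terminal, where `my_answer_function` stands for whatever function produces the RAG answer.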


🧠 How It Works

  1. Document Loading: Loads .txt files from the documents/ folder.

  2. Chunking: Breaks each file into manageable pieces.

  3. Embedding: Converts text chunks into vectors using HuggingFace embeddings.

  4. Vector Store: Stores those vectors in memory for fast retrieval.

  5. Querying: When you ask a question, it:

    • Retrieves the most relevant chunks
    • Passes them to the Groq LLM to generate an answer
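
The steps above can be sketched with the standard library alone. This is a conceptual stand-in, not the repository's code: the real project uses LangChain, Hugging Face embeddings, and Groq, whereas here `embed` is a toy bag-of-words vector and every name (`chunk_text`, `InMemoryStore`, `build_prompt`, the default sizes) is an assumption made for illustration.

```python
import math
import re
from collections import Counter


def chunk_text(text, chunk_size=200, overlap=50):
    """Split text into overlapping character windows."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]


def embed(text):
    """Toy embedding: word counts (a stand-in for a real embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class InMemoryStore:
    """Keeps (chunk, vector) pairs in a list and searches them linearly."""

    def __init__(self):
        self.items = []

    def add(self, chunks):
        for chunk in chunks:
            self.items.append((chunk, embed(chunk)))

    def search(self, query, k=2):
        """Return up to k chunks with non-zero similarity, best first."""
        query_vec = embed(query)
        scored = [(cosine(query_vec, vec), chunk) for chunk, vec in self.items]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [chunk for score, chunk in scored[:k] if score > 0]


def build_prompt(question, chunks):
    """Assemble the text that would be sent to the LLM."""
    context = "\n---\n".join(chunks)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"
```

In use, each document's text goes through `chunk_text`, the chunks are passed to `store.add(...)`, and `build_prompt(question, store.search(question))` yields the prompt for the LLM. A real embedding model captures meaning rather than shared words, so retrieval quality in the actual bot should be considerably better than this sketch suggests.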

📁 Add Your Own Files

Place your .txt files into the documents/ folder. The bot will automatically load and index them on startup.
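
As a rough illustration of what loading on startup involves, the sketch below reads every .txt file in a folder using only the standard library. The function name `load_txt_documents` and the UTF-8 encoding are assumptions; the actual script may use a LangChain document loader instead.

```python
from pathlib import Path


def load_txt_documents(folder="documents"):
    """Return {filename: text} for every .txt file directly inside `folder`.

    Hypothetical helper for illustration; files are assumed to be UTF-8.
    """
    root = Path(folder)
    if not root.is_dir():
        raise FileNotFoundError(f"Document folder not found: {root}")
    docs = {}
    for path in sorted(root.glob("*.txt")):
        docs[path.name] = path.read_text(encoding="utf-8")
    return docs
```

Files with other extensions are skipped, and subfolders are not searched in this sketch.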


✅ Example Output

Question: What is Python?
Answer: Python is a high-level programming language known for its simplicity and readability.

📌 Notes

  • The LLM (Groq) may fall back to its own internal knowledge, but only when nothing is retrieved. This fallback can be disabled if you want document-only answers.
  • You can customize chunk size, embedding model, or LLM settings in the script.
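
One way to make the fallback switchable is sketched below. It is a hypothetical example: the `allow_fallback` flag, the `answer` function, and the `NO_ANSWER` message are illustrative names, and `llm_fn` stands for whatever callable sends a prompt to the Groq model in the real script.

```python
NO_ANSWER = "I could not find this in the provided documents."


def answer(question, retrieved_chunks, llm_fn, allow_fallback=False):
    """Answer from retrieved chunks; use the bare LLM only if allowed.

    llm_fn is any callable that takes a prompt string and returns text.
    """
    if retrieved_chunks:
        context = "\n\n".join(retrieved_chunks)
        prompt = (
            "Answer using only the context below.\n\n"
            f"Context:\n{context}\n\nQuestion: {question}"
        )
        return llm_fn(prompt)
    if allow_fallback:
        return llm_fn(question)  # no context: the model relies on its own knowledge
    return NO_ANSWER
```

With `allow_fallback=False`, a question that matches nothing in the documents gets the fixed `NO_ANSWER` reply and the LLM is never called.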

📄 License

MIT License.
