RAG Pipeline with LangChain and Google Generative AI

This project demonstrates a Retrieval-Augmented Generation (RAG) pipeline built with LangChain, ChromaDB, and Google Generative AI. It scrapes course content from a website, splits the text into chunks, embeds the chunks, and answers questions using the chunks retrieved for each query.

Features

- Web scraping using LangChain Community's WebBaseLoader
- Document splitting with RecursiveCharacterTextSplitter
- Embedding generation via Google Generative AI
- Vector storage and retrieval using ChromaDB
- RAG pipeline for concise question answering
- Prompt debugging with a custom print function
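
The notebook's own print function is not reproduced in this README, so the following is only a guess at its shape: a pass-through helper that prints the fully rendered prompt and returns it unchanged. The name `print_prompt` is hypothetical.

```python
from typing import Any


def print_prompt(prompt: Any) -> Any:
    """Print the rendered prompt for inspection, then return it unchanged."""
    # LangChain prompt values expose to_string(); anything else falls back to str().
    text = prompt.to_string() if hasattr(prompt, "to_string") else str(prompt)
    print("----- prompt sent to the model -----")
    print(text)
    print("------------------------------------")
    # Returning the original object lets the next step receive it untouched.
    return prompt
```

Because the helper returns its input, it can sit between the prompt and the model in a LangChain pipe expression (for example `prompt | print_prompt | llm`), where LangChain wraps plain functions as runnables.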

Setup

Install dependencies:

pip install langchain_community langchainhub chromadb langchain langchain-google-genai langchain-openai

Set your Google API key:
- If using Google Colab, store your key in Colab's userdata.
- Otherwise, set GOOGLE_API_KEY in your environment.
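
Both options can be handled by one helper. This is a minimal sketch, assuming the Colab secret uses the same name as the environment variable; the function name `load_google_api_key` is hypothetical and not taken from the notebook.

```python
import os


def load_google_api_key(name: str = "GOOGLE_API_KEY") -> str:
    """Return the API key from Colab userdata if available, else the environment."""
    try:
        # google.colab is only importable inside a Colab runtime.
        from google.colab import userdata
    except ImportError:
        userdata = None

    key = None
    if userdata is not None:
        try:
            key = userdata.get(name)
        except Exception:
            # Secret missing or notebook access not granted: fall back to the environment.
            key = None

    key = key or os.environ.get(name)
    if not key:
        raise RuntimeError(f"{name} is not set")

    # Export the key so libraries that read the environment variable can find it.
    os.environ[name] = key
    return key
```

Calling `load_google_api_key()` once at the top of the notebook, before any model or embedding object is created, keeps the rest of the code identical in Colab and in a local environment.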

Usage

Open RAG.ipynb and run the cells sequentially:

- Scrape course data from https://www.educosys.com/course/genai
- Split and embed documents
- Store embeddings in ChromaDB
- Query the RAG pipeline with your questions
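
The steps above could be wired together roughly as follows. This is a sketch under stated assumptions, not the notebook's actual code: the chunk sizes and the prompt wording are illustrative, a local prompt template stands in for any prompt pulled from LangChain Hub, and the names `build_rag_chain` and `format_docs` are hypothetical. Model identifiers are left as required arguments because the README does not name them.

```python
from typing import Any, Iterable


def format_docs(docs: Iterable[Any]) -> str:
    """Join the text of retrieved chunks into one context string."""
    return "\n\n".join(doc.page_content for doc in docs)


def build_rag_chain(url: str, *, embedding_model: str, chat_model: str) -> Any:
    """Build a RAG chain over one web page (needs GOOGLE_API_KEY to be set)."""
    # Imports are deferred so this file loads even without the packages installed.
    from langchain_community.document_loaders import WebBaseLoader
    from langchain_community.vectorstores import Chroma
    from langchain_core.output_parsers import StrOutputParser
    from langchain_core.prompts import ChatPromptTemplate
    from langchain_core.runnables import RunnablePassthrough
    from langchain_google_genai import (
        ChatGoogleGenerativeAI,
        GoogleGenerativeAIEmbeddings,
    )
    from langchain_text_splitters import RecursiveCharacterTextSplitter

    # 1. Scrape the page into LangChain Document objects.
    docs = WebBaseLoader(url).load()

    # 2. Split into overlapping chunks (sizes are assumptions, not from the notebook).
    splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
    chunks = splitter.split_documents(docs)

    # 3. Embed the chunks and store them in a Chroma collection.
    embeddings = GoogleGenerativeAIEmbeddings(model=embedding_model)
    vectorstore = Chroma.from_documents(documents=chunks, embedding=embeddings)
    retriever = vectorstore.as_retriever()

    # 4. Assumed prompt wording that asks for a concise, context-grounded answer.
    prompt = ChatPromptTemplate.from_template(
        "Answer the question concisely using only the context below.\n\n"
        "Context:\n{context}\n\nQuestion: {question}"
    )
    llm = ChatGoogleGenerativeAI(model=chat_model)

    # 5. Retrieve context, fill the prompt, call the model, return plain text.
    return (
        {"context": retriever | format_docs, "question": RunnablePassthrough()}
        | prompt
        | llm
        | StrOutputParser()
    )
```

Pass the course URL together with whichever embedding and chat model identifiers your API key can access; the returned object accepts a question string through `invoke`.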

Example

rag_chain.invoke("Are there any free courses?")

Project Structure

RAG.ipynb: Main notebook containing all the code for scraping, embedding, and the RAG pipeline.

License

This project is for

AdityaGaur7/WebScrap-Rag
