
VarDaan.ai is a chatbot interface that turns online content, such as blogs, articles, or YouTube videos, into an interactive chatbot. Users pass the URL of the content they are interested in, and the app processes it to create a chatbot that answers queries based on the material, so they can explore it without reading or watching the full content.


amMistic/VarDaan.ai


VarDaan.ai 🤖

VarDaan.ai is an AI-powered platform that turns blogs, articles, YouTube videos, and PDF documents into a chatbot that users can interact with. After you provide a URL or upload a file, you can ask natural-language questions about the content and receive answers grounded in that content.

Features

  • Web Content as Chatbots: Turn any web page, blog, or article into an interactive chatbot.
  • YouTube Video Chatbot: Input a YouTube video URL and ask questions based on video content.
  • PDF Support: Upload PDF files and generate an AI chatbot to query the content.
  • Conversational AI: Natural language processing models answer each question using the context of the supplied content.
  • Simple User Interface: Easy-to-use web interface that allows users to engage with different types of content seamlessly.

Installation

Prerequisites

Make sure you have the following installed:

  • Python 3.8+
  • pip (Python package installer)
  • Git

Steps to Install and Run VarDaan.ai Locally

  1. Clone the Repository
    Open your terminal and run the following command to clone the repository:

    git clone https://github.com/amMistic/vardaan.ai.git
  2. Navigate to the Project Directory
    Move into the project folder:

    cd vardaan.ai
  3. Install Dependencies
    Install the required Python packages by running:

    pip install -r requirements.txt
  4. Set Up Environment Variables
    Create a .env file in the root directory and add the necessary environment variables (like API keys). Example:

    PINECONE_API_KEY=<your_pinecone_api_key>
    HUGGINGFACE_API_TOKEN=<your_api_token>
  5. Run the Application
    Start the VarDaan.ai application using Streamlit:

    streamlit run app.py
  6. Access the App
    Open your web browser and navigate to the local server link provided by Streamlit (usually http://localhost:8501).
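
Before starting the app, it can help to confirm that the keys from step 4 are actually visible to Python. The following is a minimal sketch, not part of the repository: it assumes the variables have already been loaded into the process environment (for example by a dotenv loader or by exporting them in the shell; the README does not say which mechanism the app uses), and the helper name `missing_env_vars` is hypothetical.

```python
import os

# Variable names taken from the example .env in step 4; adjust to your setup.
REQUIRED_VARS = ("PINECONE_API_KEY", "HUGGINGFACE_API_TOKEN")


def missing_env_vars(required=REQUIRED_VARS, env=None):
    """Return the names in `required` that are unset or blank in `env`.

    `env` defaults to the current process environment (os.environ).
    """
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name, "").strip()]


if __name__ == "__main__":
    missing = missing_env_vars()
    if missing:
        print("Missing environment variables:", ", ".join(missing))
    else:
        print("All required environment variables are set.")
```

Running this script in the same shell you use for `streamlit run app.py` shows whether a key is missing or blank before the app tries to use it.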

Usage

  1. Web Interface: Once the app is running, you will see a simple input field for URLs or file uploads (PDFs).
  2. Enter Content:
    • For blogs/articles: Enter the URL of the blog or article.
    • For YouTube videos: Enter the YouTube video URL.
    • For PDFs: Upload the PDF document directly into the app.
  3. Interactive Chat: After processing the content, you can ask questions in the chat interface, and VarDaan.ai will respond based on the content provided.
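
The three input types above need different handling, so the app has to decide which kind of source it received. The sketch below shows one way such routing could look using only the standard library. It is an illustration under assumptions, not the project's actual code: the function names `classify_source` and `youtube_video_id` and the list of YouTube hosts are hypothetical.

```python
from urllib.parse import urlparse, parse_qs

# Assumed set of host names treated as YouTube links.
YOUTUBE_HOSTS = {"youtube.com", "www.youtube.com", "m.youtube.com", "youtu.be"}


def classify_source(source):
    """Return 'youtube', 'pdf', or 'web' for a URL or an uploaded file name."""
    parsed = urlparse(source)
    host = (parsed.hostname or "").lower()
    if host in YOUTUBE_HOSTS:
        return "youtube"
    if parsed.path.lower().endswith(".pdf"):
        return "pdf"
    if parsed.scheme in ("http", "https"):
        return "web"
    raise ValueError(f"Unsupported source: {source!r}")


def youtube_video_id(url):
    """Extract the video ID from a YouTube URL, or return None if absent."""
    parsed = urlparse(url)
    if (parsed.hostname or "").lower() == "youtu.be":
        # Short links carry the ID in the path: https://youtu.be/<id>
        return parsed.path.lstrip("/") or None
    # Standard links carry it in the query string: ...watch?v=<id>
    ids = parse_qs(parsed.query).get("v")
    return ids[0] if ids else None
```

For example, `classify_source("https://example.com/blog-post")` yields `'web'`, while an uploaded file named `report.pdf` yields `'pdf'`. The video ID matters because transcript lookups are typically keyed by ID rather than by full URL.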

Example Commands (Future Task)

  • Blog/Article: vardaan.ai.http://example.com/blog-post
  • YouTube Video: vardaan.ai.youtube.com/watch?v=example-video
  • PDF: Drag and drop a PDF document into the interface.

Project Structure

vardaan-ai/
│
├── app.py                   # Main application entry point
├── src/                     # Source code
│   ├── Online_src/           # Web content and YouTube processing
│   ├── Offline_src/          # PDF handling
│   ├── Handle_user.py        # Handles user queries and responses
│   └── embedding_model.py    # Embedding logic for vector storage
├── vecDatabase/             # Stores vectorized representations of content
├── requirements.txt         # List of Python dependencies
└── README.md                # Project documentation

How It Works

  1. Extracting Content:

    • Web Content: VarDaan.ai fetches and processes text from web pages or articles using web scraping methods.
    • YouTube: The app uses YouTube's transcript API to extract the spoken text from videos.
    • PDF: For PDF files, VarDaan.ai extracts the textual content and splits it into manageable chunks.
  2. Processing & Storage:

    • The content is split into smaller text chunks.
    • Each chunk is embedded using a pre-trained NLP model, converting the text into a vector format.
    • These vectors are stored in a vector database (Chroma) for efficient querying.
  3. Conversational Queries:

    • When the user asks a question, VarDaan.ai retrieves the chunks most relevant to it from the vector store.
    • A language model then uses the retrieved chunks as context to generate the response.
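
The split, embed, store, and retrieve steps above can be sketched in plain Python. This is a toy illustration under stated assumptions, not the project's implementation: a word-count vector stands in for the pre-trained embedding model, a small in-memory class stands in for Chroma, and every name here (`split_into_chunks`, `embed`, `InMemoryVectorStore`, `build_prompt`) is hypothetical.

```python
import math
import re
from collections import Counter


def split_into_chunks(text, chunk_size=40, overlap=10):
    """Split text into windows of `chunk_size` words sharing `overlap` words."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be >= 0 and smaller than chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the last window already reached the end of the text
    return chunks


def embed(text):
    """Toy embedding: sparse word counts (stand-in for a real model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[word] for word, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class InMemoryVectorStore:
    """Minimal stand-in for a vector database: a list of (chunk, vector)."""

    def __init__(self):
        self._entries = []

    def add(self, chunks):
        for chunk in chunks:
            self._entries.append((chunk, embed(chunk)))

    def search(self, question, k=2):
        """Return the k stored chunks most similar to the question."""
        query = embed(question)
        ranked = sorted(
            self._entries,
            key=lambda entry: cosine_similarity(query, entry[1]),
            reverse=True,
        )
        return [chunk for chunk, _ in ranked[:k]]


def build_prompt(question, context_chunks):
    """Combine retrieved chunks and the question into one model prompt."""
    context = "\n\n".join(context_chunks)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"
```

Overlapping the chunks keeps a sentence that straddles a boundary retrievable from at least one side. In a real setup the word-count vectors would be replaced by dense embeddings from a pre-trained model, and the prompt would be sent to a language model to produce the final answer.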

Tech Stack

  • Python: Core programming language
  • Streamlit: Web interface framework
  • LangChain: Used for managing document chains and information retrieval
  • Chroma: For vector storage and search capabilities
  • Hugging Face Models: Provides the embeddings for text representation
  • YouTube Transcript API: Used to fetch video transcripts
  • pdfplumber: For handling PDF text extraction

Contributing

I welcome contributions from the community! To get started:

  1. Fork the repository.
  2. Create a new branch:
    git checkout -b feature/your-feature-name
  3. Make your changes and commit them:
    git commit -m "Add a new feature"
  4. Push to the branch:
    git push origin feature/your-feature-name
  5. Open a pull request on GitHub from your branch.
