RAG-Powered Website Chatbot

A cost-optimized chatbot that ingests a website URL, recursively crawls same-domain pages, builds a local FAISS vector index, and answers questions using Retrieval-Augmented Generation.

The project uses hosted services only where they matter most:

Amazon Titan Embeddings for converting website chunks into vectors
Groq LLaMA (llama-3.3-70b-versatile) for grounded answer generation
DynamoDB for optional conversation session memory

It deliberately avoids managed Bedrock knowledge bases, object-storage ingestion, and managed search infrastructure to prevent idle infrastructure cost.

Demo UI Screenshot

The interface features a real-time 3D knowledge graph showing crawled pages as interconnected nodes, live confidence scoring, and per-namespace chat isolation.

Architecture

Crawler -> Titan Embeddings (Bedrock) -> FAISS -> Groq LLaMA -> Answer

Website URL
    |
    v
Recursive crawler
    |
    v
Clean text + chunk content
    |
    v
Titan Embeddings via Bedrock
    |
    v
Local FAISS index
  faiss_index/{namespace}/index.faiss
  faiss_index/{namespace}/index.pkl
    |
User question
    |
    v
Titan query embedding -> FAISS search -> Groq LLaMA -> Answer
    |
    v
Answer + confidence + source citations
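
The "Clean text + chunk content" step is not described in more detail here. As a minimal sketch, assuming fixed-size character windows with a small overlap, chunking might look like the following. The function name `chunk_text` and the sizes 800 and 100 are illustrative assumptions, not values taken from the project.

```python
def chunk_text(text: str, chunk_size: int = 800, overlap: int = 100) -> list[str]:
    """Split text into overlapping character windows (hypothetical helper)."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    cleaned = " ".join(text.split())  # collapse every whitespace run to one space
    step = chunk_size - overlap       # how far each window advances
    chunks = []
    for start in range(0, len(cleaned), step):
        chunks.append(cleaned[start:start + chunk_size])
        if start + chunk_size >= len(cleaned):
            break  # this window already reached the end of the text
    return chunks
```

The overlap means a sentence that straddles a window boundary still appears whole in at least one chunk, which tends to help retrieval at the cost of a few extra embedded tokens.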

Features

Feature	Details
Recursive crawler	Follows same-domain links up to configurable depth and page limits
Local vector store	Uses FAISS files on disk instead of managed search infrastructure
Bedrock embeddings	Uses `amazon.titan-embed-text-v2:0`
Groq generation	Uses Groq LLaMA (`llama-3.3-70b-versatile`)
Source citations	Returns source URLs and relevant chunks with each answer
Confidence scoring	Labels answers as HIGH, MEDIUM, LOW, or FALLBACK
Session memory	Stores multi-turn chat history in DynamoDB
Multi-site support	Namespaced FAISS indexes allow multiple sites to be ingested simultaneously
SSE streaming	`POST /chat/stream` streams answer tokens in real time for a ChatGPT-like experience
Safer crawling	Blocks private and loopback IP ranges to reduce SSRF risk
3D Knowledge Graph	React Three Fiber visualisation showing crawled pages as nodes; cited pages glow after each answer
Prompt injection detection	Blocks role override attempts, system prompt extraction, and jailbreak phrases at the API layer
FAISS poisoning prevention	Validates every chunk before embedding - rejects injected instructions and adversarial content
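
The depth and page limits from the recursive crawler row can be illustrated with a small traversal sketch. It uses an iterative breadth-first walk rather than literal recursion, and `get_links` is a hypothetical callable standing in for "fetch the page with httpx and extract its links"; neither the function names nor the traversal order are confirmed by the project.

```python
from collections import deque
from typing import Callable, Iterable
from urllib.parse import urlparse


def crawl_order(
    start_url: str,
    get_links: Callable[[str], Iterable[str]],
    max_pages: int = 15,
    max_depth: int = 2,
) -> list[str]:
    """Return the pages a breadth-first, same-domain crawl would visit."""
    host = urlparse(start_url).netloc
    seen = {start_url}
    queue = deque([(start_url, 0)])
    visited = []
    while queue and len(visited) < max_pages:
        url, depth = queue.popleft()
        visited.append(url)
        if depth == max_depth:
            continue  # do not expand links beyond the depth limit
        for link in get_links(url):
            # Keep only unseen links on the same host as the start URL.
            if urlparse(link).netloc == host and link not in seen:
                seen.add(link)
                queue.append((link, depth + 1))
    return visited
```

Breadth-first order means that when `max_pages` cuts the crawl short, the pages closest to the start URL are the ones that get indexed.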

Tech Stack

Layer	Technology
API	FastAPI + Uvicorn
Crawler	httpx
Embeddings	Amazon Titan Embeddings via Bedrock
LLM	Groq LLaMA (`llama-3.3-70b-versatile`)
Vector search	FAISS local index
Session memory	DynamoDB
Frontend	React 19 + Vite + React Three Fiber + Tailwind CSS
Security	Custom prompt injection detector + chunk content validator
Testing	pytest (35 tests)

Cost Optimization

Component	Service	Demo Cost
Crawling	Local Python/httpx	$0
Embeddings	Titan Embeddings via Bedrock	~$0.00002 per 1K tokens
Vector store	Local FAISS files	$0
Generation	Groq LLaMA	$0
Session memory	DynamoDB free tier/pay-per-request	~$0 for demo usage

The crawler output is embedded immediately and saved to local FAISS files, so vector storage is free, and generation runs on Groq at no cost in the demo. Bedrock is used only for embeddings, which still shows practical AWS integration.
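
To put the embedding price in perspective, here is a rough worked estimate using the Titan figure from the table above. The 500 tokens per chunk is an assumed average for illustration only; real chunk sizes depend on the chunker.

```python
PRICE_PER_1K_TOKENS_USD = 0.00002  # Titan Embeddings figure from the cost table


def embedding_cost_usd(num_chunks: int, tokens_per_chunk: int) -> float:
    """Estimate the one-off cost of embedding a crawl."""
    total_tokens = num_chunks * tokens_per_chunk
    return total_tokens / 1000 * PRICE_PER_1K_TOKENS_USD


# 48 chunks of roughly 500 tokens each (500 is an assumed average).
print(f"${embedding_cost_usd(48, 500):.5f}")  # prints $0.00048
```

Each chat question adds one more short embedding call for the query, so the per-question embedding cost is smaller still.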

Security

Feature	Details
Prompt injection detection	Detects role override attempts (`ignore previous instructions`, `act as`, `jailbreak`), system prompt extraction (`reveal your prompt`, `show your system prompt`), and known jailbreak phrases (`DAN mode`, `developer mode enabled`) before the message reaches the LLM
FAISS index poisoning prevention	Validates every chunk before embedding - rejects injected instructions, excessive repetition, low information density, high special character ratio, and known jailbreak phrases
SSRF protection	Crawler blocks private IP ranges (10.x.x.x, 192.168.x.x, 127.x.x.x) to prevent server-side request forgery
Message sanitization	Strips null bytes, collapses whitespace, removes non-printable characters, truncates to 4000 characters

Prompt injection attempts return HTTP 400 with a structured error response. The frontend displays a red security alert bubble instead of passing the message to the LLM.
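
The SSRF protection row above can be sketched with the standard library. This is an illustration under assumptions, not the project's crawler code: the function names are hypothetical, and the sketch relies on `ipaddress` flags, which cover more ranges than the three listed in the table (for example 172.16.0.0/12 and link-local addresses).

```python
import ipaddress
import socket
from urllib.parse import urlparse


def is_blocked_address(ip_text: str) -> bool:
    """True for private, loopback, link-local, and reserved addresses."""
    ip = ipaddress.ip_address(ip_text)
    return ip.is_private or ip.is_loopback or ip.is_link_local or ip.is_reserved


def is_safe_url(url: str) -> bool:
    """Resolve the host and reject the URL if any resolved address is blocked."""
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https") or not parsed.hostname:
        return False
    try:
        infos = socket.getaddrinfo(parsed.hostname, None)
    except socket.gaierror:
        return False  # unresolvable hosts are treated as unsafe
    # Each entry's sockaddr starts with the resolved IP address string.
    return all(not is_blocked_address(info[4][0]) for info in infos)
```

Checking the resolved addresses rather than the hostname string catches names that point at internal IPs. It does not by itself defend against DNS rebinding, where the name resolves differently between the check and the actual request.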

Prerequisites

Python 3.11+
AWS credentials with Bedrock Runtime access
Bedrock model access enabled for amazon.titan-embed-text-v2:0
Groq API key for llama-3.3-70b-versatile
Node.js 18+ (for the frontend)
Optional: DynamoDB permission if you want persistent session memory

Backend Setup

git clone https://github.com/labeebkm/cost-optimized-rag-website-chatbot
cd cost-optimized-rag-website-chatbot
pip install -r requirements.txt
cp .env.example .env

Edit .env with your AWS credentials and Groq settings:

GROQ_API_KEY=your-groq-api-key
GROQ_MODEL_ID=llama-3.3-70b-versatile
AWS_ACCESS_KEY_ID=your-access-key-id
AWS_SECRET_ACCESS_KEY=your-secret-access-key
AWS_REGION=us-east-1

Optional: create the DynamoDB session table:

python scripts/setup_aws.py

Frontend Setup

The 3D chat UI is a React/Vite app located in the frontend/ folder.

cd frontend
npm install

Run (development)

npm run dev

Open http://localhost:5173 in your browser.

Build (production)

npm run build

Run

Start the backend:

uvicorn app.main:app --reload --port 8080

Open the API docs:

http://localhost:8080/docs

API Usage

Ingest a Website

POST /ingest
Content-Type: application/json

{
  "url": "https://docs.python.org/3/",
  "max_pages": 15,
  "max_depth": 2
}

Example response:

{
  "job_id": "ingest-1715356800",
  "status": "complete",
  "message": "Crawled 12 pages from https://docs.python.org/3/. Indexed 48 chunks into namespace 'docs.python.org'.",
  "url": "https://docs.python.org/3/",
  "namespace": "docs.python.org",
  "pages_crawled": 12,
  "chunks_indexed": 48,
  "index_path": "faiss_index/docs.python.org/index.faiss"
}
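
In the example above, the namespace equals the hostname of the ingested URL and the index path follows the `faiss_index/{namespace}/index.faiss` layout. Assuming that is the general rule (the README does not state it explicitly), the mapping can be sketched as follows; both function names are hypothetical.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse


def namespace_for(url: str) -> str:
    """Assumed rule: the namespace is the URL's lowercased hostname."""
    host = urlparse(url).hostname  # urlparse lowercases the hostname
    if not host:
        raise ValueError(f"no hostname in URL: {url!r}")
    return host


def index_path_for(url: str) -> str:
    """Build the faiss_index/{namespace}/index.faiss path for a URL."""
    return str(PurePosixPath("faiss_index") / namespace_for(url) / "index.faiss")
```

Under this assumption, two URLs on the same host share one index, while different hosts never collide.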

Chat

POST /chat
Content-Type: application/json

{
  "message": "What is a Python decorator?",
  "namespace": "docs.python.org",
  "session_id": null
}

Example response:

{
  "response": "A decorator is ...",
  "session_id": "sess_a1b2c3d4e5f6",
  "confidence": 0.87,
  "confidence_label": "HIGH",
  "source_chunks_used": 4,
  "sources": [
    {
      "text": "Relevant website chunk...",
      "score": 0.91,
      "source_url": "https://docs.python.org/3/..."
    }
  ],
  "fallback_used": false
}
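
A minimal client sketch for this endpoint, using only the standard library. The base URL `http://localhost:8080` matches the backend command shown earlier; the helper names are illustrative. Passing the returned `session_id` back on the next call is assumed to be how a conversation is continued, based on the request and response shapes above.

```python
import json
import urllib.request


def build_chat_request(base_url, message, namespace, session_id=None):
    """Build (but do not send) a POST /chat request."""
    payload = {"message": message, "namespace": namespace, "session_id": session_id}
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/chat",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request):
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=60) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

With the backend running, a two-turn exchange might be `first = send(build_chat_request(base, question, ns))` followed by `send(build_chat_request(base, follow_up, ns, first["session_id"]))`.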

Streaming Chat

POST /chat/stream
Content-Type: application/json

{
  "message": "What is a Python decorator?",
  "namespace": "docs.python.org",
  "session_id": null
}

Returns Server-Sent Events (SSE). The first event carries metadata, the following events stream answer tokens, and the final event signals completion:

data: {"type": "meta", "session_id": "sess_abc123", "confidence": 0.87, "confidence_label": "HIGH", "sources": [...]}

data: {"type": "token", "text": "A "}
data: {"type": "token", "text": "decorator "}
data: {"type": "token", "text": "is ..."}

data: {"type": "done"}
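
A client can reassemble the answer from these events by parsing each `data:` line as JSON. The sketch below assumes every event fits on a single `data:` line, as in the example; it is not a full SSE parser, and the function names are illustrative.

```python
import json
from typing import Iterable, Iterator


def iter_sse_events(lines: Iterable[str]) -> Iterator[dict]:
    """Yield the JSON payload of each `data:` line, stopping after "done"."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators and non-data fields
        event = json.loads(line[len("data:"):].strip())
        yield event
        if event.get("type") == "done":
            return


def collect_answer(lines: Iterable[str]) -> tuple[dict, str]:
    """Return (metadata, full answer text) from a stream of SSE lines."""
    meta, parts = {}, []
    for event in iter_sse_events(lines):
        if event["type"] == "meta":
            meta = event
        elif event["type"] == "token":
            parts.append(event["text"])
    return meta, "".join(parts)
```

Any iterable of text lines works as input, so the same functions can be fed from a streaming HTTP response or from a recorded transcript in a test.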

Prompt Injection Blocked (HTTP 400)

{
  "error": "prompt_injection_detected",
  "detail": "Message contains patterns associated with prompt injection attacks.",
  "reason": "role override attempt detected: ignore previous instructions"
}
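
One way such a response could be produced is case-insensitive phrase matching on the sanitized message. The sketch below uses only the example phrases and the 4000-character limit quoted in the Security section; the structure, names, and regular expressions are assumptions, and the real detector in `app/core/security.py` may work differently.

```python
import re
from typing import Optional

# Example phrases quoted in the Security section, grouped by category.
PATTERNS = {
    "role override attempt": [r"ignore previous instructions", r"\bact as\b", r"\bjailbreak\b"],
    "system prompt extraction": [r"reveal your prompt", r"show your system prompt"],
    "known jailbreak phrase": [r"\bDAN mode\b", r"developer mode enabled"],
}
MAX_LENGTH = 4000


def sanitize(message: str) -> str:
    """Drop null bytes and other non-printables, collapse whitespace, truncate."""
    kept = "".join(ch for ch in message if ch.isprintable() or ch.isspace())
    return " ".join(kept.split())[:MAX_LENGTH]


def check_message(message: str) -> Optional[dict]:
    """Return an HTTP 400 style error body, or None if nothing matched."""
    text = sanitize(message)
    for category, patterns in PATTERNS.items():
        for pattern in patterns:
            match = re.search(pattern, text, flags=re.IGNORECASE)
            if match:
                return {
                    "error": "prompt_injection_detected",
                    "detail": "Message contains patterns associated with prompt injection attacks.",
                    "reason": f"{category} detected: {match.group(0).lower()}",
                }
    return None
```

Plain phrase matching is easy to audit but prone to false positives: a benign question containing "act as" would be blocked by this sketch, so a real deployment would likely need narrower patterns.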

Confidence Score

The confidence score is calculated from the scores of the top retrieved FAISS matches and mapped to a label:

Label	Score Range	Meaning
HIGH	>= 0.70	Strong match in crawled website content
MEDIUM	0.50-0.69	Partial but useful match
LOW	0.35-0.49	Weak match
FALLBACK	< 0.35	No reliable context found - refuses to answer rather than hallucinate
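
The thresholds in the table translate directly into code. The label mapping below follows the table; the aggregation step (mean of the top three match scores) is purely an assumption for illustration, since the README does not say how the matches are combined.

```python
def confidence_label(score: float) -> str:
    """Map a confidence score to the labels in the table above."""
    if score >= 0.70:
        return "HIGH"
    if score >= 0.50:
        return "MEDIUM"
    if score >= 0.35:
        return "LOW"
    return "FALLBACK"


def aggregate_confidence(scores: list[float], top_k: int = 3) -> float:
    """Assumed aggregation: mean of the top_k highest match scores."""
    top = sorted(scores, reverse=True)[:top_k]
    return sum(top) / len(top) if top else 0.0
```

Using `>=` comparisons in descending order avoids the small gaps the two-decimal ranges would otherwise leave, such as a score of 0.695, which here falls into MEDIUM.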

Project Structure

rag-website-chatbot/
  app/
    main.py
    config.py
    core/
      security.py         <- prompt injection detection + message sanitization
      exceptions.py
      logging.py
    routes/
      ingest.py
      chat.py
    services/
      crawler_service.py
      bedrock_service.py
      vector_store_service.py
      rag_service.py
      session_service.py
    models/
      schemas.py
  frontend/
    src/
      components/
        WebChatAI.tsx
        KnowledgeGraph3D.tsx
        KnowledgeGraph2D.tsx
        Scene3D.tsx
      lib/
        api.ts
  screenshots/
    rag-website-chatbot-ui.png
    architecture-diagram.png
  scripts/
    setup_aws.py
  tests/
    test_main.py          <- 35 tests
  requirements.txt

Run Tests

pytest tests -q

Expected output: 35 passed

Demo Flow

Start the backend: uvicorn app.main:app --reload --port 8080
Start the frontend: cd frontend && npm run dev
Open http://localhost:5173
Paste a public URL and click Ingest - watch the 3D knowledge graph populate with crawled pages as nodes
Ask a factual question - observe the HIGH/MEDIUM confidence score and source citations
Watch the cited nodes pulse/glow on the knowledge graph
Ask a follow-up question - session memory carries context automatically
Ask an off-topic question - observe the FALLBACK response (no hallucination)
Switch to a second ingested site from the sidebar - previous chat history is preserved
Type "ignore previous instructions" - observe the red security alert bubble blocking the prompt injection attempt

Limitations and Future Improvements

FAISS index is local to the machine running the API - a production deployment would use S3 + EFS for shared storage
Current crawler focuses on HTML and plain text - PDF and table extraction can be added
Rate limiting (slowapi) can be added to prevent API abuse
A managed vector database (Pinecone, Weaviate) can replace local FAISS for multi-instance deployments

Author

Labeeb K M
