
Turn your Telegram bot into a smart, no-code AI assistant that answers questions about any uploaded PDF. Powered by Google Gemini + Supabase embeddings + n8n automation. Includes post-processing for Telegram-safe HTML and message splitting.

🤖 Telegram AI Assistant for Your Documents (n8n + Supabase + Gemini)

This project transforms a standard Telegram bot into your dedicated AI assistant – designed to understand and answer questions based on your own documents. It seamlessly integrates the power of Google Gemini for advanced language capabilities and Supabase's vector database for efficient, intelligent document retrieval. Built entirely within the no-code platform n8n, it allows you to deploy a sophisticated document chatbot without writing a single line of code.

Simply upload any PDF document to the bot, and instantly gain the ability to chat with it, querying its contents as if it were a knowledgeable expert on your uploaded files.


📹 Watch the Bot in Action

Unleashing AI on My Bookshelf: Flow Programming Powers a Next-Level Telegram Bot 🤖

▶️ Watch the live demo on YouTube.

This video provides a live demonstration of the bot's core features and how it interacts. See a quick walkthrough of its capabilities and user flow.


✨ Ignite Your Workflow: Use Cases

This project empowers two core interactions:

1. Conversational AI Interface (User Inquiry → Telegram Bot → Intelligent Answers)

  • Users pose questions directly to the Telegram bot.
  • The bot generates relevant, informative answers using the cutting-edge capabilities of the Google Gemini LLM.
  • Leveraging a powerful vector search mechanism, it can pull specific, contextual information from previously uploaded documents to provide highly relevant and informed responses.
  • (Optional) Augment answers with real-time data, like current weather information.

2. Effortless Document Integration (User Upload PDF → Processing → Searchable Knowledge)

  • Users upload a PDF document directly to the bot.
  • The workflow automatically parses the document's content and converts it into numerical representations (embeddings) using Gemini's embedding models.
  • These embeddings, alongside the document's text content, are then securely stored in a dedicated Supabase vector table, creating a searchable knowledge base.
  • Immediately after successful processing, the document becomes part of the bot's memory, enabling users to ask questions about its contents via the standard chat interface.
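In the workflow itself, parsing and splitting are handled by n8n nodes. As a rough illustration of the splitting step that happens before embedding, here is a minimal standalone Python sketch of a character-based splitter with overlap (the chunk size and overlap values are assumptions for illustration, not the workflow's actual settings):

```python
def split_text(text: str, chunk_size: int = 1000, overlap: int = 100) -> list[str]:
    """Split text into overlapping chunks so each piece fits an embedding call."""
    if chunk_size <= overlap:
        raise ValueError("chunk_size must be larger than overlap")
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += chunk_size - overlap
    return chunks
```

The overlap keeps a sentence that straddles a chunk boundary retrievable from either side.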

🧠 Core Intelligence Features

  • Pure No-Code: Developed and managed entirely within the intuitive n8n automation platform.
  • 📄 Seamless PDF Integration: Easily upload and process PDF documents to expand the bot's knowledge.
  • 🧠 Powered by Google Gemini: Utilizes Gemini for both generating document embeddings and formulating intelligent conversational responses.
  • 🗂 Vector Database Memory (Supabase): Employs Supabase as a robust vector database for storing and efficiently searching document embeddings, providing the bot with long-term memory about your content.
  • ⚡️ Rapid & Private Retrieval: The vector search allows for swift identification and retrieval of the most relevant document snippets based on the user's query. This approach enhances response speed and significantly improves data privacy, as the original document content remains securely stored in your Supabase instance, and only the user's query and the retrieved relevant chunks are sent to the LLM for generating a response.
  • 🧹 Intelligent HTML Post-processing: Cleans the LLM's responses by removing HTML tags not supported by Telegram while preserving essential formatting and correctly escaping special characters in the text content.
  • 📤 Adaptive Message Chunking: Splits lengthy AI-generated answers into multiple messages that adhere to Telegram's 4096-character limit, ensuring the full response is delivered cleanly.
  • 🌦️ Dynamic Weather Data: (Optional) Integrates with OpenWeatherMap to provide current weather information upon request.
  • 📝 Note on Usage: This workflow is designed primarily for personal, single-user scenarios. It processes each message independently and does not include multi-user session management, making it unsuitable for public deployment where different users require separate conversational contexts. For a session-based, multi-model Telegram bot implemented in Python, see: https://github.com/mohamadghaffari/gemini-tel-bot.
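The HTML post-processing and message chunking listed above can be sketched outside n8n as well. The following Python sketch shows both ideas: stripping tags that Telegram's HTML parse mode does not accept while keeping a small allowlist, and splitting the result into messages within the 4096-character limit. The allowlist, regex, and break-at-newline heuristic are simplified assumptions; the actual workflow nodes may differ (and escaping of special characters in text content is omitted here for brevity):

```python
import re

# Simplified allowlist of tags Telegram's HTML parse mode accepts (assumption).
ALLOWED_TAGS = {"b", "i", "u", "s", "code", "pre", "a"}
TELEGRAM_LIMIT = 4096

def clean_html(text: str) -> str:
    """Drop tags outside the allowlist, keeping their inner text."""
    def keep_or_strip(match):
        tag = match.group(1).lower()
        return match.group(0) if tag in ALLOWED_TAGS else ""
    return re.sub(r"</?([a-zA-Z0-9]+)[^>]*>", keep_or_strip, text)

def split_message(text: str, limit: int = TELEGRAM_LIMIT) -> list[str]:
    """Split text into pieces within Telegram's length limit,
    preferring to break at a newline when one is available."""
    parts = []
    while len(text) > limit:
        cut = text.rfind("\n", 0, limit)
        if cut <= 0:
            cut = limit
        parts.append(text[:cut])
        text = text[cut:].lstrip("\n")
    parts.append(text)
    return parts
```

Note that a naive character split can cut an HTML tag pair across two messages; a production version would also close and reopen open tags at each boundary.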

🛠 Getting Started: Setup

You can obtain and deploy this workflow in two ways:

  1. Directly from the n8n Template Library: Get the workflow directly from the official n8n template page:

    https://n8n.io/workflows/3940-document-qanda-chatbot-with-gemini-ai-and-supabase-vector-search-for-telegram/

    Clicking "Use this workflow" on that page will open it directly in your n8n instance, ready for configuration.

  2. Importing from this Repository: Alternatively, clone or download this repository to get the necessary files:

    • telegram-pdf-ai-assistant.json: The complete n8n workflow export.

    • README.md: This guide. Then, access your local or hosted n8n instance and navigate to Workflows → Import from File → select telegram-pdf-ai-assistant.json.

Connect Your Services: Configure Credentials

Create API credentials for the following services within your n8n instance:

Service           Purpose
Telegram API      Receiving user messages & sending replies
Google Gemini     Generating embeddings & LLM responses
Supabase          Storing & searching document vectors
OpenWeatherMap    (Optional) Fetching weather data

Prepare Your Supabase Knowledge Base

Set up a vector-enabled table in your Supabase project to store your document embeddings. Execute the following SQL commands in your Supabase SQL Editor:

-- Enable the pgvector extension to work with embedding vectors
create extension if not exists vector;

-- Create a table to store your documents and their embeddings
create table user_knowledge_base (
  id bigserial primary key,
  content text, -- Stores the text chunk from the document
  metadata jsonb, -- Stores document information (e.g., filename, page number)
  embedding vector(768) -- Stores the vector representation (embedding) generated by Gemini. Adjust dimension if using a different model.
);

-- Create a function to perform vector similarity search against your documents
create function match_documents (
  query_embedding vector(768),
  match_count int default null,
  filter jsonb DEFAULT '{}'
) returns table (
  id bigint,
  content text,
  metadata jsonb,
  similarity float
)
language plpgsql
as $$
#variable_conflict use_column
begin
  return query
  select
    id,
    content,
    metadata,
    -- Calculate cosine similarity: 1 - cosine distance (using the '<=>' operator provided by pgvector)
    1 - (user_knowledge_base.embedding <=> query_embedding) as similarity
  from user_knowledge_base
  where metadata @> filter -- Optional: filter results based on metadata
  order by user_knowledge_base.embedding <=> query_embedding -- Order by similarity (closest first)
  limit match_count; -- Limit the number of results
end;
$$;

This sets up the necessary table and a function to perform vector similarity searches, allowing you to find document chunks most similar to a user's query.
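The expression `1 - (user_knowledge_base.embedding <=> query_embedding)` converts pgvector's cosine-distance operator into a similarity score, where higher means more similar. For intuition, here is the equivalent computation in plain Python (not part of the workflow, just the underlying math):

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """What `1 - (a <=> b)` yields in pgvector: the cosine of the
    angle between the two vectors (1.0 = identical direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)
```

Ordering by the `<=>` distance ascending, as the function does, is equivalent to ordering by this similarity descending.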


💬 How it Works: The Interaction Flow

The process from a user's query to an intelligent answer is a seamless orchestration:

[User Sends Query via Telegram]
           ↓
[n8n Workflow Triggered]
           ↓
[User Query Processed] --> [Gemini Embeddings: Query is converted to a vector]
           ↓
[Supabase Vector Search: Find relevant document chunks based on query embedding]
           ↓
[n8n Combines: Original Query + Retrieved Document Chunks (Context)]
           ↓
[Google Gemini LLM: Generates Answer based on Query and Context]
           ↓
[n8n Post-processing: Cleans HTML formatting & Chunks Message for Telegram]
           ↓
[Telegram Bot Sends Answer Message(s)]

This flow leverages the speed of vector search in Supabase to quickly find pertinent information without needing to read entire documents. By sending only the user's query and the most relevant document chunks to Gemini, it enhances data privacy compared to solutions that might process full document contents with the LLM.
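The "combine" step in the diagram amounts to building a single prompt from the user's query and the retrieved chunks. A minimal sketch of that assembly (the template wording is an assumption; n8n's AI Agent nodes handle this internally):

```python
def build_prompt(query: str, chunks: list[str]) -> str:
    """Assemble retrieved document chunks and the user's question
    into one prompt for the LLM."""
    context = "\n\n".join(f"[{i + 1}] {chunk}" for i, chunk in enumerate(chunks))
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}"
    )
```

Because only the query and these few chunks reach Gemini, the full document never leaves your Supabase instance.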


🖼 Workflow Visual

A glimpse into the n8n workflow automation: see the workflow screenshot included in the repository.


📚 Integrated Technologies

This project brings together powerful tools:

  • n8n: no-code workflow automation platform
  • Telegram Bot API: chat interface for queries and PDF uploads
  • Google Gemini: document embeddings and LLM responses
  • Supabase (pgvector): vector storage and similarity search
  • OpenWeatherMap: optional real-time weather data


📄 License

This project is released under the MIT License – feel free to use, modify, and distribute.


🙌 Inspiration

This project was developed as part of an ongoing exploration into creating practical, intelligent AI agents using accessible no-code platforms and cutting-edge AI technologies.
