This project is a multimodal AI chatbot built with Streamlit, LangChain/LangGraph-style agents, Pinecone RAG, Whisper transcription, SQLite document registry, and LangSmith tracing. It supports:
- Audio, text, and PDF multi-file ingestion
- Automatic transcription (Whisper)
- Chunking & summarization for RAG
- Pinecone vector storage & retrieval
- Time-based queries on audio
- Page-based queries for text
- Multi-turn memory via thread IDs
- Per-user document registry using SQLite
- A clean Streamlit UX
When a user uploads an audio file (mp3, wav, m4a):
- Transcription using Whisper
- Chunking of transcript by timestamps
- Summarization of entire transcript
- Chunks + summary stored in Pinecone
- Document metadata stored in SQLite (user_id, filename, summary, type)
- Context returned to the chat agent
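Whisper's verbose transcription output includes per-segment `start`/`end` timestamps, which make time-windowed chunking straightforward. A minimal sketch (the segment dict shape and the ~60-second window are assumptions, not taken from this repo's code):

```python
def chunk_by_time(segments, window=60.0):
    """Group Whisper segments into chunks spanning roughly `window` seconds.

    Each segment is assumed to be a dict with "start", "end", and "text"
    keys, as in Whisper's verbose transcription output. Each chunk keeps
    its time range so it can later serve time-based queries.
    """
    chunks, current, chunk_start = [], [], 0.0
    for seg in segments:
        # Flush the current chunk once it would exceed the time window
        if current and seg["end"] - chunk_start > window:
            chunks.append({
                "text": " ".join(s["text"].strip() for s in current),
                "start_time": chunk_start,
                "end_time": current[-1]["end"],
            })
            current, chunk_start = [], seg["start"]
        current.append(seg)
    if current:
        chunks.append({
            "text": " ".join(s["text"].strip() for s in current),
            "start_time": chunk_start,
            "end_time": current[-1]["end"],
        })
    return chunks
```

The stored `start_time`/`end_time` metadata is what enables the time-based audio queries described below.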
- Vector search using Pinecone
- Supports metadata filters (start time, end time, page, source)
- Metadata-only queries use a neutral placeholder vector
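Pinecone's `Index.query` always requires a vector, so metadata-only lookups can pass a small constant placeholder and let the metadata filter do the work. A sketch of building the query kwargs (`$gte`/`$lte`/`$eq` are Pinecone's filter operators; the metadata key names here are assumptions):

```python
def build_query(dim, start=None, end=None, page=None, source=None, top_k=5):
    """Build kwargs for Pinecone's Index.query with optional metadata filters.

    When no semantic embedding is needed, a small non-zero constant vector
    acts as a neutral placeholder (Pinecone still requires a query vector).
    """
    flt = {}
    if start is not None:
        flt["start_time"] = {"$gte": start}   # time-based queries on audio
    if end is not None:
        flt["end_time"] = {"$lte": end}
    if page is not None:
        flt["page"] = {"$eq": page}           # page-based queries on text/PDF
    if source is not None:
        flt["source"] = {"$eq": source}       # restrict to one uploaded file
    return {
        "vector": [1e-4] * dim,               # neutral placeholder vector
        "filter": flt,
        "top_k": top_k,
        "include_metadata": True,
    }
```

For semantic queries, the placeholder vector would simply be replaced by the embedding of the user's question.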
- Implemented via `config={"configurable": {"thread_id": "..."}}`
- File uploads insert contextual system messages into the conversation
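LangGraph checkpointers key conversation state on `configurable.thread_id`, so reusing the same ID continues the same multi-turn conversation. A small sketch (the agent/invocation names in the comment are illustrative):

```python
def thread_config(thread_id: str) -> dict:
    """Build the LangGraph config that scopes memory to one conversation.

    Passing the same thread_id on every call makes the checkpointer load
    and extend that thread's message history.
    """
    return {"configurable": {"thread_id": thread_id}}

# Typical use (names are illustrative, not this repo's API):
#   agent.invoke({"messages": [("user", question)]},
#                config=thread_config(session_id))
```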
Each uploaded file is recorded in a local SQLite database, which stores:
- User ID
- File source
- File type
- Summary
- Timestamp
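The registry boils down to one table with those five columns. A minimal, self-contained sketch of the schema and the add/list operations (column names and the `documents` table name are assumptions based on the fields above):

```python
import sqlite3

SCHEMA = """
CREATE TABLE IF NOT EXISTS documents (
    user_id   TEXT NOT NULL,
    source    TEXT NOT NULL,
    type      TEXT NOT NULL,
    summary   TEXT,
    timestamp TEXT DEFAULT CURRENT_TIMESTAMP
)
"""

def add_document(conn, user_id, source, doc_type, summary):
    """Record one uploaded file in the per-user registry."""
    conn.execute(
        "INSERT INTO documents (user_id, source, type, summary) VALUES (?, ?, ?, ?)",
        (user_id, source, doc_type, summary),
    )
    conn.commit()

def list_documents(conn, user_id):
    """Return (source, type, summary) rows for one user's uploads."""
    cur = conn.execute(
        "SELECT source, type, summary FROM documents WHERE user_id = ?",
        (user_id,),
    )
    return cur.fetchall()
```

Parameterized queries (`?` placeholders) keep user-supplied filenames and IDs from being interpreted as SQL.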
Custom tools allow the agent to:
- List a user's uploaded documents
- Retrieve relevant context from Pinecone (RAG)
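A tool is ultimately just a function the agent can call; in the app these would query SQLite/Pinecone and be exposed via LangChain's `@tool` decorator. A stubbed sketch of the document-listing tool (the registry is faked as a dict here purely for illustration):

```python
def list_user_documents(registry: dict, user_id: str) -> str:
    """Return a human-readable listing of a user's uploaded documents.

    `registry` stands in for the SQLite lookup: a dict mapping user_id
    to a list of (filename, summary) pairs. The string return value is
    deliberate -- tool outputs are fed back to the LLM as text.
    """
    docs = registry.get(user_id, [])
    if not docs:
        return "No documents uploaded yet."
    return "\n".join(f"- {name}: {summary}" for name, summary in docs)
```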
- Full tracing enabled
- Automatic logging of pipeline execution, chain calls, and tool invocations
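LangSmith tracing is typically switched on through environment variables that LangChain reads at runtime; a shell sketch of the equivalent setup (key and project name are placeholders):

```shell
# Enable LangSmith tracing of chain calls and tool invocations
export LANGCHAIN_TRACING_V2="true"
export LANGCHAIN_API_KEY="YOUR_LANGSMITH_KEY"
export LANGCHAIN_PROJECT="your-project-name"
export LANGCHAIN_ENDPOINT="https://api.smith.langchain.com"
```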
```text
├── app/
│   ├── .streamlit/
│   │   ├── config.toml
│   │   └── secrets.toml
│   └── chatbot.py            # Main Streamlit app
├── data/                     # SQLite database for document registry
│   └── user_data/
│       ├── documents.db
│       └── setup.ipynb
└── src/
    ├── __init__.py
    ├── router.py
    ├── agent/                # Agent setup: tools, queries, and prompt templates
    │   ├── __init__.py
    │   ├── create.py
    │   ├── prompt_templates.py
    │   ├── queries.py
    │   └── tools/
    │       ├── pinecone_retrival.py
    │       └── sql_retrival.py
    ├── data_storage/         # Interaction with the SQLite database
    │   ├── add_document.py
    │   ├── delete_documents.py
    │   ├── delete_recods.py
    │   └── list_documents.py
    ├── pipelines/            # Orchestrator pipelines for uploaded input
    │   ├── __init__.py
    │   ├── audio_pipeline.py
    │   ├── pdf_pipeline.py
    │   └── text_pipeline.py
    ├── processing/           # Processing functions for uploaded input
    │   ├── __init__.py
    │   ├── audio.py
    │   ├── chunking.py
    │   └── summarize.py
    └── rag/                  # Pinecone interaction for storing and retrieval
        ├── __init__.py
        ├── base.py
        ├── build_records.py
        ├── delete.py
        └── retrieval.py
```
```bash
git clone <repo-url>
cd multimodal-chatbot
pip install -r requirements.txt
```

Create `.streamlit/secrets.toml`:
```toml
[openai]
api_key = "YOUR_OPENAI_KEY"

[langsmith]
api_key = "YOUR_LANGSMITH_KEY"
project = "your-project-name"
endpoint = "https://api.smith.langchain.com"
tracing = "true"

[pinecone]
api_key = "YOUR_PINECONE_KEY"
```
```bash
streamlit run app/chatbot.py
```

Planned improvements:
- Add login system for per-user persistent documents
- Add chat history
PRs are welcome! If you want help restructuring the code, adding tests, or extending the pipeline, feel free to open an issue.
