
RAG over audio files using AssemblyAI

This project builds a RAG app over audio files. We use:

  • AssemblyAI to generate transcripts from audio files.
  • LlamaIndex for orchestrating the RAG app.
  • Qdrant VectorDB for storing the embeddings.
  • Streamlit to build the UI.

A demo is shown below:

Video demo

Installation and setup

Set up AssemblyAI:

Get an API key from AssemblyAI and set it in the .env file as follows:

ASSEMBLYAI_API_KEY=<YOUR_API_KEY> 

Set up SambaNova:

Get an API key from SambaNova and set it in the .env file as follows:

SAMBANOVA_API_KEY=<YOUR_SAMBANOVA_API_KEY> 

Note: Instead of SambaNova, you can also use Ollama.
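Before starting the app, it can help to confirm that the .env file actually contains real keys rather than the `<YOUR_API_KEY>` placeholders. The following is a minimal standard-library sketch, not part of the project; the helper names `parse_env_text`, `read_env_file`, and `missing_keys` are hypothetical, and it assumes a simple `KEY=VALUE` file in the current directory.

```python
from pathlib import Path


def parse_env_text(text):
    """Parse simple KEY=VALUE lines, skipping blanks and '#' comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def read_env_file(path=".env"):
    """Return the parsed contents of a .env file, or {} if it does not exist."""
    env_path = Path(path)
    if not env_path.exists():
        return {}
    return parse_env_text(env_path.read_text(encoding="utf-8"))


def missing_keys(env, required=("ASSEMBLYAI_API_KEY", "SAMBANOVA_API_KEY")):
    """List required keys that are absent, empty, or still a <PLACEHOLDER>."""
    return [k for k in required if not env.get(k) or env[k].startswith("<")]


if __name__ == "__main__":
    problems = missing_keys(read_env_file())
    print("Missing or placeholder keys:", problems or "none")
```

If you use Ollama instead of SambaNova, pass `required=("ASSEMBLYAI_API_KEY",)` so that only the AssemblyAI key is checked.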

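Set up Qdrant VectorDB:

Start a local Qdrant instance with Docker (port 6333 serves the REST API, port 6334 serves gRPC):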

docker run -p 6333:6333 -p 6334:6334 \
-v $(pwd)/qdrant_storage:/qdrant/storage:z \
qdrant/qdrant
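Once the container is running, you may want to verify that Qdrant is reachable before launching the app. The sketch below uses only the Python standard library and queries Qdrant's REST endpoint `GET /collections`. The function name `list_qdrant_collections` is hypothetical, and the URL assumes the default port mapping from the command above.

```python
import json
import urllib.error
import urllib.request


def list_qdrant_collections(base_url="http://localhost:6333", timeout=3.0):
    """Return the collection names, or None if Qdrant cannot be reached."""
    try:
        with urllib.request.urlopen(f"{base_url}/collections", timeout=timeout) as resp:
            payload = json.load(resp)
    except (urllib.error.URLError, OSError, ValueError):
        # Connection refused, timeout, or a non-JSON response.
        return None
    collections = payload.get("result", {}).get("collections", [])
    return [c["name"] for c in collections]


if __name__ == "__main__":
    names = list_qdrant_collections()
    if names is None:
        print("Qdrant is not reachable on http://localhost:6333")
    else:
        print("Qdrant is up; collections:", names)
```

A freshly started instance should report an empty list; collections appear only after the app has indexed a transcript.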

Install Dependencies: Ensure you have Python 3.11 or later installed.

pip install streamlit assemblyai llama-index-vector-stores-qdrant llama-index-llms-sambanovasystems sseclient-py
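To check the environment after installation, a short script can confirm the Python version and that the packages can be found. This is a sketch under assumptions: the import names in `REQUIRED_MODULES` are my best mapping from the pip package names above (for example, `sseclient-py` is assumed to install the module `sseclient`), and `python_ok` and `missing_modules` are hypothetical helper names.

```python
import importlib.util
import sys

# Assumed import names for the pip packages listed above.
REQUIRED_MODULES = (
    "streamlit",
    "assemblyai",
    "llama_index.vector_stores.qdrant",
    "llama_index.llms.sambanovasystems",
    "sseclient",
)


def python_ok(version_info=sys.version_info, minimum=(3, 11)):
    """True if the interpreter is at least the minimum (major, minor) version."""
    return tuple(version_info[:2]) >= minimum


def missing_modules(names):
    """Return the module names that cannot be located in this environment."""
    missing = []
    for name in names:
        try:
            found = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            # Raised when a parent package of a dotted name is absent.
            found = False
        if not found:
            missing.append(name)
    return missing


if __name__ == "__main__":
    print("Python 3.11+:", python_ok())
    print("Missing modules:", missing_modules(REQUIRED_MODULES) or "none")
```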

Run the app:

Start the Streamlit UI with the following command:

streamlit run app.py

📬 Stay Updated with Our Newsletter!

Get a FREE Data Science eBook 📖 with 150+ essential lessons in Data Science when you subscribe to our newsletter! Stay in the loop with the latest tutorials, insights, and exclusive resources. Subscribe now!

Daily Dose of Data Science Newsletter


Contribution

Contributions are welcome! Please fork the repository and submit a pull request with your improvements.