This project provides a tool that converts any PDF document into an engaging podcast, using Google's Gemini for dialogue generation and Elevenlabs for text-to-speech.
- Understand PDFs: Interprets the PDF itself rather than only extracting its text, so it can understand figures, tables, and images.
- Dialogue Generation: Uses Gemini to generate conversational dialogues based on the input PDF content.
- Text-to-Speech: Converts the generated dialogues into audio using Elevenlabs' text-to-speech service.
- Streamlit UI: Provides an easy-to-use interface to upload PDFs and generate podcasts.
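As a rough illustration of how the dialogue-generation and text-to-speech steps could connect, the sketch below splits a generated script into speaker turns so that each turn can be voiced separately. The `Speaker: text` line format, the `Turn` type, and `parse_dialogue` are assumptions made for this example, not the project's actual output format or code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Turn:
    """One spoken line of the podcast (hypothetical structure)."""
    speaker: str
    text: str


def parse_dialogue(script: str) -> list[Turn]:
    """Split a script into turns, assuming one 'Speaker: text' line per turn."""
    turns: list[Turn] = []
    for raw in script.splitlines():
        line = raw.strip()
        if not line:
            continue  # skip blank lines between turns
        speaker, sep, text = line.partition(":")
        if sep and speaker.strip() and text.strip():
            turns.append(Turn(speaker.strip(), text.strip()))
        elif turns:
            # No usable 'Speaker:' prefix: treat the line as a
            # continuation of the previous turn.
            prev = turns[-1]
            turns[-1] = Turn(prev.speaker, f"{prev.text} {line}")
    return turns
```

Each `Turn` could then be sent to the text-to-speech service with a voice chosen per speaker. A continuation line that itself contains a colon would be misread as a new speaker, so a real implementation would likely need a stricter format.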
- Clone the Repository
  Clone the project to your local machine:

      git clone https://github.com/chiragjoshi12/pdf-to-podcast.git
      cd pdf-to-podcast
- Install Dependencies
  Install the required Python dependencies:

      pip install -r requirements.txt
- Set Up API Keys
  Create a .env file and add your Gemini and Elevenlabs API keys:

      GEMINI_API_KEY="YOUR_GEMINI_API_KEY"
      ELEVENLABS_API_KEY="YOUR_ELEVENLABS_API_KEY"
- Run the Application
  Start the application with Streamlit:

      streamlit run main.py
Streamlit then serves the app locally and opens it in your browser (by default at http://localhost:8501).
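To illustrate the API-key step above, here is a minimal, standard-library-only sketch that parses `KEY="VALUE"` lines like those in the .env file and reports which required keys are missing. The helper names `parse_env_text` and `missing_keys` are hypothetical; the project itself may load the file differently (for example with a library such as python-dotenv).

```python
import os
from collections.abc import Mapping

# Key names taken from the .env example in the setup steps.
REQUIRED_KEYS = ("GEMINI_API_KEY", "ELEVENLABS_API_KEY")


def parse_env_text(text: str) -> dict[str, str]:
    """Parse simple KEY="VALUE" lines; blank lines and # comments are skipped."""
    values: dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional single or double quotes.
        values[key.strip()] = value.strip().strip("\"'")
    return values


def missing_keys(env: Mapping[str, str] = os.environ) -> list[str]:
    """Return the required key names that are unset or empty in env."""
    return [name for name in REQUIRED_KEYS if not env.get(name)]
```

Calling `missing_keys()` with no argument checks the current process environment, which can help explain an authentication failure before the app is started.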
- Upload a PDF file to the app.
- Set your podcast prompt and click "Generate Podcast."
- The app will generate the dialogue and convert it into an audio file, which you can listen to.
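The usage steps above amount to a two-stage pipeline: PDF and prompt in, dialogue out, then dialogue in, audio out. The sketch below shows that flow with the two services passed in as plain callables. `generate_podcast`, `fake_dialogue`, and `fake_tts` are hypothetical names, and the stand-ins do not call Gemini or Elevenlabs; this is not the project's actual code.

```python
from typing import Callable


def generate_podcast(
    pdf_bytes: bytes,
    prompt: str,
    generate_dialogue: Callable[[bytes, str], str],
    synthesize: Callable[[str], bytes],
) -> tuple[str, bytes]:
    """Run both stages in order and return (dialogue, audio)."""
    if not pdf_bytes:
        raise ValueError("no PDF uploaded")
    dialogue = generate_dialogue(pdf_bytes, prompt)
    if not dialogue.strip():
        raise ValueError("dialogue generation returned no text")
    audio = synthesize(dialogue)
    return dialogue, audio


# Stand-ins so the flow can be exercised without network access.
def fake_dialogue(pdf_bytes: bytes, prompt: str) -> str:
    return f"Host: Today we cover a {len(pdf_bytes)}-byte PDF. {prompt}"


def fake_tts(text: str) -> bytes:
    return text.encode("utf-8")


dialogue, audio = generate_podcast(
    b"%PDF-1.7", "Keep it short.", fake_dialogue, fake_tts
)
```

Passing the services in as arguments keeps the orchestration testable offline; in the real app the callables would wrap the Gemini and Elevenlabs clients.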
For detailed instructions and insights, check out the blog post:
NotebookLM with Gemini and Elevenlabs (Detailed Documentation)
Readme made with 💖 using README Generator by Chirag Joshi