Audio Transcription & Summarization App

A full-stack application that lets you record or upload audio, transcribe it via OpenAI’s Whisper model, and generate structured summaries using Google’s Gemini API. Transcriptions and summaries are managed by a FastAPI backend and presented through a modern React frontend.

🚀 Features

Record Audio: Capture system audio via a loopback device (e.g., “Stereo Mix” on Windows).
Upload Audio: Upload local audio files (.wav, .mp3, etc.) for transcription.
Automatic Transcription: Speech-to-text powered by OpenAI Whisper.
Translation: Non-English speech is automatically translated, so the transcript is always in English.
Session-Scoped History: Shows and summarizes only the transcriptions from your current session.
Summarization: Generate a concise, structured summary of the transcription using Google’s Gemini API.
Download Summary: Export your summary as a PDF.
Search: Full-text search within your current session’s transcripts.
Modern Frontend: Built with React for a smooth, responsive UI.
Robust Backend: Powered by FastAPI and SQLite for quick development and easy deployment.
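
The session-scoped history and search features could be modeled roughly as follows. This is a minimal sketch, not the project's actual code: the table name, column names, and helper functions are all hypothetical, and it uses a plain SQL LIKE substring match rather than a true full-text index.

```python
import sqlite3
import uuid

# Hypothetical schema: table and column names are assumptions, not taken from the project.
SCHEMA = """
CREATE TABLE IF NOT EXISTS transcriptions (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    session_id TEXT NOT NULL,
    text TEXT NOT NULL
)
"""


def new_session_id():
    """Generate an ID once at server start; every row saved in this run carries it."""
    return uuid.uuid4().hex


def save_transcription(conn, session_id, text):
    """Store one transcript under the given session and return its row ID."""
    cur = conn.execute(
        "INSERT INTO transcriptions (session_id, text) VALUES (?, ?)",
        (session_id, text),
    )
    conn.commit()
    return cur.lastrowid


def session_history(conn, session_id):
    """Return only the transcripts saved under this session, oldest first."""
    rows = conn.execute(
        "SELECT text FROM transcriptions WHERE session_id = ? ORDER BY id",
        (session_id,),
    )
    return [row[0] for row in rows]


def search_session(conn, session_id, query):
    """Substring search (case-insensitive for ASCII) limited to one session."""
    # Escape LIKE wildcards so the query is matched literally.
    escaped = query.replace("\\", "\\\\").replace("%", "\\%").replace("_", "\\_")
    rows = conn.execute(
        "SELECT text FROM transcriptions "
        "WHERE session_id = ? AND text LIKE ? ESCAPE '\\' ORDER BY id",
        (session_id, f"%{escaped}%"),
    )
    return [row[0] for row in rows]
```

Because every query filters on session_id, rows from earlier runs can stay in the database file without appearing in the current session's history or search results.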

📦 Tech Stack

Backend: FastAPI, SQLite, Uvicorn
Frontend: React (Create React App)
Transcription: OpenAI Whisper
Summarization: Google Gemini API
Runtimes: Python 3.8+, Node.js 16+

🔧 Installation

1. Clone the Repository

git clone https://github.com/yourusername/soundcard_testing.git
cd soundcard_testing

2. Backend Setup

cd backend
python -m venv venv
# Activate the venv
# Windows
venv\Scripts\activate
# macOS/Linux
source venv/bin/activate

pip install -r requirements.txt

Create a .env file in backend/ with your Gemini API key:

GEMINI_API_KEY=your_gemini_api_key_here
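
For illustration, here is one way a backend could read that key at startup. It is a standard-library sketch under assumptions: the function names are hypothetical, and the real backend may well use a package such as python-dotenv instead.

```python
import os
from pathlib import Path


def load_env_file(path=".env"):
    """Parse KEY=VALUE lines into a dict, skipping blank lines and # comments."""
    values = {}
    env_path = Path(path)
    if not env_path.is_file():
        return values
    for raw in env_path.read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def get_gemini_api_key(env_path=".env"):
    """Prefer a real environment variable, then fall back to the .env file."""
    key = os.environ.get("GEMINI_API_KEY") or load_env_file(env_path).get("GEMINI_API_KEY")
    if not key:
        raise RuntimeError("GEMINI_API_KEY is not set; add it to backend/.env")
    return key
```

Failing fast with a clear error when the key is missing is usually easier to debug than a rejected API call later on.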

Start the backend server:

uvicorn main:app --reload

The backend will be available at:
http://127.0.0.1:8000

3. Frontend Setup

Open a new terminal and run:

cd frontend
npm install
npm start

The frontend will be available at:
http://localhost:3000

⚙️ Usage

Start the Backend (from backend/, with the virtual environment activated):
```
uvicorn main:app --reload
```
Start the Frontend (from frontend/, in a second terminal):
```
npm start
```
Open the App:
Navigate to http://localhost:3000 in your browser.
Interact:
- Click Record to capture system audio.
- Or Upload any audio file.
- View live transcription (auto-translated if non-English), then generate a summary.
- Download the summary PDF or search within your session’s transcripts.
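
Behind these steps, the transcribe-then-summarize flow might look something like the sketch below. It assumes the openai-whisper and google-generativeai packages; the function names, the prompt wording, the "base" Whisper model, and the "gemini-1.5-flash" model name are illustrative assumptions rather than the project's confirmed choices.

```python
def transcribe_to_english(audio_path, model_name="base"):
    """Transcribe with Whisper; task="translate" returns English text for any source language."""
    import whisper  # from the openai-whisper package; imported lazily because it is heavy

    model = whisper.load_model(model_name)  # downloads the weights on first use
    result = model.transcribe(audio_path, task="translate")
    return result["text"].strip()


def build_summary_prompt(transcript, max_chars=12000):
    """Hypothetical prompt; the project's real prompt is not shown in this README."""
    clipped = transcript.strip()[:max_chars]  # crude length cap, an assumption
    return (
        "Summarize the following transcript as structured notes with the "
        "sections 'Overview', 'Key Points', and 'Action Items'.\n\n"
        f"Transcript:\n{clipped}"
    )


def summarize(transcript, api_key, model_name="gemini-1.5-flash"):
    """Send the prompt to Gemini and return the generated summary text."""
    import google.generativeai as genai  # from the google-generativeai package

    genai.configure(api_key=api_key)
    model = genai.GenerativeModel(model_name)
    return model.generate_content(build_summary_prompt(transcript)).text
```

A call such as summarize(transcribe_to_english("meeting.wav"), api_key) would chain the two steps; "meeting.wav" is a placeholder file name. Whisper also needs ffmpeg installed on the system to decode audio files.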

📄 Requirements

Python 3.8 or higher
Node.js 16 or higher
A system audio loopback device (e.g., “Stereo Mix” on Windows)
A valid Google Gemini API key
Internet access (to download the Whisper model on first run and to call the Gemini API)

📝 Notes

Session-only Data: Only transcriptions made during the current server run are shown. Restarting the server starts a new session, so earlier transcriptions no longer appear in the history.
Persistence: The SQLite database file (e.g., db.sqlite3) lives in backend/ and persists between runs unless manually deleted. Deleting it permanently removes all stored transcriptions.
Loopback Audio: Ensure your OS has a loopback/mix device enabled if you want to record system audio.
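
To check whether a loopback device is visible, something like the sketch below may help. It assumes the soundcard Python package, which this README does not confirm the project uses; the helper names and the name hints are guesses at common device names, not an exhaustive list.

```python
def list_input_device_names():
    """Names of capture devices, including loopback ones, via the soundcard package."""
    import soundcard as sc  # assumption: the project may use a different audio library

    return [mic.name for mic in sc.all_microphones(include_loopback=True)]


def find_loopback_device(names, hints=("stereo mix", "loopback", "monitor")):
    """Return the first device name containing one of the hints (case-insensitive), else None."""
    for name in names:
        lowered = name.lower()
        if any(hint in lowered for hint in hints):
            return name
    return None
```

If find_loopback_device(list_input_device_names()) returns None, the loopback device is probably disabled in the operating system's sound settings, or it uses a name the hints do not cover.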

🙏 Credits

📜 License

This project is licensed under the MIT License. Feel free to use, modify, and distribute!
