
Audio Transcription Demo (ClipABit Demo 1)

A simple Streamlit app that demonstrates the audio-transcription-to-search portion of the ClipABit pipeline -- a semantic search engine for video editors.


Audio Transcription Embeddings & Search Algorithms Demo

Project Purpose

This project demonstrates how to generate audio transcription embeddings using OpenAI Whisper, and explores different search algorithms for retrieval tasks. The goal is to experiment with various fusion and selection strategies for searching among multimodal embeddings.

Workflow

  1. Audio Transcription & Embedding

    • Upload audio files via the Streamlit app.
    • Transcribe each audio file using Whisper.
    • Generate embeddings for each transcription (see the sketch after this list).
  2. Search Algorithms Tested

    • Average Fusion: Combine all modality vectors into a single index vector (or average similarity scores) and retrieve the most relevant result.
    • Tiny LLM Select: Use a small language model (or classifier) to select the primary modality based on the prompt, then retrieve from that modality's index only.
    • Tiny LLM Weighted: Use a small language model to assign weights to each modality and fuse retrieval results using those weights.
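
The transcribe-and-embed step (1 above) might look like the sketch below. This is a minimal illustration, assuming the open-source whisper and sentence-transformers packages; the model names are placeholders, not necessarily what this app uses.

import whisper
from sentence_transformers import SentenceTransformer

def transcribe_and_embed(audio_path: str):
    # Transcribe the uploaded clip with Whisper.
    asr_model = whisper.load_model("base")
    transcript = asr_model.transcribe(audio_path)["text"]

    # Embed the transcript text for semantic search.
    embedder = SentenceTransformer("all-MiniLM-L6-v2")
    embedding = embedder.encode(transcript)
    return transcript, embedding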

Search Algorithm Selection

Users can choose any of the three search algorithms above (Average Fusion, Tiny LLM Select, or Tiny LLM Weighted) for semantic retrieval; a rough sketch of the two fusion variants follows.
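
As an illustration only (not the app's exact implementation): Average Fusion can average per-modality cosine similarities, and Tiny LLM Weighted replaces that uniform average with weights produced by a small LLM. All names below are placeholders.

import numpy as np

def cosine_scores(query_vec, index):
    # index: (num_clips, dim) matrix of clip embeddings for one modality.
    q = query_vec / np.linalg.norm(query_vec)
    m = index / np.linalg.norm(index, axis=1, keepdims=True)
    return m @ q

def fused_search(query_vecs, indexes, weights=None, top_k=5):
    # query_vecs / indexes: dicts keyed by modality (e.g. "transcript").
    modalities = list(indexes)
    if weights is None:
        # Average Fusion: uniform weights across modalities.
        weights = {m: 1.0 / len(modalities) for m in modalities}
    # Tiny LLM Weighted would populate `weights` from a small LLM instead.
    scores = sum(
        weights[m] * cosine_scores(query_vecs[m], indexes[m])
        for m in modalities
    )
    return np.argsort(scores)[::-1][:top_k]

This sketch assumes every modality indexes the same set of clips in the same order, so per-modality scores can be summed element-wise.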

Metrics Displayed in the UI Sidebar

The sidebar provides live data on the following metrics for each search and embedding operation:

  • Quality: Subjective measure; users judge the relevance of returned results by eye.
  • Latency: Time to embed a query, search, and merge results (see the sidebar sketch after this list).
  • Embedding Time: Per-clip embedding time (batch and single).
  • Storage: Database size per clip and total.
  • Computational Intensity: Resource usage for each operation.
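
A minimal sketch of how a latency readout could be surfaced with Streamlit's sidebar API; embed_query below is a hypothetical stand-in for the app's real embedding call.

import time
import streamlit as st

def embed_query(text: str):
    # Hypothetical stand-in for the app's real embedding call.
    time.sleep(0.05)
    return [0.0] * 384

query = st.text_input("Search query")
if query:
    start = time.perf_counter()
    _ = embed_query(query)
    latency_ms = (time.perf_counter() - start) * 1000
    st.sidebar.metric("Query embedding latency", f"{latency_ms:.1f} ms")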

Live Demo

🚀 Currently Running

Local Development: http://localhost:8501 (available while the app is running on your machine)

🌐 Cloud Deployment

Open in Streamlit (Previous demo)

Status: ✅ Always Active - Monitored by UptimeRobot to prevent hibernation

New Deployment: Coming soon - deploy this repository to get your own dedicated URL

How to Run

Option 1: Using Management Script (Recommended)

# Start the app in background
./run_streamlit.sh start

# Check status
./run_streamlit.sh status

# View logs
./run_streamlit.sh logs

# Stop the app
./run_streamlit.sh stop

Option 2: Manual Start

  1. Install dependencies:
    pip install -r requirements.txt
  2. Start the app:
    streamlit run streamlit_app.py

Option 3: Deploy to Streamlit Cloud

  1. Push to GitHub:
    git add .
    git commit -m "Deploy to Streamlit Cloud"
    git push origin main
  2. Go to share.streamlit.io and deploy!

🔄 Keeping Apps Active

Prevent Streamlit Cloud Hibernation

Streamlit Cloud hibernates apps after 12 hours of inactivity. To keep apps always running:

Option 1: UptimeRobot (Recommended)

  1. Go to uptimerobot.com
  2. Add monitor for your app URL
  3. Set 5-minute intervals
  4. App stays active forever! ✅

Option 2: GitHub Actions

Use the included workflow: .github/workflows/keep-alive.yml

Option 3: Manual Ping

Run the included script: ./ping_app.sh

See UPTIMEROBOT_SETUP.md for detailed instructions.

Next Steps

  • Implement and compare the search algorithms listed above.
  • Experiment with different fusion strategies and LLMs for selection/weighting.
