https://gemini-transcribe.fly.dev/
A web application for transcribing audio and video files using Google's Gemini Flash model.
Gemini Flash is an interesting model to explore for audio transcription because:
- It accepts both audio and text inputs, so we can use a prompt to request a specific transcription output
- It has built-in speaker diarization
- It can attempt to detect not only words but also silence, sentiment, and sounds beyond human voices
- It can translate the transcription
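
As a rough illustration of the prompting idea in the list above, the sketch below assembles a text instruction that could be sent to the model alongside an audio file. The function name `build_transcription_prompt`, its parameters, and the prompt wording are all hypothetical and are not taken from this application; the call that actually uploads the audio and queries Gemini is deliberately left out.

```python
from typing import Optional


def build_transcription_prompt(
    diarize: bool = True,
    note_non_speech: bool = True,
    translate_to: Optional[str] = None,
) -> str:
    """Assemble a plain-text instruction to accompany an audio file.

    Hypothetical helper: the wording is an assumption, not the prompt
    this application actually uses.
    """
    parts = ["Transcribe the attached audio."]
    if diarize:
        # Ask the model to separate speakers (speaker diarization).
        parts.append("Label each speaker turn (Speaker 1, Speaker 2, ...).")
    if note_non_speech:
        # Ask for silence and non-voice sounds to be marked inline.
        parts.append(
            "Mark silences and non-speech sounds in square brackets, e.g. [laughter]."
        )
    if translate_to:
        # Optionally request a translation of the finished transcript.
        parts.append(f"After the transcript, add a translation into {translate_to}.")
    return " ".join(parts)


if __name__ == "__main__":
    print(build_transcription_prompt(translate_to="French"))
```

Keeping each capability behind its own flag makes it easy to compare how the model responds when diarization, sound annotation, or translation is requested separately.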